perl script prints parts of strings in the wrong order

M. Fioretti-2
hello,

I am trying to reuse an old Perl script I wrote years ago, on an Ubuntu
16.04 LTS x86_64 box.
It behaves in a very odd way now, and I cannot figure out whether the
code is incompatible with current versions of Perl, or whether something
VERY strange is happening between the script and the terminal it runs in.

The part of the script that works badly is this:

while (<INPUTFILE>) {
     chomp;
     my $LINE = $_;
     # many lines that "clean up" $LINE, removing certain substrings etc...

     @F = split /:/, $LINE;
     print "CURL:==> $F[0] ++ $CURLRES{$F[0]} ;;\n";
}

when I run it, SOME of the printed lines (say 1 in every 30) have the
expected format:

CURL:==> string_a ++ string_b ;;

but all the others are like this:


  ;;L:==> string_c ++ string_d ;;

that is, the three initial "CUR" characters are replaced by " ;;".

It's as if something had pasted the last three characters over the first
three.

I have no idea what is going wrong, and why. Any help is appreciated,

TIA,

Marco

--
http://mfioretti.com

--
ubuntu-users mailing list
[hidden email]
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-users
Re: perl script prints parts of strings in the wrong order

Olivier Nicole-2
"M. Fioretti" <[hidden email]> writes:

> [...]
> It's as if something had pasted the last three characters over the first
> three ones.
>
> I have no idea what is going wrong, and why. Any help is appreciated,

A wild guess, but there may be something strange in the input file,
for instance some lines may contain a \r.

I would manually check the input file and also try to print the
resulting $F[0] alone on a line:

print "--$LINE--\n";
print "++$F[0]++\n";
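
One way to follow this suggestion is to make invisible characters explicit
before printing them. The sketch below is only an illustration: show_hidden
and the sample value are made-up names and data, not part of the original
script.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): replace every
# character outside printable ASCII with a \xNN escape so it becomes
# visible on screen.
sub show_hidden {
    my ($text) = @_;
    $text =~ s/([^\x20-\x7e])/sprintf('\\x%02x', ord $1)/ge;
    return $text;
}

my $field = "string_d\r";    # made-up value with a trailing carriage return
print "++", show_hidden($field), "++\n";
```

With a helper like this, a stray carriage return would show up as \x0d
between the markers instead of silently moving the cursor.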

Olivier

SOLVED (sort of): perl script prints parts of strings in the wrong order

M. Fioretti-2
In reply to this post by M. Fioretti-2
as usual, I stumbled on the solution one second after posting to the
list, but I can't say I completely understand it, so any comment is
welcome.

I piped the output of the script through "od -c", and saw lots of \r
characters right in the places where pieces of strings were "swapped".
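
For readers wondering why a \r makes the output look "swapped": a terminal
treats it as "move the cursor back to column 0", so whatever is printed next
overwrites the start of the line. The following is a rough sketch of that
behaviour, assuming the \r sits at the end of the interpolated value;
render_on_terminal and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: approximate how a terminal renders one line of output.
# A carriage return moves the cursor back to column 0 without erasing
# anything, so later characters overwrite earlier ones.
sub render_on_terminal {
    my ($output) = @_;
    my @cells;
    my $col = 0;
    for my $ch (split //, $output) {
        if ($ch eq "\r") {
            $col = 0;
            next;
        }
        $cells[$col++] = $ch;
    }
    return join '', map { defined $_ ? $_ : ' ' } @cells;
}

# Made-up data: the interpolated value ends in \r.
my $line = "CURL:==> string_c ++ string_d\r ;;";
print render_on_terminal($line), "\n";
```

This simplified model reproduces the overwritten " ;;L:==>" prefix; it does
not try to model anything else a real terminal does.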

So I changed this:

chomp;

to this:

chomp;
s/\r//g;

and now everything works as intended. Problem is, *why* did I have to do
this? I thought the "chomp" function in Perl also strips those \r
characters out, and I am pretty sure it did, earlier.
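
The s/\r//g fix above removes every carriage return in the line. If only the
line terminator should go, a narrower variant is possible. This is a minimal
sketch, assuming the stray \r sits directly before the newline of each input
line; strip_eol is a hypothetical helper, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: remove one trailing line terminator, whether it is
# "\n" (Unix) or "\r\n" (DOS/Windows), and leave any other \r untouched.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

for my $raw ("unix line\n", "dos line\r\n") {
    printf "[%s]\n", strip_eol($raw);
}
```

Inside the read loop this would be used in place of the chomp plus
substitution pair, applied to a copy of $_.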

Thanks,
Marco


On 2018-08-07 07:28, M. Fioretti wrote:

> I am trying to reuse an old perl script I wrote years ago, on an
> Ubuntu 16.04 LTS x86_64 box.
> [...]

--
http://mfioretti.com

Re: perl script prints parts of strings in the wrong order

M. Fioretti-2
In reply to this post by Olivier Nicole-2
On 2018-08-07 07:37, Olivier wrote:

> A wild guess, but there may be something strange in the input file,
> like some lines contain a \r.

indeed, that was the case, thanks! See my other "SOLVED" reply.

What I don't understand is why the perl "chomp" function did not remove
those characters too, together with the \n ones.

I am pretty sure it worked that way, in the past.

Marco

--
http://mfioretti.com

Re: SOLVED (sort of): perl script prints parts of strings in the wrong order

Olivier Nicole-2
In reply to this post by M. Fioretti-2
"M. Fioretti" <[hidden email]> writes:

> [...]
> and now everything works as intended. Problem is, *why* did I have to do
> this? I thought the "chomp" function in Perl also strips those \r
> characters out, and I am pretty sure it did, earlier.

According to the Perl documentation, chomp removes the ending of a line that
corresponds to the current value of $/, so it really depends on what you set
$/ to.
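
To illustrate the point, here is a small sketch that chomps the same
DOS-style string under two values of $/. The helper chomp_with and the
sample string are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: chomp a copy of $text with a chosen value of $/.
sub chomp_with {
    my ($separator, $text) = @_;
    local $/ = $separator;    # restored automatically when the sub returns
    chomp $text;
    return $text;
}

my $dos_line = "foo\r\n";
printf "with \$/ = LF:   %d chars left\n", length(chomp_with("\n",   $dos_line));
printf "with \$/ = CRLF: %d chars left\n", length(chomp_with("\r\n", $dos_line));
```

With $/ set to "\n" the \r survives the chomp; with $/ set to "\r\n" both
characters are removed in one go.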

Best,

Olivier

Re: SOLVED (sort of): perl script prints parts of strings in the wrong order

M. Fioretti-2
On 2018-08-07 07:56, Olivier wrote:

> According to the Perl documentation, chomp removes the ending of a line that
> corresponds to the current value of $/, so it really depends on what you set
> $/ to.

I know that. That is the reason why I was caught off guard. I have become
so accustomed, for many years, to having $/ = "\r\n" by default on any
Linux box I worked on, that I thought of anything but a different value
for it.

Of course, my memory may very well fail me on this. At this point, mine
is just a curiosity like "did $/ change at some point in Ubuntu, or
better: in Perl as packaged for Ubuntu?" But just an unimportant
curiosity, really, no need to reply unless you really have the answer
off the top of your head.

Thanks,
Marco
--
http://mfioretti.com

Re: SOLVED (sort of): perl script prints parts of strings in the wrong order

Colin Watson
On Tue, Aug 07, 2018 at 08:07:05AM +0200, M. Fioretti wrote:
> On 2018-08-07 07:56, Olivier wrote:
> > According to the Perl documentation, chomp removes the ending of a line that
> > corresponds to the current value of $/, so it really depends on what you set
> > $/ to.
>
> I know that. That is the reason why I was caught off guard. I have become
> so accustomed, for many years, to having $/ = "\r\n" by default on any Linux
> box I worked on, that I thought of anything but a different value for it.

That certainly isn't my memory, and I'd find it extremely surprising for
a Unix build of perl to have $/ set to anything but "\n" by default.  I
really think you're misremembering something here.

Perhaps you're used to doing file I/O via the :crlf layer (see
PerlIO(3perl)), which would convert raw \r\n to \n when reading it from
files?  But on Unix you'd have to be a bit careful to apply that only on
input, not output.
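
As a rough sketch of what reading through the :crlf layer looks like, the
example below writes a small DOS-style file and reads it back with the layer
applied to the input handle only. The helper read_lines_crlf, the sample
contents and the use of File::Temp are assumptions made for this
illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a file through the :crlf layer and return its
# lines with the terminator removed by a plain chomp.
sub read_lines_crlf {
    my ($path) = @_;
    open my $in, '<:crlf', $path or die "Cannot open $path: $!";
    chomp(my @lines = <$in>);
    close $in;
    return @lines;
}

# Build a small DOS-style sample file; :raw keeps the \r\n bytes as written.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out, ':raw';
print {$out} "first\r\nsecond\r\n";
close $out or die "Cannot close $path: $!";

my @lines = read_lines_crlf($path);
print "[$_]\n" for @lines;
```

Because the layer turns each \r\n pair into \n while reading, the default
chomp is enough and no \r reaches the output.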

A quick web search finds plenty of old posts asking similar questions,
indicating that this isn't a recent change.  For instance:

  https://www.perlmonks.org/?node_id=549385
  https://www.perlmonks.org/?node_id=504626
  https://stackoverflow.com/questions/650743/in-perl-how-to-remove-m-from-a-file
  https://perlmaven.com/chomp

> Of course, my memory may very well fail me on this. At this point, mine
> is just a curiosity like "did $/ change at some point in Ubuntu, or better:
> in Perl as packaged for Ubuntu?" But just an unimportant curiosity, really,
> no need to reply unless you really have the answer off the top of your
> head.

I don't have any properly old environments to hand right now, but at
least Debian 7 and Ubuntu 12.04 behaved the same way.  A quick test
case:

  $ perl -e '$_ = "foo\r\n"; chomp; print' | od -tx1
  0000000 66 6f 6f 0d
  0000004

--
Colin Watson                                       [[hidden email]]

Re: SOLVED (sort of): perl script prints parts of strings in the wrong order

M. Fioretti-2
On 2018-08-07 10:08, Colin Watson wrote:
> ... I really think you're misremembering something here.

probably yes, and this explanation you gave sounds quite plausible:

> Perhaps you're used to doing file I/O via the :crlf layer (see
> PerlIO(3perl)), which would convert raw \r\n to \n when reading it from
> files?  But on Unix you'd have to be a bit careful to apply that only on
> input, not output.

thanks again!

Marco

--
http://mfioretti.com
