One way to control line endings in Perl is to work with text files in binary mode. Usually this is not necessary, because Perl tries to be intelligent about how it manages line endings: in text mode it uses the native line ending for the platform it’s running on, so LF on Unix and CRLF on Windows. However, this is not always the desired behavior. One use case for Perl is to make automatic changes to existing text files, and one way to do this is to read the file, make the changes, and then rewrite the file. The desired behavior in this case is to leave the line endings as they were, but the default Perl behavior might change them. On Windows, for example, a file with LF endings that is read and written in text mode comes back with CRLF endings.
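To make the problem concrete, here is a minimal sketch of what a text-mode round trip does to mixed line endings. It forces the `:crlf` layer on in-memory filehandles so that it should behave the same on any platform; on Windows this layer is normally applied by default. The helper name `roundtrip_crlf` is made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: pass $text through a text-mode read and write,
# with the :crlf layer forced on so the platform does not matter.
sub roundtrip_crlf {
    my ($text) = @_;
    my $out = '';
    open my $in,  '<:crlf', \$text or die "cannot open input buffer: $!";
    open my $dst, '>:crlf', \$out  or die "cannot open output buffer: $!";
    while (my $line = <$in>) {
        print {$dst} $line;    # the layer has already turned CRLF into "\n"
    }
    close $in;
    close $dst or die "cannot close output buffer: $!";
    return $out;
}

# One CRLF line followed by one LF line.
my $result = roundtrip_crlf("one\r\ntwo\n");

# Make the line endings visible before printing.
(my $shown = $result) =~ s/\r/\\r/g;
$shown =~ s/\n/\\n/g;
print "$shown\n";
```

The line that went in with a bare LF should come out with CRLF, which is exactly the kind of silent change that the binary-mode approach avoids.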
The following Perl example works around this issue by reading and writing text files in binary mode:
#!/usr/bin/perl
use strict;
use warnings;

my ($fin_name, $fout_name) = @ARGV;

# Open files in binary mode
open my $fin, '<:raw', $fin_name or die "input file error $fin_name: $!";
open my $fout, '>:raw', $fout_name or die "output file error $fout_name: $!";

sub process_line {
    my ($line, $eol) = @_;
    $line =~ s/foo/bar/g;
    print $fout $line;
    print $fout $eol;
}

# Read the input file in blocks and scan for lines
my $block = "";
while (read $fin, $block, 128, length($block)) {
    # A "\r" at the very end of the buffer is left for the next pass,
    # because it may be the first half of a CRLF split across two blocks
    while ($block =~ /\A([^\r\n]*)(\r\n|\n|\r(?!\z))/) {
        process_line($1, $2);    # Process the line, keeping the ending
        $block = substr $block, $+[0];
    }
}

# Write what is left: a last line with no ending, or one ending in a lone CR
process_line($1, $2) if length($block) && $block =~ /\A([^\r\n]*)(\r?)\z/;

close $fin;
close $fout or die "output file error $fout_name: $!";
The example uses Perl’s read function, which returns binary data in blocks of up to a requested size (128 bytes here) rather than line by line. Each block is appended to the end of a variable, and after each block the variable is scanned for complete lines. Each line is processed and then removed from the variable. The processing function is given the line data without its ending, and the line ending separately, so after processing the line can be written to the output file with its original line ending.
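The scan-and-split step described above can be tried on its own, without any files. The following is only a sketch: `split_lines` is a hypothetical helper, and it assumes that LF, CRLF and a lone CR are the only line endings of interest.

```perl
use strict;
use warnings;

# Hypothetical helper: split a string into [text, ending] pairs.
# Assumes only LF, CRLF and a lone CR count as line endings; a final
# line with no ending is paired with an empty string.
sub split_lines {
    my ($buffer) = @_;
    my @pairs;
    # \G continues from the end of the previous match; /c keeps that
    # position when the match finally fails.
    while ($buffer =~ /\G([^\r\n]*)(\r\n|\n|\r)/gc) {
        push @pairs, [$1, $2];
    }
    my $rest = substr $buffer, pos($buffer) // 0;
    push @pairs, [$rest, ''] if length $rest;
    return @pairs;
}

# Change the text of each line and put the original ending back.
my $input  = "foo one\r\nfoo two\nfoo three";
my $output = '';
for my $pair (split_lines($input)) {
    my ($line, $eol) = @$pair;
    $line =~ s/foo/bar/g;
    $output .= $line . $eol;
}

print $output eq "bar one\r\nbar two\nbar three"
    ? "endings preserved\n"
    : "endings changed\n";
```

Keeping the text and the ending as a pair means the processing code never has to know which platform the file came from, and a file with mixed endings stays mixed.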