New example. I used a \t (tab) to separate the new column from the old text, but you can use whatever separator you want.
use strict;
use warnings;
use 5.020;
use autodie;
use Data::Dumper;
my @new_column = (
"Lineage",
"cellular organisms; XXX;",
"cellular organisms; YYY;",
);
open my $INFILE, '<', 'data.txt'; #Standard way to open a file for reading (autodie above handles errors opening file)
open my $OUTFILE, '>', 'new_data.txt'; #Standard way to open a file for writing
my $i = 0;
while (my $line = <$INFILE>) { #Read the file line by line.
    chomp $line; #Remove the newline at the end of the line.
    say {$OUTFILE} "$line\t$new_column[$i]"; #Write to the file. $new_column[0] is the first element in @new_column.
    ++$i;
}
close $INFILE;
close $OUTFILE;
Perl has facilities that allow you (seemingly) to do inplace editing of a file, just like awk. Perl can create a backup file first, so that you don't lose your original data if perl crashes while you are editing the file. That matters because if you open a file in write mode, the file is erased, and if perl then crashes, goodbye data.
The perl variable $^I is set to undef by default. If you set $^I to a string, then you turn on inplace editing. After turning on inplace editing, it takes a few more steps to set things up properly. Once things are set up for inplace editing, any output printed without an explicit filehandle (which would normally go to stdout) gets written to the file.
Starting file:
$ cat data.txt
Taxon Id Common Name Scientific Name
9606 human Homo sapiens
9483 white-tufted-ear marmoset Callithrix jacchus
Perl script:
use strict;
use warnings;
use 5.020;
use autodie;
use Data::Dumper;
my @new_column = (
"Lineage",
"cellular organisms; XXX;",
"cellular organisms; YYY;",
);
{
    local $^I = ".bak"; #Blank string for no backup file.
    local @ARGV = "data.txt"; ####REQUIRED. Cannot use: while (my $line = <$INFILE>)
                              ## and get inplace editing
    my $i = 0;
    while (my $line = <>) { #The "diamond" operator reads from the files specified in @ARGV
        chomp $line;
        say "$line\t$new_column[$i]"; #Written to file.
        ++$i;
    }
}
Ending file:
$ cat data.txt
Taxon Id Common Name Scientific Name Lineage
9606 human Homo sapiens cellular organisms; XXX;
9483 white-tufted-ear marmoset Callithrix jacchus cellular organisms; YYY;
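If you want to convince yourself that the backup really holds the original data, here is a minimal sketch that runs the same kind of inplace edit in a throwaway directory and then reads both files back. The temporary directory, the two-line sample data, and the "EXTRA" column are made up for illustration; they are not part of the example above.

```perl
use strict;
use warnings;
use 5.020;
use autodie;
use File::Temp qw(tempdir);

# Work in a throwaway directory so no real data is touched.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/data.txt";    # hypothetical file, created just for this test

open my $OUT, '>', $file;
print {$OUT} "9606\thuman\n9483\tmarmoset\n";
close $OUT;

{
    local $^I   = '.bak';      # non-blank string: keep a backup
    local @ARGV = ($file);
    while (my $line = <>) {
        chomp $line;
        say "$line\tEXTRA";    # goes to the file, not the terminal
    }
}

# Read a whole file into one string.
sub slurp {
    my ($path) = @_;
    open my $IN, '<', $path;
    local $/;                  # undef record separator: read everything at once
    my $text = <$IN>;
    close $IN;
    return $text;
}

my $edited = slurp($file);
my $backup = slurp("$file.bak");
print "edited:\n$edited";
print "backup:\n$backup";
```

The edited file should show the extra column, while the file ending in .bak should still contain the two original lines.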
You said that you wanted to write the modified data to a new file. In that case, inplace editing is irrelevant, and it is not something you need to know about. So, look at the new example I posted.
If you want to learn about inplace editing anyway, then there are three possibilities:
$^I = undef; (the default)
$^I = '';
$^I = '.bak';
If you assign any string to $^I, it turns on inplace editing.
If you assign a blank string to $^I, no backup file will be created.
If you assign any non-blank string to $^I, a backup file will be created, and the name of the backup file will be: "originalname" . $^I. So if the original file name was 'data.txt' and the string assigned to $^I was '.1.2.3hello', then the backup file would be named: data.txt.1.2.3hello. If the original file name was 'my_data', and '-!-hello' was assigned to $^I, then the backup file would be named: 'my_data-!-hello'.
local tells perl to temporarily change the value of the specified global variable for the duration of the current scope. When the closing brace is encountered, perl magically restores all the variables declared to be local to the values they had before the opening brace was encountered. When using perl's predefined global variables, it's considered good practice to only change them for as long as you need them.
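To watch local do its save-and-restore trick, here is a small sketch that checks $^I before, inside, and after a block. The helper inplace_status and the @seen array are names I made up for the demonstration, and it assumes the script is run without the -i switch, so $^I starts out as undef.

```perl
use strict;
use warnings;
use 5.020;

# Hypothetical helper: report whether inplace editing is currently switched on.
sub inplace_status {
    return defined $^I ? "on, backup suffix '" . $^I . "'" : 'off';
}

my @seen;
push @seen, inplace_status();        # before the block: $^I is undef

{
    local $^I = '.bak';              # temporary value for this block only
    push @seen, inplace_status();    # code called from inside the block sees it
}

push @seen, inplace_status();        # after the closing brace: old value restored

say for @seen;
```

Run as an ordinary script, the first and last lines should say that inplace editing is off, and only the middle line should report the '.bak' suffix.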