I want to write Perl code that checks for stop codons and replaces them with NNN. I have written the following code:
#!/usr/bin/perl
use strict;
use warnings;
# Check if the file name is provided as an argument
my $file = $ARGV[0];
open(my $fh, "<", $file) or die "Unable to open file";
my $sequence = "";
my $id = "";
while (my $line = <$fh>) {
chomp($line);
if ($line =~ /^>/) {
if ($sequence ne "") {
# Split sequence into codons
my @codon = $sequence =~ /.{1,3}/g;
print join(" ", @codon), "\n";
print $id, "\n";
# Check for stop codons and replace them with "NNN"
foreach my $codon (@codon) {
if ($codon =~ /^(TAG|TGA|TAA)/) {
$codon = "NNN";
}
}
}
$sequence = "";
$id = $line;
} else {
$sequence .= $line;
}
}
# Print last sequence
if ($sequence) {
my @codon = $sequence =~ /.{1,3}/g;
print join(" ", @codon), "\n";
print $id, "\n";
}
close($fh) or die "Unable to close file";
It should take a FASTA file from the command line and process it: split each sequence into codons (groups of three bases) and replace every stop codon with NNN.
My input sequence looks like this:
>header
ATGGACCAGCAGCAGCAGCAGCAGTAA
I was expecting something like:
>header
ATGGACCAGCAGCAGCAGCAGCAGNNN
Also, it did not process the last sequence in the file, and I got this output:
>header
ATG GAC CAG CAG CAG CAG CAG CAG TAA
In addition, the header of the first sequence and the sequence under the last header were missing.
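The substitution did not occur because the logic of your program is incorrect. With the single-record input you show, the following condition is never true, because $sequence is still empty when the only header line is read, so your replacement code never gets executed:
if ($sequence ne "")
Even for a file with several records, where the condition would be true, the codons are printed before the replacement loop runs, so the output would still show the stop codon. Then, in the # Print last sequence code, you don't try to do the substitution at all.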
Here is a self-contained example that does the substitution:
use warnings;
use strict;
while (my $line = <DATA>) {
chomp($line);
if ($line =~ /^>/) {
print "$line\n";
} else {
# Split sequence into codons
my @codon = $line =~ /.{1,3}/g;
# Check for stop codons and replace them with "NNN"
foreach my $codon (@codon) {
if ($codon =~ /^(TAG|TGA|TAA)/) {
$codon = "NNN";
}
}
print join(" ", @codon), "\n";
}
}
__DATA__
>header
ATGGACCAGCAGCAGCAGCAGCAGTAA
Output:
>header
ATG GAC CAG CAG CAG CAG CAG CAG NNN
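The example above treats each line as a complete sequence. As a rough sketch of how the same idea could be applied to a file named on the command line, where a sequence may be wrapped over several lines, the version below collects each record first and only then splits it into codons. The helper names mask_stop_codons and mask_fasta and the script name mask_stops.pl are made up for illustration; the sketch also assumes the reading frame starts at the first base and that bases are uppercase.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): replace in-frame stop
# codons in one complete sequence with "NNN". Assumes uppercase bases and
# a reading frame that starts at the first base.
sub mask_stop_codons {
    my ($sequence) = @_;
    # Split into codons; a trailing partial codon (1-2 bases) is kept as-is.
    my @codons = $sequence =~ /.{1,3}/g;
    for my $codon (@codons) {
        # $codon aliases the array element, so this changes @codons in place.
        $codon = 'NNN' if $codon =~ /^(?:TAG|TGA|TAA)$/;
    }
    return join '', @codons;
}

# Hypothetical helper: read FASTA records from a filehandle, joining wrapped
# sequence lines, and return a list of header and masked-sequence lines.
sub mask_fasta {
    my ($fh) = @_;
    my @out;
    my ($id, $sequence) = (undef, '');

    # Emit the record collected so far, if a header has been seen.
    my $flush = sub {
        return unless defined $id;
        push @out, $id, mask_stop_codons($sequence);
    };

    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^>/) {
            $flush->();                      # finish the previous record
            ($id, $sequence) = ($line, '');  # start a new one
        }
        else {
            $sequence .= $line;
        }
    }
    $flush->();    # the last record has no header after it
    return @out;
}

# Assumed usage: perl mask_stops.pl input.fasta
if (@ARGV) {
    open(my $fh, '<', $ARGV[0]) or die "Unable to open $ARGV[0]: $!";
    print "$_\n" for mask_fasta($fh);
    close($fh) or die "Unable to close $ARGV[0]: $!";
}
```

Doing the replacement in one helper that is called both when a new header is seen and once more after the loop means the last record is handled the same way as all the others. The sequence is joined back without spaces here to match the expected output in the question; use join ' ' instead to keep the codons separated.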
See also: bioperl