Printing Reverse Complement Of Dna In Single-Line Perl
6
2
Entering edit mode
11.6 years ago
soosus ▴ 90

I want to write a quick single-line perl script to produce the reverse complement of a sequence of DNA. The following isn't working for me, however:

$ cat sample.dna.sequence.txt | perl -ne '{while (<>) {$seq = $_; $seq =~ tr /atcgATCG/tagcTAGC/; $revComp = reverse($seq); print $revComp;}}'

Any suggestions?

perl bash sequence • 17k views
ADD COMMENT
1
Entering edit mode

The simpler version is: cat test.txt | perl -pe 'chomp;tr/ACGTacgt/TGCAtgca/;$_=reverse."\n"'

ADD REPLY
0
Entering edit mode

an even simpler version is cat test.txt | perl -pe 'tr/ACGTacgt/TGCAtgca/;reverse'

ADD REPLY
6
Entering edit mode
9.0 years ago
Shicheng Guo ★ 9.6k

Why use single line. keep the code to your script fold and use these routine script everyday.

Like: perl /home/usr/bin/rcdna.pl ATCGATGCG

#!/usr/bin/perl
use strict;
my $dna=shift @ARGV;
my $rcdna= & reverse_complement_IUPAC($dna);
print "$rcdna\n";

sub reverse_complement_IUPAC {
        my $dna = shift;

        # reverse the DNA sequence
        my $revcomp = reverse($dna);

        # complement the reversed DNA sequence
        $revcomp =~ tr/ABCDGHMNRSTUVWXYabcdghmnrstuvwxy/TVGHCDKNYSAABWXRtvghcdknysaabwxr/;
        return $revcomp;
}
ADD COMMENT
5
Entering edit mode
11.6 years ago

shell script? you don't need perl:

tr -d "\n " < input.txt | tr "[ATGCatgcNn]" "[TACGtacgNn]" | rev
ADD COMMENT
2
Entering edit mode

Here is the modified solution that can handle IUPAC degeneracies:

tr -d "\n" < input.txt | tr "[ATGCUatgcuNnYyRrSsWwKkMmBbDdHhVv]" "[TACGAtacgaNnRrYySsWwMmKkVvHhDdBb]" | rev
ADD REPLY
0
Entering edit mode

Say I want to do it with perl, for illustrative purposes. How do I go about that? And why doesn't my code work?

ADD REPLY
0
Entering edit mode

you need to concatenate all the strings in the while lool. Once everything is read, call tr , reverse and print.

ADD REPLY
2
Entering edit mode
11.6 years ago

If you want to use Perl:

$ echo "CAAT" | perl -le '{$seq = <STDIN>; chomp $seq; $seq =~ tr /atcgATCG/tagcTAGC/; $seq = reverse($seq); print $seq; }'
ATTG

STDIN is a default handle, so you could just take that out for brevity. I like being explicit if it isn't too wordy, though.

ADD COMMENT
0
Entering edit mode

for this matter I usually go for this compact yet self-explanatory code inside my scripts

echo -n "CAAT" | perl -pe '{tr/atcgATCG/tagcTAGC/; $_=reverse}'

although I would agree that if reverse complementing is the only thing you want to do, you'll probably find shell scripting faster.

ADD REPLY
1
Entering edit mode
11.6 years ago
Asaf 10k

This will give you the reverse complement of each line, try something like:

$tail -n +2 seq.fa | tr -d "\n" |tr "acgtACGT" "tgcaTGCA" |rev
ADD COMMENT
1
Entering edit mode
11.6 years ago
soosus ▴ 90
perl -0777pe's/\n //g; tr/ATGCatgcNn/TACGtacgNn/; $_ = reverse $_;' sample.sequence.txt
ADD COMMENT
0
Entering edit mode
8.7 years ago
winter_li ▴ 60

Hi , maybe you can write a perl script like the following ,

  use Bio::Seq;

  $seqobj = Bio::Seq->new(-seq => $seq);
  $Query = $seqobj->revcom()->seq;

Good luck !

ADD COMMENT

Login before adding your answer.

Traffic: 1639 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6