Fasta file edition
7.8 years ago
vishwaas1704 ▴ 30

Is there a way, from the command line, to add a specific nucleotide string to the start of every sequence in a multi-FASTA file? For example, I want to add a primer sequence to the start of each sequence.

>sequence
ATGCATGCATGC

REQUIRED

>SEQUENCE
PRIMER + ATGCATGCATGC

Thanks in advance.

sequence • 2.0k views

Can you take a look to see if that is what you are looking for (I adjusted the format)?

I think what you want is

>sequence
PRIMER + ATGCATGCATGC

or

>sequence
PRIMERATGCATGCATGC

Yes, actually I just wanted to show that I want to add something in front of the sequence. Primer sequence: GCANNTGTNNGCATNNGCNAA

I should have written FROM

>fasta identifier
original sequence

TO

>fasta identifier
GCANNTGTNNGCATNNGCNAAoriginal sequence

Then use the answer @Alex provided below.

7.8 years ago

Assuming your FASTA records are single-line:

$ awk -vPRIMER="ACTG" '{if ($0~/^>/) { print $0; } else { print PRIMER$0; }}' in.fa > out.fa

If not, you can pre-process your FASTA file to make it single-line.
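If you would rather not linearize first, here is a minimal sketch that adds the primer only to the first sequence line after each header, so it should work for both wrapped and single-line records. The function name prepend_primer and its argument order are made up for illustration; they are not from the answer above.

```shell
# Hypothetical helper (name and arguments are assumptions):
#   $1 = primer string, $2 = input FASTA file
prepend_primer() {
    awk -v primer="$1" '
        # Header line: print it unchanged and flag the next sequence line.
        /^>/ { print; first = 1; next }
        # First non-empty sequence line of the record: prefix the primer.
        first && NF { print primer $0; first = 0; next }
        # Any further wrapped sequence lines pass through untouched.
        { print }
    ' "$2"
}
```

Usage would be something like prepend_primer GCANNTGTNNGCATNNGCNAA in.fa > out.fa. Note that the first line of each record becomes longer than the others, which most tools tolerate but some strict parsers may not.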


Thank you alex and thank you genomeax2


Can this code be modified to add a primer sequence to the end of the complete sequence generated by the previous code? I assume we would already supply the reverse-complemented primer, so the code would not need to handle that step.
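One possible way to do this, as a sketch only: join the lines of each record and print the primer after the last base. It assumes the primer passed in is already reverse-complemented, as described in the question, and the function name append_primer is hypothetical.

```shell
# Hypothetical helper (name and arguments are assumptions):
#   $1 = primer string (already reverse-complemented), $2 = input FASTA file
append_primer() {
    awk -v primer="$1" '
        # New header: flush the previous record with the primer appended.
        /^>/ { if (seq != "") print seq primer; seq = ""; print; next }
        # Sequence line: join wrapped lines into one string.
        { seq = seq $0 }
        # End of file: flush the last record.
        END { if (seq != "") print seq primer }
    ' "$2"
}
```

Running append_primer on the output of the earlier command would give records with a primer at both ends. As a side effect, each sequence is written on a single line.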

7.8 years ago
Daniel ★ 4.0k

The first ever Perl script I wrote does this! It adds a different barcode based on a common FASTA header identifier, which I think is what you want?

All you do is run add_barcode.pl my_file.fasta; it then asks for the shared ID in the header and for the barcode you want to add. Not the best code in hindsight, but it does the job!

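#!/usr/bin/perl
use 5.010;

# Show usage and stop if no input file was given.
die "USAGE: add_barcode.pl my_file.fasta\n" unless @ARGV;

# The output name is the input name with its extension stripped.
chomp($file = $ARGV[0]);
$file =~ s/\..*//;

print "what is the match (case-insensitive):\n";
chomp($site = <STDIN>);

print "what is the sites barcode:\n";
chomp($barcode = <STDIN>);

# Read every line of the file named on the command line.
chomp(@lines = <>);

$count = 0;
open MODIFIED, ">$file" . "_barc.fas" or die "cannot write output: $!\n";
select MODIFIED;
foreach (@lines){
        # Test header lines only, so a sequence line can never match.
        if (/^>/ && /$site/i) {
        # No newline after the barcode: the next sequence line continues it.
        print "$_\n$barcode";
        $count += 1;
        }else{
        print "$_\n";
        }
}
close MODIFIED;
select STDOUT;

print "modified $site $count times\n";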

Thank you very much Daniel !


vishwaas1704: If the above answer(s) have solved your question, then please accept one (or more) of them (green check mark). That indicates that the question has been solved.
