Question

Extraction Of Header Of Sequences In Fasta File

2

13.0 years ago

Mohammad Reza Bakhtiarizadeh ▴ 350

Hi all, I have a FASTA file from which I want to extract just the sequence headers. Is there any Perl code or something similar to do that? Thanks a lot in advance.

regards

perl fasta python parsing • 50k views

ADD COMMENT • link updated 2.7 years ago by Michael 55k • written 13.0 years ago by Mohammad Reza Bakhtiarizadeh ▴ 350

1

By "header", you mean everything after the ">"? Or just some part of everything after the ">"? Or including the ">"? It's important to be specific since a lot of people misunderstand "header".

ADD REPLY • link 13.0 years ago by Neilfws 49k
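
To make the three readings of "header" concrete, here is a minimal shell sketch. The file example.fasta and its two records are made up for illustration, and taking "the identifier" to mean the text up to the first whitespace is an assumption, not something the thread defines.

```shell
# Hypothetical two-record FASTA file, created only for this illustration.
workdir=$(mktemp -d)
printf '>seq1 first record\nACGT\n>seq2 second record\nGGCC\n' > "$workdir/example.fasta"

# 1) Whole header line, including the leading ">".
with_marker=$(grep '^>' "$workdir/example.fasta")

# 2) Everything after the ">".
without_marker=$(sed -n 's/^>//p' "$workdir/example.fasta")

# 3) Only the identifier: the text after ">" up to the first whitespace.
ids_only=$(awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$workdir/example.fasta")

printf '%s\n' "$with_marker" "$without_marker" "$ids_only"
```

Which of the three is wanted decides which command fits, so it is worth stating up front.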

0

I just want everything after the ">". I should say that I am not familiar with Perl, so I need Perl code that I can run as is. Please help me if possible. Thanks a lot. Regards

ADD REPLY • link 13.0 years ago by Mohammad Reza Bakhtiarizadeh ▴ 350

0

err, why don't you just post your code then?

ADD REPLY • link 13.0 years ago by Michael 55k

score 16 · Answer 1 · 2012-01-07

16

13.0 years ago

Frédéric Mahé ★ 3.2k

For perl code, you can visit http://www.bioperl.org/wiki/Main_Page. If you just want to extract the headers, on a Linux/Unix system, a simple grep "^>" myfile.fasta should work.

ADD COMMENT • link 13.0 years ago by Frédéric Mahé ★ 3.2k

Ram · Answer 2 · 2012-01-08

14

13.0 years ago

Michael 55k

Why so complicated? ;) Only the header lines in a FASTA file contain ">", so you can use grep:

grep -e ">" my.fasta

or awk to remove the >:

$ awk 'sub(/^>/, "")' 
>aksdjfljfd
aksdjfljfd

ADD COMMENT • link updated 5.4 years ago by Ram 44k • written 13.0 years ago by Michael 55k
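
A short sketch of why the awk one-liner works, assuming a made-up file my.fasta with two records: sub() returns the number of substitutions it made, so it is non-zero only on lines that start with ">", and an awk pattern without an action prints the line, which by then has already lost its ">".

```shell
# Hypothetical two-record FASTA file, created only for this illustration.
workdir=$(mktemp -d)
printf '>seq1 first record\nACGT\n>seq2 second record\nGGCC\n' > "$workdir/my.fasta"

# Short form from the answer: sub() is used as the pattern,
# and the default action prints the modified line.
short_form=$(awk 'sub(/^>/, "")' "$workdir/my.fasta")

# Equivalent long form with the test and the print written out.
long_form=$(awk '{ if (sub(/^>/, "") > 0) print $0 }' "$workdir/my.fasta")

printf '%s\n' "$short_form"
```

Both forms give the same output; the long one is just easier to read for someone new to awk.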

0

Thanks so much, but I am not familiar with Perl code. I need complete code that I can run. If possible, please guide me further. Thanks again

ADD REPLY • link 13.0 years ago by Mohammad Reza Bakhtiarizadeh ▴ 350

0

Thanks. I fixed my problem. regards

ADD REPLY • link 13.0 years ago by Mohammad Reza Bakhtiarizadeh ▴ 350

0

this is not perl, it's unix ;)

ADD REPLY • link 13.0 years ago by Michael 55k

0

What if I want to extract the headers together with the sequences that belong to them?

ADD REPLY • link 4.4 years ago by bioinfo • 0
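
One possible reading of this question is "one line per record, header and sequence side by side". A minimal awk sketch under that assumption follows; the file example.fasta, its contents, and the tab-separated output layout are all made up for illustration.

```shell
# Hypothetical FASTA file; the first record spans two sequence lines.
workdir=$(mktemp -d)
printf '>seq1 first record\nACGT\nTTAA\n>seq2 second record\nGGCC\n' > "$workdir/example.fasta"

# Print each record as: header (without ">") <TAB> sequence joined into one line.
table=$(awk '
    /^>/ {
        if (header != "") print header "\t" seq   # flush the previous record
        header = substr($0, 2)                    # drop the leading ">"
        seq = ""
        next
    }
    { seq = seq $0 }                              # append a sequence line
    END { if (header != "") print header "\t" seq }
' "$workdir/example.fasta")

printf '%s\n' "$table"
```

If the goal is instead to pull out only some records, the same structure can be reused with a condition on the header line.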

0

$ awk 'sub(/^>/, "")' your_file.fasta > desired_headers.txt

This is a single-line code solution. No residual ">" before each header.

Learned from the blog "AWK: the substr command to select a substring" by Thomas Cokelaer

https://thomas-cokelaer.info/blog/2011/05/awk-the-substr-command-to-select-a-substring/

ADD REPLY • link 2.7 years ago by ursadhip • 0

0

Yeah, sure, no rocket science ....

ADD REPLY • link 2.7 years ago by Michael 55k

Ram · Answer 3 · 2012-01-07

7

13.0 years ago

Caddymob ★ 1.0k

The expression in Perl would be basically the same as in the grep above (m/^>/). There are easier one-liner ways to do this, but this is a basic outline of the Perl code that should be pretty readable.

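#!/usr/bin/perl
use strict;
use warnings;

# Open the FASTA file for reading; stop with a message if it cannot be opened.
open(my $fasta, '<', 'your.fa') or die "Cannot open your.fa: $!";
while (my $line = <$fasta>) {
    chomp($line);
    # Header lines start with ">".
    if ($line =~ m/^>/) {
        print "$line\n";
    }
}
close($fasta);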

ADD COMMENT • link updated 5.4 years ago by Ram 44k • written 13.0 years ago by Caddymob ★ 1.0k

0

Thanks so much. Your code works, but how can I write the output to a text file? I am not familiar with Perl code.

thanks

ADD REPLY • link 13.0 years ago by Mohammad Reza Bakhtiarizadeh ▴ 350
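
For writing the output to a text file, the usual approach is shell redirection rather than changing the script. A minimal sketch, using grep so that it runs on its own; the file contents, headers.txt, and the script name extract_headers.pl in the comment are hypothetical.

```shell
# Hypothetical input file, created only for this illustration.
workdir=$(mktemp -d)
printf '>seq1 first record\nACGT\n>seq2 second record\nGGCC\n' > "$workdir/your.fa"

# ">" after a command sends its standard output to a file (overwriting it).
grep '^>' "$workdir/your.fa" > "$workdir/headers.txt"

# The same redirection works for a script that prints to the screen, e.g.:
#   perl extract_headers.pl > headers.txt
# (extract_headers.pl is an assumed name for the Perl code saved to disk)

cat "$workdir/headers.txt"
```

Use ">>" instead of ">" to append to an existing file rather than overwrite it.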

0

Thanks so much, I fixed my problem. Regards

ADD REPLY • link 13.0 years ago by Mohammad Reza Bakhtiarizadeh ▴ 350