How to extract a subset of FASTA sequences from a multiline FASTA file?
3.6 years ago
sunnykevin97 ▴ 990

Hi

I have a multiline FASTA file:
>data1
AADFGHJKASDFG
ASDFGHJKLFGHJK
ZXCVBNMMMMM
>data2
POIUYTREWPOIU
LKJHGFDSSALKJH
MNBVCXZLKJHGF
>data3
QWERTYUIOQWE
ERTYUIRTYUITYUI
ASDFGHJSDFGHS

I am interested in extracting only the >data2 sequence into a text file:

>data2
POIUYTREWPOIU
LKJHGFDSSALKJH
MNBVCXZLKJHGF

An option that reads the IDs from a file would be great, since I have thousands of IDs in a text file. I tried grep commands, but they extract only one line of sequence:

>data2
POIUYTREWPOIU

Help!

RNA DNA • 1.6k views

It's not working; it generates an empty file.

faSomeRecords fsd.fa mfsd.txt out.fa


You need to list the plain FASTA header names, one per line, in mfsd.txt. The names must not include the leading > character.
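If the ID list was built from header lines and still carries the > prefix, it can be cleaned up first. This is a minimal sketch, assuming the IDs sit one per line in mfsd.txt as in the command above; ids.txt is a hypothetical output name, and the sample IDs are taken from the question for illustration only.

```shell
# Sample ID list that still has the ">" prefix (illustration only;
# skip this line when working with a real mfsd.txt).
printf '>data2\n>data3\n' > mfsd.txt

# Strip a leading ">" and any trailing whitespace so that each line
# holds a bare header name.
sed -e 's/^>//' -e 's/[[:space:]]*$//' mfsd.txt > ids.txt

cat ids.txt

# The cleaned list can then be passed to the same command as before:
#   faSomeRecords fsd.fa ids.txt out.fa
```

Trailing whitespace is removed as well because stray spaces or Windows line endings in the list are another plausible reason for names not matching.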


Other than faSomeRecords, are there any tools that can do the same? It doesn't have an append (>>) option, so it doesn't solve my problem yet. Help!


What do you mean it does not have an append option? All selected sequences should be in out.fa. If you are doing this for different datasets, just cat the output files together: cat out1.fa out2.fa > final.fa

You need to improvise (especially if you are not a programmer yourself) when needed. You will rarely find tools that do exactly what you need since you may be the only person needing a tool that works like that.
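As one way to improvise without faSomeRecords, plain awk can pull out whole multiline records by ID. This is a rough sketch rather than a tested replacement for any tool: it assumes the ID file (here the hypothetical ids.txt) holds bare names without >, and that the ID is the first whitespace-delimited word of each header. The sample data is copied from the question.

```shell
# Sample input from the question (skip these two steps with real data).
cat > fsd.fa <<'EOF'
>data1
AADFGHJKASDFG
ASDFGHJKLFGHJK
ZXCVBNMMMMM
>data2
POIUYTREWPOIU
LKJHGFDSSALKJH
MNBVCXZLKJHGF
>data3
QWERTYUIOQWE
ERTYUIRTYUITYUI
ASDFGHJSDFGHS
EOF
printf 'data2\n' > ids.txt

# First file (ids.txt): remember every wanted ID.
# Second file (fsd.fa): on each header line, decide whether the record is
# wanted; that decision stays in effect for the sequence lines that follow.
awk 'NR == FNR { want[$1] = 1; next }
     /^>/      { keep = (substr($1, 2) in want) }
     keep' ids.txt fsd.fa > out.fa

cat out.fa
```

Because awk writes to standard output, the redirection is up to the shell, so > out.fa can be swapped for >> out.fa to append to an existing file.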


I tried the same, it worked.

Thanks
