Question

Extracting the aligned sequence from a BLAST output in a multi-fasta file

0

6.1 years ago

Ming ▴ 110

Dear All,

I am trying to extract the aligned sequences for my queries from a BLAST output into a single multi-FASTA file. How do I go about doing so?

Thank you in advance.

blast • 5.2k views

Updated 6.1 years ago by flogin ▴ 280 • written 6.1 years ago by Ming ▴ 110

score 2 · Answer 1 · 2019-04-05

2

6.1 years ago

flogin ▴ 280

What is your BLAST output format? And which sequences do you want to extract: queries or subjects?

If your output is in tabular format (-outfmt 6), you can use the query/subject names and the query/subject start and end positions.

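For example, to extract the regions of the subjects that show a match, you can cut the columns for subject name (2), subject start (9) and subject end (10), and pass them as intervals to bedtools getfasta (https://bedtools.readthedocs.io/en/latest/content/tools/getfasta.html). Note that BLAST coordinates are 1-based and that the subject start is larger than the subject end for hits on the minus strand, whereas BED intervals are 0-based with start smaller than end, so the columns need converting first.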
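The column-cutting step above could be sketched in Python as follows. This is only a minimal sketch, assuming the default -outfmt 6 column order (qseqid, sseqid, ..., sstart, send in columns 9 and 10); the function name blast6_to_bed is made up for illustration.

```python
def blast6_to_bed(lines):
    """Turn BLAST tabular (-outfmt 6) lines into BED6 lines for the subject hits.

    Assumes the default 12-column layout: sseqid is column 2,
    sstart is column 9 and send is column 10 (all 1-based).
    """
    bed = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines (e.g. from -outfmt 7).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        qseqid, sseqid = fields[0], fields[1]
        sstart, send = int(fields[8]), int(fields[9])
        # On minus-strand hits BLAST reports sstart > send.
        strand = "+" if sstart <= send else "-"
        # BED is 0-based, half-open: shift the smaller coordinate down by one.
        start = min(sstart, send) - 1
        end = max(sstart, send)
        # The query name goes in the BED name column; the score is a placeholder.
        bed.append(f"{sseqid}\t{start}\t{end}\t{qseqid}\t0\t{strand}")
    return bed
```

Writing the returned lines to a file (say hits.bed, a hypothetical name) should then allow something like `bedtools getfasta -fi subjects.fasta -bed hits.bed -s -fo hits.fasta`, where -s reverse-complements the minus-strand hits. The FASTA file names here are placeholders as well.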

If you need the whole sequences (regardless of the aligned region), you can collect the names of the sequences and use the seqtk tool (Seqtk subseq: structure of file name.lst).
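For the whole-sequence case, the name list that `seqtk subseq in.fa name.lst` expects is just one sequence name per line. As a rough illustration, the same idea can be sketched in plain Python; the function names hit_names and extract_records are hypothetical, and the sketch assumes the FASTA IDs (the first word after ">") match the names in the BLAST table.

```python
def hit_names(blast_lines, column=1):
    """Collect unique names from a BLAST tabular file, in order of appearance.

    column=1 takes the subject name (second column); use column=0 for queries.
    """
    names = []
    seen = set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        name = line.rstrip("\n").split("\t")[column]
        if name not in seen:
            seen.add(name)
            names.append(name)
    return names


def extract_records(fasta_lines, names):
    """Return the FASTA lines of every record whose ID is in names."""
    wanted = set(names)
    keep = False
    out = []
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # The record ID is the first whitespace-delimited word of the header.
            header = line[1:].split()
            keep = bool(header) and header[0] in wanted
        if keep:
            out.append(line)
    return out
```

Writing the output of hit_names to a file, one name per line, gives a list that can be passed to seqtk subseq instead of using extract_records.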

0

There are probably more efficient solutions, but I'm still a beginner in bioinformatics.

6.1 years ago by flogin ▴ 280

1

@flogin, thank you very much for pushing me in the correct direction!

6.1 years ago by Ming ▴ 110

0

You're welcome !!! :D

6.1 years ago by flogin ▴ 280