How to extract sequences from multiple fasta files according to my list files
7.8 years ago

I ran an orthology analysis on four FASTA files from four different species and obtained a list of orthologous genes. For example:

orthologous gene 1: 1st species (gene_id1), 2nd species (gene_id2), 3rd species (gene_id3), 4th species (gene_id4)
orthologous gene 2: 1st species (gene_id1'), 2nd species (gene_id2'), 3rd species (gene_id3'), 4th species (gene_id4')
......

I would like to extract the sequences of each orthologous gene from the four species' FASTA files and write them to separate FASTA files, one per orthologous gene, according to the list. How can I do this?

sequence blast • 1.8k views
7.8 years ago
EVR ▴ 610

seqtk subseq -l 100000 input.fasta gene_list.txt
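For readers without seqtk, the same kind of ID-based extraction can be sketched in plain Python. This is only an illustration, not how seqtk is implemented: the function names `read_fasta` and `extract_by_ids` are hypothetical, and the sketch assumes the list file holds one sequence ID per line and that a record's ID is the first word of its header.

```python
from typing import Iterator, Set, Tuple


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs; the header excludes the leading '>'."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None and line:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def extract_by_ids(fasta_path: str, list_path: str, out_path: str) -> int:
    """Copy records whose ID is in the list file; return how many were written."""
    # Assumption: one ID per line; blank lines are ignored.
    with open(list_path) as handle:
        wanted: Set[str] = {line.split()[0] for line in handle if line.strip()}
    written = 0
    with open(out_path, "w") as out:
        for header, seq in read_fasta(fasta_path):
            words = header.split()
            seq_id = words[0] if words else ""
            if seq_id in wanted:
                # Each sequence is written on a single, unwrapped line.
                out.write(f">{header}\n{seq}\n")
                written += 1
    return written
```

For example, `extract_by_ids("input.fasta", "gene_list.txt", "subset.fasta")` would write the matching records to `subset.fasta` (a hypothetical output name).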


Thank you for the information. May I ask a follow-up question? Does gene_list.txt need a specific format? Also, I have four input FASTA files. How do I extract the orthologous sequences from the four separate FASTA files into one FASTA file per orthologous gene?

gene_list.txt (one sequence ID per line):

gene_id1
gene_id2
gene_id3

Run the above command separately for each FASTA file you have.
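To get one FASTA file per orthologous gene rather than one per species, the per-file extraction has to be regrouped by ortholog. A minimal Python sketch of that regrouping follows. It assumes a whitespace-separated ortholog table (a hypothetical format) in which each line holds a group name followed by one gene ID per species, in the same order as the FASTA files passed in. The function names are illustrative only.

```python
import os
from typing import Dict, List


def index_fasta(path: str) -> Dict[str, str]:
    """Map each sequence ID (first word of the header) to its sequence."""
    parts: Dict[str, List[str]] = {}
    current = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                words = line[1:].split()
                current = words[0] if words else ""
                parts[current] = []
            elif current is not None and line:
                parts[current].append(line)
    return {seq_id: "".join(chunks) for seq_id, chunks in parts.items()}


def write_ortholog_fastas(table_path: str, fasta_paths: List[str],
                          out_dir: str) -> List[str]:
    """Write <group>.fasta for every line of the ortholog table."""
    # Assumption: all four FASTA files fit in memory.
    indexes = [index_fasta(path) for path in fasta_paths]
    os.makedirs(out_dir, exist_ok=True)
    written = []
    with open(table_path) as table:
        for line in table:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            group, gene_ids = fields[0], fields[1:]
            out_path = os.path.join(out_dir, f"{group}.fasta")
            with open(out_path, "w") as out:
                # The i-th gene ID is looked up in the i-th FASTA file.
                for gene_id, index in zip(gene_ids, indexes):
                    if gene_id in index:  # IDs not found are skipped silently
                        out.write(f">{gene_id}\n{index[gene_id]}\n")
            written.append(out_path)
    return written
```

A call such as `write_ortholog_fastas("orthologs.txt", ["sp1.fasta", "sp2.fasta", "sp3.fasta", "sp4.fasta"], "ortholog_groups")` (all file names hypothetical) would then produce one file per table line. If gene IDs are not unique across species, prefixing each header with a species label may be worth adding.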
