How to extract CDS from FASTA files using protein IDs?
2.4 years ago
sunnykevin97 ▴ 990

Hi,

I've finally ended up with a good number of orthologs (1500).

I'd like to extract the CDS for each of these protein sequences.

Given an IDS.protein.txt file, I'd like to search the IDs against the CDS sequences of all 10 species.

For example -

cat  IDS.protein.txt

>g66422.t1
>XP_034399799.1
>g40125.t1
>g66683.t1
>g53726.t1
>g25019.t1
>g26815.t1
>LipSG014613.t1
>hadal41775
>evm.model.Contig3137_pilon.3_pasa2.longest.filter_rm

I'm interested in searching the 1st ID against the 1.fasta file, the 2nd ID against the 2.fasta file, and so on, up to the 10th ID against 10.fasta.

extracting ---

>g66422.t1 (1st ID) vs. **1.fasta**
>XP_034399799.1 (2nd ID) vs. **2.fasta**
....
....
....
....
>evm.model.Contig3137_pilon.3_pasa2.longest.filter_rm (10th ID) vs. **10.fasta**

Any suggestions, please?
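One possible approach, as a minimal Python sketch using only the standard library: read the IDs in order, pair the i-th ID with the file named i.fasta, and write the matching record to a single output file. The function names (`read_ids`, `fetch_record`, `extract_paired`) and the output file name are hypothetical, chosen for illustration. The sketch assumes the first whitespace-delimited token of each CDS header is exactly the protein ID; if the CDS headers use a different naming scheme, the matching rule would need adjusting.

```python
from pathlib import Path


def read_ids(path):
    """Return the IDs from a file with one '>ID' per line, in file order."""
    ids = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line:
            ids.append(line.lstrip(">"))
    return ids


def fetch_record(fasta_path, wanted_id):
    """Return (header, sequence) for the first record whose ID matches.

    Assumption: the record ID is the first whitespace-delimited token
    of the header line. Returns None if no record matches.
    """
    header, seq, found = None, [], False
    with open(fasta_path) as fh:
        for line in fh:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if found:
                    break  # reached the next record; stop reading
                tokens = line[1:].split()
                found = bool(tokens) and tokens[0] == wanted_id
                if found:
                    header = line
            elif found:
                seq.append(line.strip())
    return (header, "".join(seq)) if found else None


def extract_paired(ids_file, fasta_dir=".", out_path="orthologs.cds.fasta"):
    """Look up the i-th ID in '<i>.fasta' and write the hits to out_path.

    Returns a list of (id, fasta_path) pairs that could not be found,
    so that missing or mismatched IDs are reported rather than skipped
    silently.
    """
    missing = []
    with open(out_path, "w") as out:
        for i, seq_id in enumerate(read_ids(ids_file), start=1):
            fasta = Path(fasta_dir) / f"{i}.fasta"
            rec = fetch_record(fasta, seq_id) if fasta.exists() else None
            if rec is None:
                missing.append((seq_id, str(fasta)))
                continue
            header, seq = rec
            out.write(f"{header}\n{seq}\n")
    return missing
```

Calling `extract_paired("IDS.protein.txt")` in the directory holding 1.fasta to 10.fasta would write the hits to `orthologs.cds.fasta` and return any IDs that were not found. For 1500 orthologous groups, the same function could be run once per ID file.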

gene genome protein • 365 views