3.7 years ago
pinn
Hi
I have thousands of multiline text files containing sequence IDs; each text file has only 9 sequence IDs.
Example file:
>gene1
>gene2
>gene3
>gene4
>gene5
>gene6
>gene7
>gene8
>gene9
I'm interested in matching the gene1 ID against ex1fasta and extracting only the gene1 sequence to a separate file, and likewise for each following ID and its corresponding fasta file:
gene1 ID to ex1fasta >> out.fa
gene2 ID to ex2fasta >> out.fa
gene3 ID to ex3fasta >> out.fa
gene4 ID to ex4fasta >> out.fa
gene5 ID to ex5fasta >> out.fa
gene6 ID to ex6fasta >> out.fa
gene7 ID to ex7fasta >> out.fa
gene8 ID to ex8fasta >> out.fa
gene9 ID to ex9fasta >> out.fa
I tried following this post, "How to extract fasta sequences and only its ID's, based on the subsequence fasta numbers from a main fasta file ?", but I'm unable to reproduce the result for my analysis. Any suggestions?
In my case, it doesn't work.
The out__.fa is an empty file.
My question is quite simple: I would like to search only for the 1st TRINITY ID in my test__.txt against the 1st .cds (fasta) file, the 2nd TRINITY ID against the 2nd .cds file, and so on.
I hope I have presented it better here. Any suggestions?
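The positional pairing described above (1st ID against the 1st fasta file, 2nd ID against the 2nd, and so on) can be sketched in plain Python. This is only a minimal sketch under stated assumptions: the function names `read_fasta` and `extract_paired` are hypothetical, the ID is assumed to be the first whitespace-delimited token of the fasta header, and the leading `>` in the ID list is stripped before comparing, since a mismatch there is one possible reason for an empty output file.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file; header excludes '>'."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def extract_paired(id_file, fasta_files, out_path):
    """Look up the i-th ID of id_file in the i-th FASTA file only.

    Matching records are appended to out_path (like '>> out.fa').
    Returns the number of records written.
    """
    with open(id_file) as handle:
        # Assumption: one ID per line, optionally prefixed with '>'.
        ids = [line.strip().lstrip(">") for line in handle if line.strip()]

    written = 0
    with open(out_path, "a") as out:
        # zip() stops at the shorter list, so extra IDs or files are ignored.
        for seq_id, fasta in zip(ids, fasta_files):
            for header, seq in read_fasta(fasta):
                fields = header.split()
                # Assumption: the ID is the first token of the header line.
                if fields and fields[0] == seq_id:
                    out.write(f">{header}\n{seq}\n")
                    written += 1
                    break  # take only the first match in this file
    return written
```

With hypothetical file names, a call would look like `extract_paired("test.txt", ["ex1.fasta", "ex2.fasta"], "out.fa")`. The order of the fasta list has to match the order of the IDs in the text file, and a return value of 0 means no header matched any ID.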
If you just want to extract sequences based on ID: I'm not sure whether you have one fasta file or multiple fasta files. You can concatenate all the fasta files and extract the sequences for the IDs of interest. See if this works for you.
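A minimal sketch of the pooled approach, assuming plain-text fasta files and one ID per line in the ID file (with or without a leading `>`). The function name `extract_ids` is hypothetical, and the ID is assumed to be the first whitespace-delimited token of each header. Instead of physically concatenating the files, it streams through them one after another, which gives the same result.

```python
def extract_ids(id_file, fasta_files, out_path):
    """Write every record whose ID is listed in id_file to out_path.

    All FASTA files are treated as one pooled file.
    Returns the set of requested IDs that were not found anywhere.
    """
    with open(id_file) as handle:
        # Strip the leading '>' so IDs compare equal to header tokens.
        wanted = {line.strip().lstrip(">") for line in handle if line.strip()}

    found = set()
    with open(out_path, "w") as out:
        for fasta in fasta_files:
            keep = False  # whether the current record should be copied
            with open(fasta) as handle:
                for line in handle:
                    if line.startswith(">"):
                        fields = line[1:].split()
                        seq_id = fields[0] if fields else ""
                        keep = seq_id in wanted
                        if keep:
                            found.add(seq_id)
                    if keep:
                        # Normalise the line ending so records from files
                        # without a trailing newline do not run together.
                        out.write(line.rstrip("\n") + "\n")
    return wanted - found
```

The returned set can help diagnose an empty output: if it contains every requested ID, the IDs in the text file probably do not match the header format of the fasta files. Note that this pooled version keeps a matching record from any file, so it differs from the one-ID-per-file pairing asked for in the question if the same ID occurs in several fasta files.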