Hi guys,
As shown below, I have two files: a FASTA file (containing sequences and their sequence IDs) and a text file containing a list of sequence IDs.
I would like to extract the corresponding sequence for each sequence ID in the text file. I have tried a number of times, but I cannot even open the FASTA file. I'm a total beginner with BioRuby, so if you could explain any answers in detail, I would be very grateful.
EDIT: Could you give your answers using Ruby or BioRuby? Thanks.
e.g. EXAMPLE FASTA FILE
>CATH_RAT
MWTALPLLCAGAWLLSAGATAELTVNAIEKFHFTSWMKQHQKTYSSREYSHRLQVFANNWRKIQAHNQRN
HTFKMGLNQFSDMSFAEIKHKYLWSEPQNCSATKSNYLRGTGPYPSSMDWRKKGNVVSPVKNQGACGSCW
>CATH_MOUSE
TFSTTGALESAVAIASGKMMTLAEQQLVDCAQNFNNHGCQGGLPSQAFEYILYNKGIMGEDSYPYIGKNG
QCKFNPEKAVAFVKNVVNITLNDEAAMVEAVALYNPVSFAFEVTEDFMMYKSGVYSSNSCHKTPDKVNHA
>CATH_HUMAN
MNPTLILAAFCLGIASATLTFDHSLEAQWTKWKAMHNRLYGMNEEGWRRAVWEKNMKMIELHNQEYREGK
HSFTMAMNAFGDMTSEEFRQVMNGFQNRKPRKGKVFQEPLFYEAPRSVDWREKGYVTPVKNQGQCGSCWA
>CATH_MONKEY
FSATGALEGQMFRKTGRLISLSEQNLVDCSGPQGNEGCNGGLMDYAFQYVQDNGGLDSEESYPYEATEES
CKYNPKYSVANDTGFVDIPKQEKALMKAVATVGPISVAIDAGHESFLFYKEGIYFEPDCSSEDMDHGVLV
EXAMPLE .TXT FILE
>CATH_MOUSE
>CATH_HUMAN
Many Thanks
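For anyone landing here with the same problem, here is a minimal sketch in plain Ruby (no BioRuby needed) of one way to do what is described above. It assumes the ID list has one ID per line, optionally starting with ">" as in the example, and that the ID of a FASTA record is everything after ">" up to the first whitespace. The method names (parse_fasta, parse_id_list, extract_records) and the 70-character line width are my own choices for illustration, not anything from a library.

```ruby
# Parse FASTA text into a hash of { id => sequence }.
# Assumption: the ID is the header text after ">" up to the first whitespace.
def parse_fasta(text)
  records = {}
  current_id = nil
  text.each_line do |line|
    line = line.strip
    next if line.empty?
    if line.start_with?(">")
      current_id = line[1..-1].split.first.to_s
      records[current_id] = "".dup   # unfrozen string we can append to
    elsif current_id
      records[current_id] << line    # join wrapped sequence lines
    end
  end
  records
end

# Read the wanted IDs, one per line, dropping a leading ">" if present.
def parse_id_list(text)
  text.each_line.map { |l| l.strip.sub(/\A>/, "") }.reject(&:empty?)
end

# Return the wanted records as FASTA text, in the order given by the ID list.
# IDs that are not in the FASTA text are reported on stderr and skipped.
def extract_records(fasta_text, id_text, width = 70)
  records = parse_fasta(fasta_text)
  wanted = parse_id_list(id_text).map do |id|
    seq = records[id]
    if seq.nil?
      warn "ID not found: #{id}"
      next
    end
    ">#{id}\n" + seq.scan(/.{1,#{width}}/).join("\n")
  end
  wanted.compact.join("\n")
end

# Command-line use: ruby extract.rb sequences.fasta ids.txt
# (the script name and both file names are placeholders)
if __FILE__ == $PROGRAM_NAME && ARGV.size == 2
  puts extract_records(File.read(ARGV[0]), File.read(ARGV[1]))
end
```

With the two example files from the question, this should print the CATH_MOUSE and CATH_HUMAN records only. It reads the whole FASTA file into memory, which is fine for small files; for very large files a streaming approach would be a better fit. BioRuby's Bio::FlatFile can handle the parsing step as well if you would rather use the library.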
How should we extract multiple sequences in FASTA format by gene_id using samtools?
@Bulbul Did you find the solution?
@matted Would you please elaborate?