9.4 years ago
Rashedul Islam
Dear All,
I have a list of gene IDs, e.g.
gene01
gene22
gene27 and so on.
I need to extract these genes, with their names and FASTA sequences, from an assembly file. The assembly file looks like this:
>gene01
ATAGCGATCCCCCTTTTTCCTT
>gene02
ATACCCCCGCGAT
>gene03
ATACCCAAAAAAACCGCGAT and so on.
Can anyone help me write a shell script that searches the assembly file for the gene names in my list and outputs each match with its associated DNA sequence? Example output for gene01:
>gene01
ATAGCGATCCCCCTTTTTCCTT
I'm not sure if this website is like Stack Overflow, but you should really post what you tried instead of just asking people to do your work for you. It looks like you did not try anything at all, and while many people on this site don't have much programming experience, this is a relatively simple task that you could find a solution to on Google or code yourself in a few minutes.
Thank you for your reply. Your answer helped. I found this link: http://unix.stackexchange.com/questions/156783/getting-matched-fasta-file.
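For anyone landing here later, here is a minimal sketch of one way to do this with awk. It is not taken from the linked answer. The function name `extract_fasta` and the file names `ids.txt` and `assembly.fa` are placeholders of mine. It assumes the ID list has one ID per line and is not empty, and that each FASTA header starts with `>` immediately followed by the ID.

```shell
# Print the FASTA records whose ID appears in a list of IDs.
# Usage (file names are hypothetical):
#   extract_fasta ids.txt assembly.fa > selected.fa
extract_fasta() {
  awk '
    # First file (the ID list): remember each ID, skipping blank lines.
    # Assumes the list is non-empty, otherwise NR == FNR also matches file 2.
    NR == FNR {
      sub(/\r$/, "")              # tolerate Windows line endings
      if ($1 != "") wanted[$1] = 1
      next
    }
    # Second file (the assembly): on each header, decide whether to keep
    # the record. The ID is the first word after ">".
    /^>/ {
      sub(/\r$/, "")
      id = substr($1, 2)
      keep = (id in wanted)
    }
    # Print header and sequence lines while inside a wanted record.
    keep
  ' "$1" "$2"
}
```

Because the match is on the whole ID, `gene01` should not also pull out `gene011`, and sequences wrapped over several lines are kept intact. If every sequence sits on a single line, as in the example above, something like `grep -w -A1 -F -f ids.txt assembly.fa` may also work, but with GNU grep it prints `--` separator lines between non-adjacent matches and `-w` still matches IDs inside longer hyphenated names, so the awk version is the safer starting point.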