2.5 years ago
sunnykevin97
Hi,
I have a single FASTA file containing 60,000 bacterial genomes. I'd like to extract every complete FASTA record whose header contains the keyword Psychro (e.g. Psychrobacter; about 18,982 genomes in total).
I'm aware that seqtk can extract a subsequence:
seqtk subseq test.fa test.txt
But I want to extract the entire FASTA sequences by providing their headers in test.txt.
Here are some example FASTA headers (I want to extract ~18,982 genomes in total):
>NZ_CAJHBU010000049.1 **Psychrobacter vallis** isolate Psychrobacter vallis CMS39, whole genome shotgun sequence
>NZ_CAJHBM010000029.1 **Psychrobacter sp. JCM** 18903 isolate Psychrobacter sp. JCM18903, whole genome shotgun sequence
>NZ_CAJHBB010000047.1 **Psychrobacter sanguinis** isolate Psychrobacter sanguinis 13983, whole genome shotgun sequence
>NC_007204.1 **Psychrobacter arcticus** 273-4, complete sequence
>NC_007969.1 **Psychrobacter cryohalolentis K5**, complete sequence
>NC_007968.1 **Psychrobacter cryohalolentis K5** plasmid 1, complete sequence
>NC_008709.1 **Psychromonas ingrahamii 37**, complete sequence
>NC_020802.1 **Psychromonas sp. CNPT3,** complete sequence
>NC_018721.1 **Psychroflexus torquis ATCC** 700755, complete sequence
Suggestions please!
Well I tried using
Works fine!
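For anyone with the same problem, here is a minimal sketch of one way to pull out whole records by header keyword with plain awk. It assumes every record starts with a `>` header line and that the keyword appears literally in that line. The function name `filter_fasta` and the output file `psychro.fa` are made up for illustration.

```shell
# filter_fasta KEYWORD FILE
# Print every FASTA record (header plus all of its sequence lines)
# whose header line contains KEYWORD as a literal substring.
# Hypothetical helper name, not part of seqtk or any other tool.
filter_fasta() {
  awk -v kw="$1" '
    /^>/ { keep = (index($0, kw) > 0) }  # new record: decide whether to keep it
    keep                                 # print the line while the flag is set
  ' "$2"
}

# Example usage (file names taken from the question, output name assumed):
# filter_fasta Psychro test.fa > psychro.fa
```

With the keyword Psychro this would keep Psychrobacter, Psychromonas and Psychroflexus records alike, as in the example headers above. As far as I know, seqtk subseq also returns entire sequences when test.txt holds one sequence name per line (the part of the header before the first space, without the `>`) instead of BED intervals, so a list built with grep from the headers should work too.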