Hi
I have a text file with a list of sequence IDs, 1....1000 (prefix: OG00*). Text file:
OG0010960
OG0010966
OG0010968
OG0010972
OG0010979
OG0010980
OG0010981
OG0010982
OG0010983
OG0010984
In a directory, I have 66170 FASTA files; each FASTA sequence has a unique ID.
OG0066161.fa
OG0066162.fa
OG0066162.fa
OG0056185.fa
OG0056185.fa
OG0056185.fa
OG0000001.fa
OG0000002.fa
I wish to extract only the FASTA files for the IDs given in the text file.
Need some help, I tried to do using grep it doesn't work out.
Thanks
Kevin
This assumes that ids.txt (the file with IDs), the FASTA files, and an empty directory named 'extracted_files' all exist within the same directory.
After checking the dry-run output, please remove echo to actually copy the files from the existing directory to the new directory.
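The steps above could be sketched roughly as follows. This is only a minimal sketch, not necessarily the original command: the function name copy_listed_fasta is hypothetical, and it assumes one ID per line in ids.txt and files named <ID>.fa in the current directory. It prints each cp command as a dry run and reports any listed ID that has no matching file.

```shell
# Hypothetical helper: print (dry run) the cp command for every ID listed in a file.
# Assumes one ID per line and FASTA files named <ID>.fa in the current directory.
copy_listed_fasta() {
    ids_file=$1
    dest_dir=$2
    # tr strips carriage returns in case the ID file has Windows line endings
    tr -d '\r' < "$ids_file" | while IFS= read -r id || [ -n "$id" ]; do
        [ -n "$id" ] || continue             # skip blank lines
        if [ -f "${id}.fa" ]; then
            echo cp "${id}.fa" "$dest_dir/"  # remove 'echo' to really copy
        else
            echo "missing: ${id}.fa" >&2     # listed ID with no matching file
        fi
    done
}
```

For example, run `copy_listed_fasta ids.txt extracted_files` from the directory holding the FASTA files, then delete the echo in front of cp once the printed commands look right.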
If ids.txt (the file with IDs) contains headers and you want to extract the sequences with those headers, try seqkit:
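A minimal seqkit sketch, assuming seqkit is installed and that the sequences sit in one FASTA file (the function name extract_by_id and the file names in the usage line are hypothetical). As far as I know, seqkit grep -f reads one pattern per line and by default matches it against the sequence ID, i.e. the header up to the first whitespace. Note that -i (ignore case) and -n (match the full header instead of the ID) are flags of the grep subcommand, so they show up in the help for seqkit grep rather than for seqkit itself.

```shell
# Hypothetical helper: keep only the records whose ID is listed in a file.
# Requires seqkit on PATH; 'seqkit grep -f' takes a file with one pattern per line.
extract_by_id() {
    ids_file=$1   # file with one sequence ID per line
    fasta=$2      # input FASTA file
    out=$3        # output FASTA file with the selected records
    seqkit grep -f "$ids_file" -o "$out" "$fasta"
}
```

Usage would look like `extract_by_id ids.txt all_sequences.fa selected.fa`.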
I can't find the seqkit -in option in the help file.
I missed grep in the function. I have edited the function.
I am occasionally amazed by people's patience on this site.
This was an insufficiently explained original post ("I tried to do using grep it doesn't work out.") where a better explanation could have saved everyone lots of time. Even with all the attempts to help, the OP is only responding with "It doesn't work" without any useful details. As in: you guys go ahead and keep feeding me ideas, and I will give you monosyllabic responses. Yet more suggestions keep coming. Kudos to you guys. I am very happy that there is no down-vote option, because these kinds of exchanges definitely tempt me.
Sometimes I do wish for Stack Exchange-like options. This is purely a grep question, and with the OP's attitude, it should have been closed way back if not for a bunch of us indulging them.
Show us the first few lines of your ID file and a few sample FASTA headers so we understand your problem better.
I edited the post.