I have a list of ids and two FASTQ files (one male, one female). I want to extract the sequences and quality scores for all the ids in my list.
I have:
a list of ids
2 FASTQ files (or, alternatively, 2 FASTA files and 2 QUAL files)
and I want to get:
extracted_sequences_based_on_ids.fastq (or, alternatively, extracted_sequences_based_on_ids.fasta and extracted_sequences_based_on_ids.qual). The output can also be two files that I merge afterwards.
Yes, I could write a script that looks up the sequence and quality score for each id and writes the results to a new file, but I would rather not reinvent the wheel if this is easy to do with existing tools :) (The answer might be handy for other Biostars users as well.)
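To make clear what I mean by "the script", here is a minimal Python sketch of the per-id lookup. It is only an illustration under some assumptions: the id list has one id per line, each FASTQ record takes exactly 4 lines (no wrapped sequences), and the id is the first word after the "@" in the header. The function names and file names are hypothetical placeholders, not part of any existing tool.

```python
import sys


def load_ids(handle):
    """Read one id per line; drop a leading '@' or '>' and skip blank lines."""
    ids = set()
    for line in handle:
        name = line.strip().lstrip("@>")
        if name:
            ids.add(name)
    return ids


def filter_fastq(handle, wanted):
    """Yield (header, seq, plus, qual) for records whose id is in `wanted`.

    Assumption: every record is exactly 4 lines (unwrapped FASTQ).
    """
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        seq = handle.readline()
        plus = handle.readline()
        qual = handle.readline()
        # The id is taken as the first whitespace-separated word after '@'.
        fields = header[1:].split()
        if fields and fields[0] in wanted:
            yield header, seq, plus, qual


def main(id_path, fastq_paths, out_path):
    """Write matching records from all input FASTQ files to one output file."""
    with open(id_path) as fh:
        wanted = load_ids(fh)
    with open(out_path, "w") as out:
        for path in fastq_paths:
            with open(path) as fq:
                for record in filter_fastq(fq, wanted):
                    out.writelines(record)


# Arguments: id list, one or more FASTQ files, output file (last).
if __name__ == "__main__" and len(sys.argv) >= 4:
    main(sys.argv[1], sys.argv[2:-1], sys.argv[-1])
```

It would be run as something like `python extract_by_id.py ids.txt male.fastq female.fastq extracted_sequences_based_on_ids.fastq`. The ids are kept in a set so each file is read only once, which should matter for large files.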
Thanks a lot.
Duplicated post: http://www.biostars.org/post/show/10353/how-to-efficiently-parse-a-huge-fastq-file/