2.1 years ago
Hi. I have a list of headers, and I want to pull the matching sequences out of a large fastq file and write them to a new fastq file.
List of headers: list.txt
fastq file: largefile.fastq
I am currently using the following command, but it takes a really long time.
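# Note: -A 1 prints only the header and sequence lines; a full 4-line fastq record would need -A 3
while read -r HEADER; do
    grep -A 1 -m 1 "$HEADER" largefile.fastq >> new.fastq
done < list.txt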
Is there a faster way of doing this?
Thank you all for the help! In the end I chose to write a short python script rather than use bash, as the fastq file is not so large that it cannot be read into memory.
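The script itself was not posted, so the following is only a minimal sketch of what such an approach might look like, not the actual script. It assumes that every record in the fastq file is exactly four lines (no wrapped sequences) and that each line of list.txt holds one read ID, with or without the leading "@". The function names `read_id` and `filter_fastq` are made up for illustration. Unlike the approach described above, it keeps only the header list in memory and streams the fastq file, so the bash loop's repeated full scans are replaced by a single pass.

```python
import itertools


def read_id(line):
    """Return the read ID from a header line: first token, without a leading '@'."""
    parts = line.split()
    if not parts:
        return ""
    token = parts[0]
    return token[1:] if token.startswith("@") else token


def filter_fastq(headers_path, fastq_path, out_path):
    """Copy the records whose ID is listed in headers_path; return how many were kept."""
    # Load all wanted IDs into a set so each lookup is a constant-time check.
    with open(headers_path) as handle:
        wanted = {read_id(line) for line in handle if line.strip()}

    kept = 0
    with open(fastq_path) as fastq, open(out_path, "w") as out:
        while True:
            # Assumption: one record = 4 lines (header, sequence, '+', qualities).
            record = list(itertools.islice(fastq, 4))
            if len(record) < 4:
                break  # end of file (or a truncated final record)
            if read_id(record[0]) in wanted:
                out.writelines(record)
                kept += 1
    return kept
```

With the file names from the question, this could be called as `filter_fastq("list.txt", "largefile.fastq", "new.fastq")`. Matching is on the whole read ID rather than a substring, which differs from what grep does in the loop above.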