5.0 years ago
johnsonn573
I have a 24-line barcodes.txt file, each line of which is a 6-letter barcode.
I have ~30 million reads in my fastq file (big.fastq). The first 6 characters of every read are one of the 24 barcodes in barcodes.txt.
I would like to generate 24 new txt files. I want the lines of the new txt files to correspond to the read numbers that begin with that barcode. For example, if the first barcode is AACAGA, I would like the first new txt file to be the numbers of all the reads in big.fastq with the barcode AACAGA.
What have you tried? The following example isn't the most efficient way, but it will get you the count for each distinct 6-character prefix in the file. You can use awk or sed to output every nth line in your fastq that contains the reads (the sequence is the second line of each four-line record).
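As a rough illustration of the idea above (not the example from the original reply), here is a minimal Python sketch. It assumes that big.fastq uses strict four-line records with unwrapped sequences, that a "read number" means the 1-based position of the read in the file, and that each output file is named after its barcode. The function names and the `out_dir` parameter are hypothetical.

```python
import os
from collections import Counter
from itertools import islice


def read_sequences(fastq_path):
    """Yield (read_number, sequence) pairs, assuming 4-line FASTQ records."""
    with open(fastq_path) as handle:
        # The sequence is the 2nd line of each record: file lines 2, 6, 10, ...
        for read_number, seq in enumerate(islice(handle, 1, None, 4), start=1):
            yield read_number, seq.rstrip("\n")


def count_prefixes(fastq_path, length=6):
    """Count how many reads start with each distinct prefix."""
    return Counter(seq[:length] for _, seq in read_sequences(fastq_path))


def write_read_numbers(fastq_path, barcodes_path, out_dir=".", length=6):
    """Write one <barcode>.txt per barcode, one read number per line."""
    with open(barcodes_path) as handle:
        barcodes = [line.strip() for line in handle if line.strip()]
    # 24 barcodes means 24 open files, so a single pass over the reads suffices.
    outputs = {bc: open(os.path.join(out_dir, bc + ".txt"), "w") for bc in barcodes}
    try:
        for read_number, seq in read_sequences(fastq_path):
            out = outputs.get(seq[:length])
            if out is not None:  # skip reads whose prefix is not a listed barcode
                out.write(f"{read_number}\n")
    finally:
        for out in outputs.values():
            out.close()
```

Calling `write_read_numbers("big.fastq", "barcodes.txt")` would produce files such as AACAGA.txt in the current directory. The FASTQ is read only once and never loaded fully into memory, which matters with ~30 million reads.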