7.4 years ago
bitjunkie
I already know that you can use the first-in-pair and second-in-pair bits of the SAM FLAG, but that only works if you have a single pair of paired-end read files. I need to do this for multiple pairs and/or single-end reads. Does anyone know an efficient workaround? Thanks!
The read group ID sometimes encodes which fastq file the read came from. Check how many read group IDs you have and see if it matches the number of fastq files you expect.
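One way to run that check is sketched below. It assumes the alignments are available as SAM text lines (for example, the output of `samtools view -h`) and that the aligner wrote an `RG:Z:` tag on each record; the function name `count_read_groups` is made up for illustration.

```python
from collections import Counter


def count_read_groups(sam_lines):
    """Count alignment records per read group ID (the RG:Z: optional tag).

    Records without an RG tag are counted under the key None.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines (@HD, @SQ, @RG, ...) are not alignment records.
            continue
        fields = line.rstrip("\n").split("\t")
        # The 11 mandatory SAM columns come first; optional tags follow.
        rg = next((f[5:] for f in fields[11:] if f.startswith("RG:Z:")), None)
        counts[rg] += 1
    return counts
```

If the number of distinct keys matches the number of FASTQ files (or file pairs) you expect, the read group ID is probably usable as a per-file label. A `None` key means some records carry no read group at all.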
You could look at the lane numbers and flowcell IDs encoded in the FASTQ header to identify where a particular read came from. Whether the reads are paired can only be determined if the aligner kept the part of the FASTQ header after the space, which denotes read 1/2 along with the index sequence.
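A rough sketch of pulling those fields out of a read name follows. It assumes Illumina's CASAVA 1.8+ naming (`instrument:run:flowcell:lane:tile:x:y`, optionally followed by a space and `read:filtered:control:index`); older or non-Illumina names will not match, and `parse_illumina_name` is a hypothetical helper, not part of any library.

```python
def parse_illumina_name(name):
    """Split an Illumina (CASAVA 1.8+ style, assumed) read name into fields.

    Returns None if the name does not have the seven colon-separated parts.
    The 'read' and 'index' keys are present only if the text after the
    space survived, which many aligners drop from the SAM QNAME.
    """
    name = name.lstrip("@")  # FASTQ headers start with '@'; SAM QNAMEs do not
    core, _, comment = name.partition(" ")
    parts = core.split(":")
    if len(parts) < 7:
        return None
    info = {
        "instrument": parts[0],
        "run": parts[1],
        "flowcell": parts[2],
        "lane": parts[3],
    }
    if comment:
        extra = comment.split(":")
        if len(extra) >= 4:
            info["read"] = extra[0]   # "1" or "2" for read 1 / read 2
            info["index"] = extra[3]  # index (barcode) sequence
    return info
```

Grouping reads by the `(flowcell, lane)` pair gives one bucket per sequencing lane, which often, though not always, corresponds to one FASTQ file or file pair.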