Question

Observed and expected fastq header count different when merging two fastq files

9.7 years ago

nalandaatmi ▴ 110

Hi All,

Recently I concatenated two fastq files from one library belonging to the same sample but loaded in different lanes, using the following command.

cat Sample01_L001_R1.all.fastq.gz Sample01_L006_R1.all.fastq.gz > Sample01_L001-6_R1.all.fastq.gz

Sample fastq file content:

@HISEQ:137:C8W59ACXX:1:1101:1183:2157 1:N:0:TATGGC
GTATCATTAAAACTTTTACGATCAATCTTTTTAATAAGAACTAAATTATAATAAAATCCATATGTTGCCACAGGCGGGAAAAAAAAAAGGAAGGAAAAAAA
+
BBBFFFFFFFFFFBIIIIIIFFIIIIIIIFFIFFF<FBBFFFIIIIIIIIFIIIFFIBFFFIIB<BF<FFFIBFFBBB'7BBB<7<BF7<'077'07BB7<

To validate that all the reads were copied to the output fastq file Sample01_L001-6_R1.all.fastq.gz, I counted the number of header lines in each fastq file using the following commands.

$ zcat Sample01_L001_R1.all.fastq.gz | grep '@HISEQ:137' | wc -l
37,955,286

$ zcat Sample01_L006_R1.all.fastq.gz | grep '@HISEQ:137' | wc -l
18,385,272

$ zcat Sample01_L001-6_R1.all.fastq.gz | grep '@HISEQ:137' | wc -l
55,587,340

The expected count is 37,955,286 + 18,385,272 = 56,340,558.
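The size of the gap can be checked with shell arithmetic. This is only a sketch that plugs in the three counts reported above; the variable names are made up for illustration.

```shell
# Header counts reported above (thousands separators removed).
l001=37955286
l006=18385272
observed=55587340

# Expected count for the merged file and the shortfall against it.
expected=$(( l001 + l006 ))
missing=$(( expected - observed ))

echo "expected: $expected"
echo "missing:  $missing"
```

With these numbers the merged file is 753,218 headers short of the expected total.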

Why is the fastq header count different from the expected count?

RNA-Seq next-gen fastq • 2.3k views

updated 3.0 years ago by Ram 45k • written 9.7 years ago by nalandaatmi ▴ 110

Hard to say. Do it again; I agree that the counts should be the same. It is possible that one of the files has been corrupted in some manner.
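One way to look for corruption, sketched here as an assumption rather than a confirmed diagnosis, is to test each gzip file with `gzip -t`. The helper name `check_gz` and the tiny demo files are hypothetical and exist only so the sketch runs on its own; in practice the function would be pointed at the two lane files and the merged file.

```shell
# Report whether a gzip file decompresses cleanly from start to end.
# gzip -t tests integrity without writing any output.
check_gz() {
    if gzip -t "$1" 2>/dev/null; then
        echo "ok"
    else
        echo "corrupt"
    fi
}

# Demo on tiny hypothetical files (not the real data).
tmp=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n' | gzip -c > "$tmp/good.fq.gz"
# Simulate a truncated copy by keeping only the first 10 bytes.
head -c 10 "$tmp/good.fq.gz" > "$tmp/bad.fq.gz"

good_status=$(check_gz "$tmp/good.fq.gz")
bad_status=$(check_gz "$tmp/bad.fq.gz")
echo "good: $good_status, truncated: $bad_status"
rm -rf "$tmp"
```

A passing test only shows that the compressed stream is intact; it says nothing about whether the FASTQ records inside are complete.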

9.7 years ago by Istvan Albert 102k

Thanks Istvan Albert. I will try it again.

9.7 years ago by nalandaatmi ▴ 110

Have you tried counting all lines and dividing by 4 to see what number you get?

To be safe you can also try this instead:

$ zcat seq1.fq.gz seq2.fq.gz | gzip -c > all.fq.gz
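The line-count check could look roughly like the sketch below. It assumes every record is exactly four lines (no wrapped sequence lines); the function name `count_reads` and the tiny demo files are hypothetical. `gzip -dc` is used as a portable equivalent of `zcat`.

```shell
# Count reads in a gzipped FASTQ: total lines divided by 4.
# Assumes strict 4-line records.
count_reads() {
    local lines
    lines=$(gzip -dc "$1" | wc -l)
    echo $(( lines / 4 ))
}

# Demo on tiny hypothetical files so the sketch runs anywhere.
tmp=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n' | gzip -c > "$tmp/a.fq.gz"
# Note: the quality line of r3 starts with '@', which is legal in FASTQ.
printf '@r2\nTTTT\n+\nIIII\n@r3\nGGGG\n+\n@III\n' | gzip -c > "$tmp/b.fq.gz"
cat "$tmp/a.fq.gz" "$tmp/b.fq.gz" > "$tmp/merged.fq.gz"

a=$(count_reads "$tmp/a.fq.gz")
b=$(count_reads "$tmp/b.fq.gz")
merged=$(count_reads "$tmp/merged.fq.gz")
echo "expected $(( a + b )), observed $merged"
rm -rf "$tmp"
```

Counting lines avoids depending on a header pattern, which matters because a quality line may itself begin with `@`.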

9.7 years ago by GenoMax 152k

What are those numbers without the grep (i.e. counting all lines)?

9.7 years ago by Pierre Lindenbaum 166k

I know this isn't answering your question directly, but you can supply both fastq files to the aligner without merging them first; most aligners can combine them for you. I know for sure that STAR does.
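As a rough illustration for STAR, the lane files can be passed as a comma-separated list to `--readFilesIn`, with `--readFilesCommand zcat` to read gzipped input. The sketch below only builds and prints the command (a dry run); the genome index path is a placeholder, and any further options would depend on your setup.

```shell
# Lane files from the question.
lanes="Sample01_L001_R1.all.fastq.gz Sample01_L006_R1.all.fastq.gz"

# STAR expects multiple files for the same mate as a comma-separated list.
read_files=$(echo $lanes | tr ' ' ',')

# /path/to/star_index is a placeholder, not a real location.
star_cmd="STAR --genomeDir /path/to/star_index --readFilesIn $read_files --readFilesCommand zcat"

# Print the command instead of running it.
echo "$star_cmd"
```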

9.7 years ago by Kirill Tsyganov ▴ 370