Question

What if the number of lines in paired fastq.gz files differs greatly?


3.9 years ago

br0104 • 0

Please somebody help me,

I have fastq.gz files from three experimental groups, sequenced on a NextSeq 550 using the BD Rhapsody WTA multiplex pipeline.

I tried to analyze the data on SevenBridges using the BD WTA pipeline, but only one of the three groups ran successfully; the other two failed.

I asked support and was told that the paired FASTQ files (R1 and R2) have different numbers of lines, so I counted the lines with 'wc -l' and divided by four.

The read counts were similar for R1 and R2 in the successful group, but the failed groups showed large differences.
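One caveat on the counting step: if 'wc -l' is run directly on a .gz file, it counts newline bytes in the compressed stream rather than FASTQ lines, so the file has to be decompressed first. A minimal Python sketch of the same count is below. It assumes plain four-line FASTQ records (no wrapped sequence lines); the function name and the file names in the comments are placeholders, not part of any pipeline.

```python
import gzip


def count_fastq_reads(path):
    """Count records in a gzip-compressed FASTQ file.

    Assumes every record is exactly 4 lines (header, sequence, '+', quality).
    """
    # Open in text mode so iteration yields decompressed lines.
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        # A remainder usually points to a truncated or corrupted file.
        raise ValueError(
            f"{path}: {n_lines} lines is not a multiple of 4; "
            "the file may be truncated"
        )
    return n_lines // 4


# Example usage (hypothetical file names):
#   r1 = count_fastq_reads("sample_R1.fastq.gz")
#   r2 = count_fastq_reads("sample_R2.fastq.gz")
#   print(r1, r2, "match" if r1 == r2 else "MISMATCH")
```

If the two counts differ, or the function raises because the line count is not a multiple of four, it is worth checking whether the upload or download of the files was interrupted before trying to repair the pairing.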

Does anyone know what to do with the fastq.gz files so that the analysis pipeline runs successfully?

fastq R SevenBridges • 1.5k views

updated 2.4 years ago by Ram 45k • written 3.9 years ago by br0104 • 0


pair fastq files (R1 and R2) have different lines

You need to match up the paired-end reads from the two FASTQ files.

3.9 years ago by shenwei356 8.7k

score 1 · Answer 1 · 2021-09-01

You have to be sure that each read in the forward file has a corresponding mate in the reverse file. In other words, the number of reads (and therefore lines) should be the same in both the forward and reverse files. Try running fastq-pair before any analysis.

This will remove unmatched reads from your data. Simply put, any read that does not have a mate in the other file (forward or reverse) will be discarded.
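To make the idea concrete, here is a rough Python sketch of this kind of filtering. It is not how fastq-pair itself is implemented, only an illustration under stated assumptions: records are four lines each, mates share the same read ID (the header text up to the first whitespace, ignoring a trailing /1 or /2), and the whole R2 file fits in memory, which may not hold for large NextSeq runs. The function names are made up for this example.

```python
import gzip
from itertools import islice


def read_fastq(path):
    """Yield (read_id, record_lines) for each 4-line record in a gzipped FASTQ."""
    with gzip.open(path, "rt") as handle:
        while True:
            record = list(islice(handle, 4))
            if not record:
                break
            if len(record) < 4:
                raise ValueError(f"{path}: truncated record at end of file")
            # Make sure every line ends with a newline, even the last one.
            record = [line.rstrip("\n") + "\n" for line in record]
            # Read ID: header without '@', up to the first whitespace,
            # with an old-style /1 or /2 mate suffix removed.
            read_id = record[0][1:].split()[0]
            if read_id.endswith(("/1", "/2")):
                read_id = read_id[:-2]
            yield read_id, record


def pair_fastq(r1_in, r2_in, r1_out, r2_out):
    """Write only reads present in both inputs, in R1 order.

    Returns the number of pairs kept. Reads without a mate are dropped.
    """
    # Assumption: R2 fits in memory. For very large files a dedicated
    # tool is a better choice than this sketch.
    r2_records = dict(read_fastq(r2_in))
    kept = 0
    with gzip.open(r1_out, "wt") as out1, gzip.open(r2_out, "wt") as out2:
        for read_id, record in read_fastq(r1_in):
            mate = r2_records.get(read_id)
            if mate is not None:
                out1.writelines(record)
                out2.writelines(mate)
                kept += 1
    return kept
```

Both output files end up with the same number of reads in the same order, which is what paired-end pipelines generally expect. If a large fraction of reads is discarded, the input files were probably truncated or mismatched, and re-transferring them from the sequencer output is likely a better fix than filtering.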