Question

Forward and reverse files with different number of lines after sortmeRNA

4.9 years ago

tianshenbio ▴ 180

I ran SortMeRNA with the following command:

sortmerna --ref ref_files --reads input.fq --aligned output_rRNA --other output_clean --log -a 12 -v --paired_in --fastx

After unmerging "clean", I got "cleanFV" and "cleanRV". I assumed that the FV and RV files should have the same number of lines since they are paired, but when I checked with wc -l, they had different numbers of lines. However, FastQC showed the same number of reads for the two files, and the sum of the FV, RV and rRNA read counts equals the total number of reads in my input file. Can anyone explain why I got different numbers of lines but the same number of reads?

RNA-Seq sortmerna • 1.5k views

updated 4.9 years ago by GenoMax 147k • written 4.9 years ago by tianshenbio ▴ 180

score 0 · Answer 1 · 2020-01-05

4.9 years ago

GenoMax 147k

You can pass the reads through repair.sh from the BBMap suite to make sure the reads in the clean files are in sync.

repair.sh in1=r1.fq in2=r2.fq out1=fixed1.fq out2=fixed2.fq outsingle=singletons.fq

Looks like this issue has been reported with sortmerna before: How to repair corrupted fastq files after sortmeRNA

Use the method above.
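If you want a quick look at whether two files are in sync before or after running repair.sh, a rough sketch is to compare the read IDs record by record. The function name `check_pairs` is made up for this illustration, and it assumes uncompressed files, 4-line FASTQ records, and mate IDs that match once a trailing /1 or /2 (or anything after the first space) is dropped. It is not a replacement for repair.sh.

```shell
# Hypothetical helper: report whether two uncompressed FASTQ files
# list the same read IDs in the same order.
# Assumptions: 4-line records; mates share an ID up to a trailing /1 or /2.
check_pairs() {
    awk '
        # Header lines are lines 1, 5, 9, ... of each file
        FNR % 4 == 1 {
            id = $1                      # drop any comment after the first space
            sub(/\/[12]$/, "", id)       # drop a trailing /1 or /2
            if (NR == FNR) {             # still reading the first file
                ids[FNR] = id
                n1++
            } else {                     # reading the second file
                n2++
                if (ids[FNR] != id) bad++
            }
        }
        END {
            if (n1 != n2 || bad > 0) { print "out of sync"; exit 1 }
            print "in sync"
        }
    ' "$1" "$2"
}
```

Usage would be something like `check_pairs r1.fq r2.fq`. It keeps all IDs from the first file in memory, so it is only meant as a sanity check.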

4.9 years ago by GenoMax 147k

Hi, I don't think my data are corrupted: my FV.fq.gz has 8078406 lines but my RV.fq.gz has 8058719 lines, which is quite a large difference, and almost all of my samples are like that.

Also, why do they still have the same number of reads? Another thing I noticed is that the original FV and RV files (after Trimmomatic) that I merged to feed SortMeRNA also have different line counts. Again, the FastQC reports show the same number of reads for both.

4.9 years ago by tianshenbio ▴ 180

Do not trust wc -l counts. Either run the repair.sh tool I posted above, or use a FASTQ validation program such as validateFiles from Jim Kent's utils to see where the problem is. After you download validateFiles, add execute permission with chmod u+x validateFiles.
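One thing worth ruling out, since the counts quoted are for .fq.gz files: wc -l run directly on a gzip-compressed file counts newline bytes in the compressed stream, not FASTQ lines, so the number need not match between mates (8078406 is not even a multiple of 4). A minimal sketch for counting records after decompression is below. The function name `count_reads` is made up for this example, and it assumes 4-line FASTQ records (no wrapped sequence lines) and a gzip that passes uncompressed input through when given -cdf.

```shell
# Hypothetical helper: count FASTQ records in a plain or gzip-compressed file.
# Assumption: every record is exactly 4 lines.
count_reads() {
    # gzip -cdf decompresses .gz input and copies plain text through unchanged
    lines=$(gzip -cdf "$1" | wc -l)
    # Integer division: 4 lines per record
    echo $((lines / 4))
}
```

Running `count_reads FV.fq.gz` and `count_reads RV.fq.gz` should then give numbers comparable to what FastQC reports.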

4.9 years ago by GenoMax 147k

Thank you for your suggestion. I am running repair.sh to figure it out.

4.9 years ago by tianshenbio ▴ 180

I got an empty singletons file, so everything seems fine.

4.9 years ago by tianshenbio ▴ 180