Question

Significantly different number of reads between two paired files when running kneaddata

5.2 years ago

zhangdengwei ▴ 210

Hi all,

I am using kneaddata to remove contaminating host reads, and below is my command:

nohup kneaddata -i ../01.fastp/dynamics_12/clean_SAMEA2580278_r1.fq.gz -i ../01.fastp/dynamics_12/clean_SAMEA2580278_r2.fq.gz -o ./SAMEA2580278 -db ~/database/Genome/1.Human/02.bowtie2.index/GRCh38 --bypass-trim -t 2 --remove-intermediate-output &

As described in its tutorial, it produced the following files:

clean_SAMEA2579907_r1_kneaddata_GRCh38_bowtie2_paired_contam_1.fastq
clean_SAMEA2579907_r1_kneaddata_GRCh38_bowtie2_paired_contam_2.fastq
clean_SAMEA2579907_r1_kneaddata_GRCh38_bowtie2_unmatched_1_contam.fastq
clean_SAMEA2579907_r1_kneaddata_GRCh38_bowtie2_unmatched_2_contam.fastq
clean_SAMEA2579907_r1_kneaddata.log
clean_SAMEA2579907_r1_kneaddata_paired_1.fastq
clean_SAMEA2579907_r1_kneaddata_paired_2.fastq
clean_SAMEA2579907_r1_kneaddata_unmatched_1.fastq
clean_SAMEA2579907_r1_kneaddata_unmatched_2.fastq

However, the number of reads in the two paired files, clean_SAMEA2579907_r1_kneaddata_paired_1.fastq and clean_SAMEA2579907_r1_kneaddata_paired_2.fastq, differed significantly, although they should be identical. By contrast, the two paired files contained the same number of reads when I processed another sample. I am certain that no errors or warnings were reported, so what happened? Any suggestions would be greatly appreciated.

Cheers

kneaddata metagenomics remove host reads • 1.6k views

updated 5.0 years ago by Biostar 20 • written 5.2 years ago by zhangdengwei ▴ 210

How did you count the number of reads?
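
For reference, a minimal sketch of one way to count reads, assuming every record occupies exactly four lines (no wrapped sequences) and the file has no trailing blank lines. The function name `count_fastq_reads` is hypothetical and not part of kneaddata. Counting lines that start with `@` is not reliable, because quality strings can also begin with `@`.

```python
import gzip


def count_fastq_reads(path):
    """Count FASTQ records, assuming exactly four lines per record."""
    # Transparently handle gzip-compressed input such as *.fq.gz.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    # A line count that is not a multiple of 4 suggests a truncated
    # file or a layout this sketch does not handle.
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4
```

Calling this on both `*_paired_1.fastq` and `*_paired_2.fastq` and comparing the two results shows directly whether the pair is out of sync.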

5.2 years ago by ATpoint 88k

Thanks @ATpoint. I may have found the cause: it seems to be related to the header line of each read. Here is an example header from the original FASTQ file:

@ERR525690.1001 1001/1

If I remove the space from the header, giving @ERR525690.10011001/1, and then run kneaddata, it works as expected. I suspect this is a bug in kneaddata, although I have not reviewed its source code carefully.
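
The header change described above could be applied to a whole file with something like the sketch below. It assumes four-line FASTQ records; the names `remove_header_spaces` and `replacement` are hypothetical and are not kneaddata options. The default simply deletes the space, as in the example header; passing `replacement="_"` would keep the two parts of the header visually separate.

```python
import gzip


def _open_text(path, mode):
    """Open plain or gzip-compressed text, chosen by file extension."""
    if path.endswith(".gz"):
        return gzip.open(path, mode + "t")
    return open(path, mode)


def remove_header_spaces(in_path, out_path, replacement=""):
    """Rewrite a FASTQ file, replacing spaces in each record's header line."""
    with _open_text(in_path, "r") as src, _open_text(out_path, "w") as dst:
        for i, line in enumerate(src):
            # Only the first line of every four-line record is a header;
            # sequence, separator and quality lines are copied unchanged.
            if i % 4 == 0:
                line = line.rstrip("\n").replace(" ", replacement) + "\n"
            dst.write(line)
```

Both mate files would need the same treatment so that their headers stay consistent with each other.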

5.2 years ago by zhangdengwei ▴ 210