Question

Track file for the DADA2 pipeline

8 months ago

naveedhasan2000 ▴ 10

sample   input  filtered  denoisedF  denoisedR  merged  nonchim
AH1     162998     95002      94125      94312     309      309
AK      171459    104653     103108     103775      92       92
AW5     126727     83146      81141      81717      38       38
WM      177976    107287     106411     106575     123      123

This is my track file showing the read counts after each step in DADA2. As you can see, there is a considerable loss of reads. My filtering parameters are:

out <- filterAndTrim(fnFs, filtFs, fnRs, filtRs, truncLen=c(220,180),
                     maxN=0, maxEE=c(2,2), truncQ=2, trimLeft=c(10,10),
                     rm.phix=TRUE,
                     compress=TRUE, multithread=TRUE)

My reads are 350 bp, and the quality profiles of the forward and reverse reads are:

[images: quality profiles of the forward and reverse reads]

What am I doing wrong?

microbiome-analysis metagenomics dada2 16s-amplicon • 612 views

score 1 · Answer 1 · 2024-09-22

Okay, there are several factors to consider when denoising with DADA2:

Length of targeted region: First, you need to know which region of the 16S rRNA gene you are targeting, for example V3-V5 (~430 bp). You are using 2x300 bp chemistry, so the two reads together cover 600 bp, which means you can trim away at most 600-430=170 bp in total before the reads stop overlapping. After filtering and merging, your sequences should be ~400-430 bp long.
Quality of reads: Judging from your quality plots, I would suggest filtering settings along these lines: trimLeft=15 or 20 for both reads, and truncLen=220 to 250 for the forward reads and 150 to 200 for the reverse reads. Check whether your reads contain 'N' characters; raising maxN to 5-10 would retain them at the filtering step, although dada() itself does not accept reads containing Ns. You can also try raising maxEE, up to about 5 expected errors.
Expected length of merging region: If your target region is 430 bp, you could in principle remove 170 bp in total from the forward and reverse reads. However, some overlap is needed for merging: the mergePairs function in dada2 uses minOverlap=12 bp by default, so keep at least that much overlap after truncation. You can try lowering minOverlap to about 10 bp, but not further.
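
The overlap arithmetic from the points above can be checked before re-running the pipeline. The following is a minimal base-R sketch, not part of dada2: `expected_overlap` is a hypothetical helper, and the 430 bp amplicon length is only the V3-V5 example used above, so substitute the real length of your amplicon. Since `truncLen` is counted from the start of the raw read and `trimLeft` only removes bases from the outer ends of the amplicon, `trimLeft` does not enter the calculation.

```r
# Hypothetical helper (not a dada2 function): number of bases shared by the
# truncated forward and reverse reads for an amplicon of a given length.
expected_overlap <- function(trunc_f, trunc_r, amplicon_len) {
  trunc_f + trunc_r - amplicon_len
}

amplicon_len <- 430  # ASSUMPTION: example V3-V5 length, not a measured value

# Settings from the question: truncLen = c(220, 180)
expected_overlap(220, 180, amplicon_len)  # -30, the reads do not reach each other

# A longer truncation: truncLen = c(250, 200)
expected_overlap(250, 200, amplicon_len)  # 20
```

Under that assumption, truncLen=c(220,180) leaves no overlap at all, which would be consistent with nearly every read being lost at the merge step in the track table, while truncLen=c(250,200) would leave about 20 bp.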

You will have to try different combinations of these parameters. They need to be tuned for every dataset, especially for low-quality data like yours. If nothing works, the last resort is to merge the reads outside DADA2 with a program such as PEAR and then run DADA2 on the merged reads as single-end data. The risk is that merging tools like PEAR are error-tolerant, so you will see low-quality bases in the middle of each merged read.
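
One way to organise the trial and error is to list candidate truncLen pairs first and keep only those that can still merge. This is a sketch in base R, again assuming a 430 bp amplicon (an example value, not a measurement); the candidate lengths come from the ranges suggested above, and `min_overlap` is set to the lower bound of 10 bp mentioned there. The variable names are illustrative.

```r
amplicon_len <- 430  # ASSUMPTION: example V3-V5 length; replace with your own
min_overlap  <- 10   # smallest overlap you are willing to accept when merging

# Candidate truncation lengths taken from the ranges suggested above
grid <- expand.grid(trunc_f = seq(220, 250, by = 10),
                    trunc_r = seq(150, 200, by = 10))

# Overlap left between the truncated forward and reverse reads
grid$overlap <- grid$trunc_f + grid$trunc_r - amplicon_len

# Keep only the combinations that still leave enough overlap to merge
usable <- grid[grid$overlap >= min_overlap, ]
print(usable, row.names = FALSE)
```

With these assumptions only three of the 24 combinations remain, all at the long end of the suggested ranges. Each remaining pair would then be passed as truncLen to filterAndTrim and the resulting track tables compared.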