I used the nf-core Ampliseq pipeline to analyze 16S metagenomic sequencing data. The lab spiked in a normal control, and it doesn't appear in the DADA2 results.
I BLASTed the raw reads against the reference genome of the spike-in and confirmed that its 16S regions are present in the raw data (tons of hits).
I'm not sure where to go from here to figure out why they are not being picked up by DADA2. I checked the SILVA 138 database, and there are ~700 genomes of that spike-in control indexed in it.
Does anyone have suggestions as to what I can try to determine why the spike-in control isn't showing up in the ASV tables?
Was the spike-in a commercial product, e.g., from Zymo? Can you provide more information about which cells or DNA were spiked into your samples? What do you mean by "not picked up by DADA2": that those genera or species were not found among your annotated sequence features? Which hypervariable region did you sequence?
It's a commercial product, a Staphylococcus hominis genome. This species is not found in the DADA2 results, so I guess DADA2 is not identifying its ASVs in the cleaned reads. There are ~700 S. hominis genomes in SILVA 138. I also identified all the 16S regions from the reference S. hominis genome in my raw reads.
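One check that might help separate "DADA2 never inferred the ASV" from "the ASV is there but was given a different taxonomic label" is to search the ASV sequences themselves against the spike-in's 16S sequence, ignoring the taxonomy column. Below is a minimal Python sketch of that idea, not part of Ampliseq or DADA2. The function names are hypothetical, the sequences in the demo are made-up placeholders, and it only does ungapped matching (no indels), so BLASTing the ASV FASTA against the reference would be the more robust version of the same check.

```python
from typing import Dict, Iterable, List, Optional, Tuple

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA lines into {record_id: uppercase sequence}."""
    records: Dict[str, str] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep only the ID before any whitespace
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def best_hamming(query: str, target: str) -> Optional[int]:
    """Fewest mismatches when placing query, ungapped, at any offset in target.

    Returns None if the query is longer than the target.
    """
    if len(query) > len(target):
        return None
    best = len(query)
    for offset in range(len(target) - len(query) + 1):
        mismatches = 0
        for a, b in zip(query, target[offset:offset + len(query)]):
            if a != b:
                mismatches += 1
                if mismatches >= best:
                    break  # this offset cannot beat the current best
        best = min(best, mismatches)
        if best == 0:
            break
    return best


def find_spike_in_asvs(
    asvs: Dict[str, str], reference_16s: str, max_mismatches: int = 2
) -> List[Tuple[str, int]]:
    """Return (asv_id, mismatches) for ASVs matching the reference in either orientation."""
    hits = []
    for asv_id, seq in asvs.items():
        scores = [
            s
            for s in (
                best_hamming(seq, reference_16s),
                best_hamming(reverse_complement(seq), reference_16s),
            )
            if s is not None
        ]
        if scores and min(scores) <= max_mismatches:
            hits.append((asv_id, min(scores)))
    return sorted(hits, key=lambda h: (h[1], h[0]))


if __name__ == "__main__":
    # Placeholder data; in practice, read the ASV FASTA and the spike-in 16S sequence from files.
    reference = "ACGTTGCAAGGCTTACGATCGATCGGATTACA"
    asvs = parse_fasta([">asv1", "TTGCAAGG", ">asv2", "TCCGATCG", ">asv3", "GGGGGGGG"])
    for asv_id, mismatches in find_spike_in_asvs(asvs, reference, max_mismatches=0):
        print(asv_id, mismatches)
```

If this finds matching ASVs, the problem is more likely in taxonomic assignment (for example, the sequenced region not resolving to species level) than in denoising. If it finds none, the reads are probably being lost earlier, for instance at primer trimming, quality filtering, or merging, and the per-step read counts would be the next thing to look at.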