One of the top features in our 16S sequencing data looks very suspicious. It could not be classified even to the domain level using the sklearn classifier in QIIME2 with the Silva 132 database. When I BLAST the sequence, the only hits have quite low scores (~48) and high E-values. It is exactly the same length as the other, assignable features.
Any ideas what kind of artifact this could be? Or how we can look more into it?
Here is the sequence TAAAAGAGTTGCTGGCGAACGAAGGGCGTCCCGAAGCGACAGTAGATCCTGTTATCGAGGAAATATCGGCAACCGCCGTGGGCCTTACCTTCAAGATCGTAGAGGGTCCCCGGTTCAGAGTCGCCGACATAGAGTTTGAAGGCAACACGGTCTTCTCGAGTTCTCACCTCAGAAAGAACATGAAGCTTGTGAAGAAAGTGGGTCTCCTGACCACCTTCAGCTCGAAGGACATCTACCACAAAGAAAAATTCGAAGCTGATCTTGATCGCCTCCGCGTGCTGGTATACGCCGATCAAGGCTATTTGAAAGCGAGGTTTGGAGAACCGCGAGTCGAGGAAGTAGGCAAAATCGGCAACTGGCTGCCGATCATCGGACACAAGGGCGAGGGGCTGAAGATCGTTGTTCCCGTCGAGGAAGGCCGCCAGTACCGCGCCGCAGAGGTGAAGGTTGAAGACAACACGGAGTTTACGGCCGAGGAAATCAAGGCAATCATTGGTCTCA
Edit: replaced sequence
Sorry, I posted the wrong sequence. Here is the correct one:
TAAAAGAGTTGCTGGCGAACGAAGGGCGTCCCGAAGCGACAGTAGATCCTGTTATCGAGGAAATATCGGCAACCGCCGTGGGCCTTACCTTCAAGATCGTAGAGGGTCCCCGGTTCAGAGTCGCCGACATAGAGTTTGAAGGCAACACGGTCTTCTCGAGTTCTCACCTCAGAAAGAACATGAAGCTTGTGAAGAAAGTGGGTCTCCTGACCACCTTCAGCTCGAAGGACATCTACCACAAAGAAAAATTCGAAGCTGATCTTGATCGCCTCCGCGTGCTGGTATACGCCGATCAAGGCTATTTGAAAGCGAGGTTTGGAGAACCGCGAGTCGAGGAAGTAGGCAAAATCGGCAACTGGCTGCCGATCATCGGACACAAGGGCGAGGGGCTGAAGATCGTTGTTCCCGTCGAGGAAGGCCGCCAGTACCGCGCCGCAGAGGTGAAGGTTGAAGACAACACGGAGTTTACGGCCGAGGAAATCAAGGCAATCATTGGTCTCA
Running that through megablast gives no significant hits, and blastn gives only the short hits that you described, so I guess it is just contamination.
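If you want to repeat the megablast vs. blastn comparison from the command line, here is a minimal sketch in Python. It only builds the two BLAST+ command lines and filters tabular (`-outfmt 6`) output. The file name `feature.fasta`, the `nt` database, and the E-value and bit-score cutoffs are my own assumptions for illustration, not values from this thread, so adjust them to your setup.

```python
# Standard 12 columns of BLAST+ tabular output (-outfmt 6).
OUTFMT6_FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def blast_command(task, query_fasta="feature.fasta", db="nt"):
    """Build a blastn command line for one task ("megablast" or "blastn").

    query_fasta and db are hypothetical defaults; -remote sends the
    search to NCBI instead of using a local database.
    """
    if task not in ("megablast", "blastn"):
        raise ValueError("task must be 'megablast' or 'blastn'")
    return [
        "blastn", "-task", task,
        "-query", query_fasta,
        "-db", db, "-remote",
        "-outfmt", "6",
    ]


def significant_hits(tabular_text, max_evalue=1e-5, min_bitscore=50.0):
    """Keep only hits that pass both cutoffs (cutoffs are assumptions)."""
    hits = []
    for line in tabular_text.splitlines():
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        row = dict(zip(OUTFMT6_FIELDS, line.split("\t")))
        if (float(row["evalue"]) <= max_evalue
                and float(row["bitscore"]) >= min_bitscore):
            hits.append(row)
    return hits


if __name__ == "__main__":
    # Print the two commands so they can be run in a shell.
    for task in ("megablast", "blastn"):
        print(" ".join(blast_command(task)))
```

Save the output of each command to a file and pass its contents to `significant_hits`. An empty list for both tasks would match what was seen here: nothing beyond short, low-scoring hits.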