Hi everyone! I'm quite new to the NGS field. At the moment I'm working with 16S rRNA sequencing on Ion Torrent and trying to find a way to analyze my data. Everything is going more or less OK, but during alignment and taxonomic classification in Mothur I receive many messages about my sequences that look like this:
> 1read-161813 is bad. It has no kmers of length 8. [WARNING]:
> 1read-161813 could not be classified. You can use the remove.lineage
> command with taxon=unknown; to remove such sequences.
And for one particular sample, due to this error, "unclassified" turned out to be 68 000 reads out of 160 000, which seems to me like a lot.
I've searched the internet to understand what a k-mer is, but I'm not sure I understand it completely. Is there anyone here who could explain to me what is going on? >.< Can I just remove these sequences? Or should I change the k-mer length from 8 to, say, 6 and try again?
Thank you!!
See the k-mer entry on Wikipedia.
You already know many specific k-mers: a hexamer is a k-mer of length 6, a dimer is a k-mer of length 2, etc.
dimer, trimer, pentamer, hexamer, heptamer, octamer for sure. nonamer? decamer?
Also read "Oligonucleotide Vs K-Mer", one of my favorite Biostar questions.
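To make the k-mer idea concrete, here is a minimal Python sketch that lists every overlapping substring of length k in a read. The function name `kmers` and the example reads are made up for illustration. Skipping windows that contain anything other than A, C, G or T is an assumption about how a classifier might treat ambiguous bases, not a description of Mothur's internals. Under that assumption, a read yields no 8-mers when it is shorter than 8 bases or when every window of 8 contains an ambiguous base, which is one plausible reading of the "no kmers of length 8" warning.

```python
def kmers(seq, k=8):
    """Return all overlapping length-k substrings made only of A, C, G, T."""
    seq = seq.upper()
    valid = set("ACGT")
    return [
        seq[i:i + k]
        for i in range(len(seq) - k + 1)   # empty range if the read is shorter than k
        if set(seq[i:i + k]) <= valid      # assumption: drop windows with ambiguous bases
    ]

# A 10-base read has 10 - 8 + 1 = 3 overlapping 8-mers.
print(kmers("ACGTACGTAC"))

# A 5-base read is shorter than k, so it has no 8-mers at all.
print(kmers("ACGTA"))

# Here every window of 8 contains an N, so again nothing is left.
print(kmers("ACGTNACGTNAC"))
```

If many of the flagged reads turn out to be very short or full of Ns, that would fit this picture, and it would be worth looking at the read length distribution before deciding whether to remove them or change the k-mer size.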