Hi, everyone!
k-tuple frequency can be used to cluster sequences.
I wonder whether this method can also be used when the target sequences are very short, like reads from next-generation sequencing. Will it be unstable? (I have read some papers, and all of them use k-tuple frequency only to catalog different meta-data, not to cluster the reads within those meta-data.)
Any hints or papers about this would be helpful! Thank you!
Thanks for the reference, Dan (disclaimer: I'm one of the Sailfish authors)! Another good example of using k-mers with NGS reads is Kraken, which uses k-mers to quickly and accurately classify NGS metagenomic reads.
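For anyone who wants to try this on their own reads, here is a minimal sketch of what a k-tuple frequency profile looks like for a short read. It is not how Sailfish or Kraken work internally; the function names `kmer_profile` and `euclidean`, the choice of k = 3, and the toy reads are all made up for illustration. Note that a read of length L contains only L - k + 1 k-mers, so with a larger k (say k = 6, giving 4096 possible k-mers) a 100 bp read fills very few bins, which is one reason the profile may be noisy for short reads.

```python
from collections import Counter
from itertools import product


def kmer_profile(seq, k=3):
    """Return normalized k-mer frequencies over all 4**k DNA k-mers.

    Hypothetical helper for illustration; k-mers containing characters
    other than A, C, G, T (for example N) are ignored.
    """
    seq = seq.upper()
    # Count every overlapping window of length k in the read.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Fixed ordering of all possible k-mers so profiles are comparable.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def euclidean(p, q):
    """Euclidean distance between two equally long frequency vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


# Toy reads (invented): read_a and read_b differ only in the last base.
read_a = "ACGTACGTAGCTAGCTAGGATCCA"
read_b = "ACGTACGTAGCTAGCTAGGATCCT"
read_c = "TTTTTTTTGGGGGGGGCCCCCCCC"

profile_a = kmer_profile(read_a)
profile_b = kmer_profile(read_b)
profile_c = kmer_profile(read_c)

print("d(a, b) =", euclidean(profile_a, profile_b))
print("d(a, c) =", euclidean(profile_a, profile_c))
```

The resulting distances could be fed into any standard clustering method; whether the clusters are stable for real reads is exactly the open question above, so treat this only as a starting point for experimenting.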
Thank you!!
Thank you very much!