Hi guys,
Just wondering if there is a tool out there that can do de novo pattern discovery given NGS reads as input (mainly from Illumina).
We are trying to find out whether any particular sequence is enriched in the reads, which might point to linker/adaptor carry-over from the library creation step. We can usually map the reads back to such sequences, but if the contaminating sequence is not in the database we search against, we can miss a possible contaminant.
One can do something similar by hashing full reads or k-mers of the reads, but I was just wondering if there is already a tool out there that does this in a slick way.
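For reference, the brute-force k-mer hashing idea could look roughly like the Python sketch below. It is only a minimal illustration under some assumptions: the input is plain 4-line FASTQ, everything fits in memory, and the function names (read_sequences, count_kmers, top_kmers) and the default k of 12 are made up for this example and do not come from any existing tool.

```python
from collections import Counter
from itertools import islice


def read_sequences(fastq_lines):
    """Yield the sequence line of each 4-line FASTQ record."""
    # Assumption: strict 4-line records, so the sequence is every 4th line
    # starting from the second line of the input.
    for line in islice(fastq_lines, 1, None, 4):
        seq = line.strip().upper()
        if seq:
            yield seq


def count_kmers(sequences, k=12):
    """Count every k-mer across all reads, skipping k-mers that contain N."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


def top_kmers(fastq_lines, k=12, n=10):
    """Return the n most frequent k-mers as (kmer, count) pairs."""
    return count_kmers(read_sequences(fastq_lines), k).most_common(n)
```

To try it, open a FASTQ file (the name reads.fastq here is hypothetical) and pass the file handle to top_kmers(fh, k=12, n=20). K-mers that come out far more frequent than the rest would be the candidates to check as possible linker/adaptor sequence. Keeping every k-mer in a Counter will not scale to a full Illumina run, so in practice one would probably subsample the reads first.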
-Abhi