Can anybody suggest tools to identify SSRs in a reference-based assembly? Can MISA be used in that case?
RepeatAnalyzer: a tool for analysing and managing short-sequence repeat data https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-016-2686-2
In the paper they mention Tandem Repeats Finder, scan_for_matches and the ALGGEN software suite.
I realized that SSR-identification algorithms rely on simple pattern matching, about as simple as regex programming. So it does not matter much which FASTA sequence you supply to your favourite SSR-finding program: MISA will work as well on a consensus sequence as on an assembled genome.
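To illustrate the "simple pattern matching" point, here is a minimal Python sketch of a regex-based SSR scan. It is not MISA's implementation. The function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are illustrative assumptions, not values taken from this thread. The pattern only matches A/C/G/T, so runs of N are never reported as repeats.

```python
import re

# Assumed, illustrative thresholds: motif length -> minimum number of repeats.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, end, motif, repeat_count) tuples, 1-based inclusive."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are just a shorter motif repeated (e.g. "AA", "ATAT").
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start() + 1, m.end(), motif,
                         len(m.group(1)) // unit_len))
    return sorted(hits)

# Toy consensus-like sequence with an N gap between two repeats.
example = "GGC" + "AT" * 7 + "NNNN" + "A" * 12 + "CCG"
for hit in find_ssrs(example):
    print(hit)
```

The same function works on any sequence string, whether it came from a consensus FASTA or an assembled genome, which is the point being made above.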
So the point is:
My consensus sequence has 'N's. Before I run any SSR identification tool, should I remove the 'N's? (I guess yes.)
OR
Should I fill the gaps (Ns) in the consensus sequence with a gap-filling tool?
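Whichever option is chosen, it may help to first see where the N runs are and how long they are. Deleting Ns joins the bases on either side of the gap, so any SSR reported across a former gap would deserve a second look. A minimal sketch, where the helper name `n_runs` is hypothetical:

```python
import re

def n_runs(seq):
    """Return (start, end, length) for each run of N, 1-based inclusive."""
    return [(m.start() + 1, m.end(), m.end() - m.start())
            for m in re.finditer(r"N+", seq.upper())]

# Toy example: two gaps in a short consensus-like sequence.
for run in n_runs("ACGTNNNNACGNNA"):
    print(run)
```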