Question

How to identify SSRs (simple sequence repeats) in a reference-based assembly?

8.0 years ago

lakhujanivijay 5.9k

Can anybody suggest tools for identifying SSRs in a reference-based assembly? Can MISA be used in that case?

SSR • 1.9k views


score 0 · Answer 1 · 2016-11-21

8.0 years ago

Tonor ▴ 480

RepeatAnalyzer: a tool for analysing and managing short-sequence repeat data https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-016-2686-2

In the paper they mention Tandem Repeats Finder, scan_for_matches and the ALGGEN software suite.

So the point is:

I have mapped high-quality (HQ) reads to my reference genome and then called the consensus sequence from that mapping.

My consensus sequence contains Ns. Before I run any SSR identification tool, should I remove the Ns (I guess yes)?

Or should I fill the gaps (Ns) in the consensus sequence with a gap-filling tool?

Reply • 8.0 years ago by lakhujanivijay 5.9k

score 0 · Answer 2 · 2016-12-08

I realized that SSR-finding algorithms rely on simple pattern matching, about as simple as regex programming. Hence it does not matter much, as long as you know which FASTA sequence you are supplying to your favourite SSR-finding program. MISA will work as well on a consensus sequence as on an assembled genome.
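
To illustrate the pattern-matching idea, here is a minimal Python sketch of a regex-based SSR scan. It is not MISA's implementation: the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration. Because the pattern only accepts A, C, G and T, runs of N in a consensus sequence are never reported as repeats, and an SSR interrupted by Ns is simply split at the gap.

```python
import re

# Assumed thresholds for illustration: motif length -> minimum number of repeats.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    return (motif + motif).find(motif, 1) == len(motif)


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples, 1-based inclusive."""
    seq = seq.upper()  # treat soft-masked (lowercase) bases like any other base
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len unambiguous bases followed by at least
        # (min_rep - 1) further copies of itself; N never matches.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        pos = 0
        while True:
            m = pattern.search(seq, pos)
            if m is None:
                break
            motif = m.group(1)
            if is_primitive(motif):
                count = len(m.group(0)) // unit_len
                hits.append((m.start() + 1, m.end(), motif, count))
                pos = m.end()
            else:
                # Reducible motif (already covered by a shorter unit length):
                # step one base forward so an overlapping real SSR is not missed.
                pos = m.start() + 1
    return sorted(hits)
```

Calling `find_ssrs` on each record of the consensus FASTA (for example, sequences read with any FASTA parser) gives one list of hits per sequence; the Ns can stay in place, since they only break repeats that span a gap.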