I'm working on a plant bioinformatics task to test the hypothesis that genes containing SSRs are one of the sources that disrupt existing genes or generate new ones. What tools can I use to detect plant microsatellites,
and what type of inputs/information do I need to use them?
You can try running censor to find the microsatellite information. The results are written to an output file with a .map extension. Since you only need plant microsatellites, you can set -lib to plnrep.ref.
EDIT: -lib specifies the library you want to search. If you put -lib plnrep.ref in your censor command, censor searches the plant repeat database. (By default, censor searches all databases, including human and other organisms you are not interested in.) But I honestly do not know whether microsatellites are specific to an organism, as pgibas pointed out.
Regarding your note about whether microsatellites are organism-specific: from my limited knowledge of this field, I can say that they are abundant in all organisms but differ in structure and function.
Can any repeat expert confirm this?
updated 4.9 years ago by Ram • written 10.5 years ago by Bara'a
Maybe try posting it as a separate question instead of asking in a comment. You may get a clearer answer from other users.
Why not Tandem Repeats Finder (TRF)? I guess it is one of the oldest and most widely used tools for identifying tandemly repeated sequences (including microsatellites). I use it a lot and don't need anything else.
Input is sequences in FASTA format. You can use a single sequence or a FASTA file with multiple sequences; you can run it on a whole genome or on separate gene sequences. The output is informative and easy to parse. There are many options to play around with, for example "maximum period size": if it is "1", you'll get only repeats like AAAAAA; if it is "2", you'll get AAAAA and CGCGCGC.
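To make the "maximum period size" idea concrete, here is a minimal Python sketch that finds perfect repeats with a regular expression. It is not TRF and does not use TRF's algorithm (TRF also reports approximate repeats, which this ignores); the function name find_ssrs and the parameters max_period and min_copies are made up for illustration.

```python
import re


def find_ssrs(sequence, max_period=2, min_copies=3):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats.

    Hypothetical helper for illustration only; not part of TRF or censor.
    """
    seq = sequence.upper()
    hits = []
    for period in range(1, max_period + 1):
        # One motif of `period` bases, then at least (min_copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (period, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA" is "A" x 2),
            # so a poly-A run is reported once, at period 1.
            redundant = any(
                period % k == 0 and motif == motif[:k] * (period // k)
                for k in range(1, period)
            )
            if redundant:
                continue
            copies = len(match.group(0)) // period
            hits.append((match.start(), motif, copies))
    return sorted(hits)


if __name__ == "__main__":
    example = "TTAAAAAAGTCGCGCGCT"
    print(find_ssrs(example, max_period=1))
    print(find_ssrs(example, max_period=2))
```

With max_period=1 only the poly-A run is reported; raising it to 2 also picks up the CG repeat, which is the behaviour described above. For real data you would still run TRF itself.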
True. But I think TRF is a de novo repeat finder. Even if TRF finds repeats, the user still needs to rely on an existing database to verify whether a repeat is plant-specific or not.
Are microsatellites that specific? I mean, long satellites can be species-specific, but short satellites (like ACAC...) should be a more universal thing, shouldn't they? I am just guessing.
Nonetheless, if the OP is just starting out, TRF might be a good way to start playing around with sequences and bioinformatic data.
Thanks a lot, Pgibas. I appreciate your reply; I can't thank you enough.
But what about the point Prakki Rama mentioned?
Can I rely on TRF alone to get started with my data analysis or not? If so, could any repeat expert in the forum point me in the right direction, please?
Check my edit.
Thank you; this is a much better answer :)