Hi everybody, I have a .misa file and I want to calculate the abundance of each class of SSRs in it. For example, among the dinucleotide repeats, which motif occurs most often, and likewise for the tri-, tetra-, penta- and hexanucleotide repeats? Is there any software or script that can count this?
Thanks in advance for the help.
Do you want the abundance of each repeat sequence (motif), or the abundance of mono-, di-, tri-, tetra-, etc. repeats (just the length of the repeat unit, not the sequence)?
If you are interested in the abundance by repeat length, then it's pretty simple. There is a column named SSR type, in which p1 means mono-nucleotide repeat, p2 means di-nucleotide repeat, and so on. So open the file in Excel and make a pivot table of the SSR type column; you will get the abundance by repeat length.
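If you would rather script it than use Excel, and you also want the most frequent motif within each class (which is what the question asks), here is a minimal Python sketch. It rests on a few assumptions that you should check against your own file: that the .misa file is tab-separated with a header row, that besides the SSR type column there is a column named SSR, and that perfect repeats are written there like (AT)6. The function name count_ssrs and the sample rows are made up for illustration only.

```python
import csv
import re
from collections import Counter, defaultdict

# Assumed notation for a perfect repeat in the "SSR" column, e.g. "(AT)6".
MOTIF_RE = re.compile(r"^\(([ACGTN]+)\)(\d+)$", re.IGNORECASE)


def count_ssrs(lines):
    """Count SSRs per type and motifs per type from MISA-style lines.

    `lines` is any iterable of tab-separated text lines with a header row
    (an open file handle works). Column names are assumptions.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    type_counts = Counter()              # e.g. {"p2": 120, "p3": 45, ...}
    motif_counts = defaultdict(Counter)  # e.g. {"p2": {"AT": 80, ...}}
    for row in reader:
        ssr_type = (row.get("SSR type") or "").strip()
        ssr = (row.get("SSR") or "").strip()
        if not ssr_type:
            continue  # skip blank or malformed rows
        type_counts[ssr_type] += 1
        match = MOTIF_RE.match(ssr)
        if match:  # compound entries do not match and are only counted by type
            motif_counts[ssr_type][match.group(1).upper()] += 1
    return type_counts, motif_counts


# Made-up rows, only to show the call; replace with: open("your_file.misa")
sample = [
    "ID\tSSR nr.\tSSR type\tSSR\tsize\tstart\tend",
    "seq1\t1\tp2\t(AT)6\t12\t1\t12",
    "seq1\t2\tp2\t(AT)8\t16\t30\t45",
    "seq2\t1\tp3\t(GAA)5\t15\t5\t19",
]
type_counts, motif_counts = count_ssrs(sample)
for ssr_type in sorted(type_counts):
    print(ssr_type, type_counts[ssr_type], motif_counts[ssr_type].most_common(1))
```

Note that this counts motifs exactly as they are written, so AT and TA (or a motif and its reverse complement) are tallied separately. If you want them merged, you would need to normalize the motif before counting.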