Hi everyone
I’m working on a project where I need to count the number of transposable elements (TEs) on each human chromosome, for different TE families. The problem is that I use the data from the UCSC RepeatMasker track, where TEs are classified by family. What I would like to do is classify them by percentage identity to a consensus sequence.
The reason is that a TE family or superfamily contains too many incomplete TEs, or TEs that are too different from the original sequence.
I already know where to find the consensus sequence for any TE family (on Repbase Update).
I tried to use Visual Repbase, without success. If anyone has a solution I would be very grateful.
Thank you.
Thanks! That's what I was thinking, but RepeatMasker runs on UNIX systems, which I have never used before, so I'm still learning how to run it properly.
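For anyone with the same question: the UCSC rmsk table already carries a per-hit divergence from the consensus (the milliDiv column, base mismatches in parts per thousand), so a first pass may be possible without rerunning RepeatMasker. Below is a minimal Python sketch, assuming the tab-separated rmsk.txt dump with the standard 17 columns. The function name count_te, the 90% threshold and the approximation identity = 100 - milliDiv/10 (which ignores insertions and deletions) are my own assumptions, not something taken from UCSC or RepeatMasker.

```python
from collections import Counter

# Column positions in the UCSC rmsk table dump (tab-separated).
MILLI_DIV, GENO_NAME, REP_FAMILY = 2, 5, 12


def count_te(lines, min_identity=90.0):
    """Count repeats per (chromosome, family) above an identity cutoff.

    Identity is approximated as 100 - milliDiv / 10, i.e. mismatches only;
    this is an assumption of the sketch, indels are not taken into account.
    """
    counts = Counter()
    for line in lines:
        # Skip an optional header line (Table Browser exports start with '#').
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 17:
            continue  # not a complete rmsk row
        identity = 100.0 - int(fields[MILLI_DIV]) / 10.0
        if identity >= min_identity:
            counts[(fields[GENO_NAME], fields[REP_FAMILY])] += 1
    return counts
```

To try it, open the downloaded file (for example with gzip.open("rmsk.txt.gz", "rt"), where the path is hypothetical) and pass the file handle to count_te. The result maps each (chromosome, family) pair to the number of copies that pass the cutoff. Incomplete copies could be filtered in the same loop using the repStart, repEnd and repLeft columns.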