Annotation of repetitive elements
8.2 years ago
Rahul ▴ 30

Hello,

  1. What is the ideal cut-off (optimum score) for a Repbase Censor alignment?
  2. Can Repbase Censor be used instead of RepeatMasker (http://www.repeatmasker.org/)?
  3. What minimum alignment length should be required?

I would be very grateful for any comments.

repbase censor repeatmasker • 2.5k views

We need more information on what you're trying to do to answer this properly.

However, for 2), RepeatMasker significantly outperforms Censor. Unless you have a reason not to use it, RepeatMasker is basically the de facto standard.


Dear Sir, thank you very much for your reply. I have assembled the transcriptome of a non-model plant de novo using Trinity, CD-HIT-EST and CAP3. I am now mining the transcriptome data for transposable elements (including small fragments of TEs within protein-coding genes). I was thinking of using Repbase Censor for the mining, but I cannot find optimum parameters such as the cut-off score and minimum length for the alignment.

I would be thankful for any comments.

Regards, Rahul

8.2 years ago
Amitm ★ 2.3k

Hi, your question doesn't say which repeats you are looking at or what the source sequence is. Are they TEs or simple repeats? I have some experience with the former. If this is human or another well-studied model organism, pre-annotated repeat information should be available from the RepeatMasker or UCSC Genome Browser sites.

If your organism is not covered, check which type of repeat is in question. Repeats such as TEs have canonical lengths. This information can be obtained from GIRI (after free registration); you probably have it already.

I have used RepeatMasker; it judges whether a given sequence is of repeat origin based on its own thresholds (you can of course use a more sensitive setting), and that is all. You need not worry further about, for example, whether the alignment length was sufficient.

If you are checking transcribed sequences, or genomic units such as exons or gene loci, which may contain only a fraction of a genomic repeat element, length can be a consideration. For exons, to call one repeat-harbouring, I used a threshold of 10% of the exon length or >25 bp.
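The exon rule above could be sketched roughly as follows. This is only an illustration: the function names, the 1-based inclusive coordinates, and the reading of the rule as "either condition is enough, tested per repeat hit" are assumptions, not something fixed by the thresholds themselves.

```python
def overlap_bp(a_start, a_end, b_start, b_end):
    """Overlap in bp between two 1-based, inclusive intervals (0 if disjoint)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def is_repeat_harbouring(exon, repeats, min_fraction=0.10, min_bp=25):
    """Return True if any single repeat hit covers at least `min_fraction`
    of the exon, or more than `min_bp` bases of it.

    exon    -- (start, end) tuple, 1-based inclusive (assumed convention)
    repeats -- iterable of (start, end) tuples on the same sequence
    """
    exon_start, exon_end = exon
    exon_len = exon_end - exon_start + 1
    for rep_start, rep_end in repeats:
        ov = overlap_bp(exon_start, exon_end, rep_start, rep_end)
        if ov >= min_fraction * exon_len or ov > min_bp:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical exon and repeat coordinates, for illustration only.
    print(is_repeat_harbouring((1, 1000), [(971, 1050)]))
```

Overlaps are tested one hit at a time here; if several fragments of the same element should count together, the intervals would need to be merged first.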

Lastly, things might be different if you are looking at simple repeats. My experience is with TEs.


Dear Sir, I have assembled the transcriptome of a non-model plant de novo using Trinity, CD-HIT-EST and CAP3. I have also finished the annotation and want to extend my study to transposable elements. SSR mining was done with the MISA program. I am now trying to find the transposable elements present in the transcriptome data, and for that I thought of using Repbase Censor, but because I could not find the cut-off scores and length, I am unsure which parameters to choose. For RepeatMasker, most papers mention a cut-off score of around 250-300 (RM score). I would be very grateful if you could help me in this regard.

Sincerely, Rahul
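For reference, a score cut-off like the one mentioned for RepeatMasker could be applied after the run by filtering the `.out` table. The sketch below assumes the standard whitespace-separated `.out` layout (score in the first column, then query name, query begin/end, strand, repeat name and class/family); the function name and the default of 250 are illustrative only, not a recommended value.

```python
def filter_rm_hits(lines, min_score=250):
    """Keep RepeatMasker .out rows whose score (first column) is >= min_score.

    `lines` is any iterable of text lines, e.g. an open file handle.
    Header and blank lines are skipped because their first field is not a number.
    """
    hits = []
    for line in lines:
        fields = line.split()
        if len(fields) < 11 or not fields[0].isdigit():
            continue  # header, blank, or malformed line
        score = int(fields[0])
        if score < min_score:
            continue
        hits.append({
            "score": score,
            "query": fields[4],
            "start": int(fields[5]),
            "end": int(fields[6]),
            "strand": fields[8],        # "+" or "C" (complement)
            "repeat": fields[9],
            "class_family": fields[10],
        })
    return hits
```

It would be used along the lines of `with open("transcripts.fasta.out") as fh: hits = filter_rm_hits(fh, min_score=250)`, where the file name is hypothetical. The retained hits can then be checked for length separately.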
