Mapping to repeat consensus sequences?
6.1 years ago
a.rex ▴ 350

I have ChIP-seq (H3K9me3) and RNA-seq data that I wish to map to transposons. I have identified the transposons in the genome of my non-model species using RepeatModeler and RepeatMasker. The RepeatModeler output confuses me, though: the consensi.fa.classified file contains, for example, multiple LTR/Gypsy families, each named rnd-x_family-x (where each x is a number). What does this mean?

How can I obtain consensus sequences for LTRs or for specific LTR families? In the RepeatModeler consensi.fa.classified file I have sequences named rnd-x_family-x: what does this mean? Can I simply map to these sequences?

RepeatModeler RepeatMasker • 2.2k views

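RepeatModeler finds the repeated sequences in the genome de novo. It gives each repeat family a numbered identifier (the rnd-x_family-x name, with your x) and compares its consensus to Repbase. If the sequence has a hit, it is annotated with the classification of that Repbase hit (for example LTR/Gypsy).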

How can I obtain consensus sequences for LTRs or specific families of LTRs?

The consensi.fa file already contains the consensus sequences: one per family. RepeatModeler builds each consensus from the copies of that family it finds in the genome. You can't get a single consensus for all LTRs, though, because the different families are far too divergent from one another.
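To pull out the consensus sequences of one class (say, every LTR/Gypsy family), it may be enough to filter the FASTA headers. The following is only a minimal sketch: it assumes headers of the form `>rnd-1_family-5#LTR/Gypsy ( ... )`, with the classification after the `#`, and the function names and example records are made up for illustration.

```python
from collections import defaultdict

def parse_classified_fasta(lines):
    """Yield (family_id, classification, sequence) from FASTA lines.

    Assumption: headers look like '>rnd-1_family-5#LTR/Gypsy ( ... )'.
    """
    name, cls, seq = None, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, cls, "".join(seq)
            token = line[1:].split()[0]          # drop the free-text description
            name, _, cls = token.partition("#")  # split the ID from the class
            cls = cls or "Unknown"
            seq = []
        else:
            seq.append(line)
    if name is not None:
        yield name, cls, "".join(seq)

def group_by_class(lines):
    """Return {classification: [(family_id, sequence), ...]}."""
    groups = defaultdict(list)
    for name, cls, seq in parse_classified_fasta(lines):
        groups[cls].append((name, seq))
    return dict(groups)

# Hypothetical records standing in for consensi.fa.classified
example = """>rnd-1_family-5#LTR/Gypsy ( hypothetical description )
ACGTACGT
TTGA
>rnd-1_family-12#LINE/L1
GGGCCC
>rnd-2_family-3#LTR/Gypsy
ATATAT
""".splitlines()

groups = group_by_class(example)
gypsy_ids = [name for name, _ in groups["LTR/Gypsy"]]
print(gypsy_ids)
```

With a real file you would pass an open file handle instead of the `example` list. Each LTR/Gypsy family stays a separate consensus; the sketch only groups them, it does not merge them.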

In the repeatmodeler consensi.fa.classified file I have sequences corresponding to rnd-x_family-x - what does this mean?

I don't get this question.

Can I simply map to these sequences?

I would not advise it. Instead, map the ChIP-seq reads to the genome, locate your transposable element copies on the genome (with RepeatMasker, using consensi.fa as the library), and check whether any peaks fall on TEs.
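The last step, checking which peaks fall on annotated TEs, comes down to an interval overlap. Here is a rough sketch, assuming the peaks and the RepeatMasker annotations have already been parsed into tuples with 0-based, half-open coordinates; the scaffold names, coordinates and family IDs are invented. In practice a dedicated tool such as bedtools intersect would normally do this job.

```python
def peaks_on_tes(peaks, tes):
    """Return (peak, family) pairs for every peak/TE overlap.

    peaks: list of (chrom, start, end)
    tes:   list of (chrom, start, end, family)
    Coordinates are assumed to be 0-based and half-open.
    """
    # Index TE copies by chromosome so each peak is only compared
    # against annotations on its own sequence.
    by_chrom = {}
    for chrom, start, end, family in tes:
        by_chrom.setdefault(chrom, []).append((start, end, family))

    hits = []
    for chrom, p_start, p_end in peaks:
        for t_start, t_end, family in by_chrom.get(chrom, []):
            # Two half-open intervals overlap if each starts before the other ends
            if p_start < t_end and t_start < p_end:
                hits.append(((chrom, p_start, p_end), family))
    return hits

# Hypothetical peaks and TE copies
peaks = [("scaffold1", 100, 500), ("scaffold1", 2000, 2300), ("scaffold2", 50, 150)]
tes = [
    ("scaffold1", 450, 900, "rnd-1_family-5"),
    ("scaffold1", 1000, 1500, "rnd-1_family-12"),
    ("scaffold2", 150, 400, "rnd-2_family-3"),
]

hits = peaks_on_tes(peaks, tes)
print(hits)
```

The nested loop is fine for a toy example but scales poorly; for a whole genome, sorted intervals or an interval tree would be the usual choice.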


Suppose, for example, I want to compare H3K9me3 density between LTRs and LINEs. Using the RepeatMasker GFF output, could I compare the total number of reads mapping to annotated LTRs with the number mapping to LINEs? The reads would be normalised to background (input).
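One way to express such a comparison is a log2 ratio of library-size-normalised counts (ChIP over input) per repeat class. The sketch below assumes the per-class read counts have already been obtained elsewhere; every number in it is invented, and the pseudocount is an arbitrary choice to avoid division by zero.

```python
import math

def log2_enrichment(chip_counts, input_counts, chip_total, input_total, pseudocount=1.0):
    """Return {class: log2(ChIP CPM / input CPM)} for each repeat class.

    chip_counts / input_counts: reads overlapping each class
    chip_total / input_total:   total mapped reads per library
    """
    result = {}
    for cls in chip_counts:
        # Counts per million mapped reads, so libraries of different depth compare
        chip_cpm = (chip_counts[cls] + pseudocount) / chip_total * 1e6
        input_cpm = (input_counts.get(cls, 0) + pseudocount) / input_total * 1e6
        result[cls] = math.log2(chip_cpm / input_cpm)
    return result

# Hypothetical counts: reads overlapping annotated LTRs and LINEs
chip_counts = {"LTR": 4000, "LINE": 1000}
input_counts = {"LTR": 1000, "LINE": 1000}

enrich = log2_enrichment(chip_counts, input_counts,
                         chip_total=1_000_000, input_total=1_000_000)
for cls, value in enrich.items():
    print(f"{cls}\t{value:.2f}")
```

Because TE copies are repetitive, how multi-mapping reads are handled during alignment and counting will affect these numbers, so that choice should be made deliberately.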


I am not a ChIP-seq specialist, but my approach would be to do the peak calling first and then compare the peaks to the RepeatMasker GFF.
