Question

ChIP-seq analysis on repeat genes

14 months ago

pb11 ▴ 30

Hello everyone

I need some tips on how to analyze my data and make heatmaps like those in panels C and G of this figure. I have already downloaded the RepeatMasker annotation list. I want to compare my chromatin data with repeat elements and look at their enrichment.

[Image: the figure referred to above, including panels C and G]

I am not sure how to define the summit point, since repeat elements are spread all over the genome and it may not be possible to capture them in the right format. Any help or suggestions would be appreciated.

Thanks
PB

galaxy repeatmasker ChIP-seq repeat-elements ATAC-seq • 2.0k views

ADD COMMENT • link updated 14 months ago by rfran010 ★ 1.3k • written 14 months ago by pb11 ▴ 30

"I want to compare my chromatin data with repeat elements and see their enrichment."

Can you elaborate on exactly what question you are trying to answer with a figure? As ATpoint mentioned, assuming you've taken an alignment approach that survives the repeat mapping problem, what are you comparing to what? Chromatin data in heatmaps is often defined around a biological feature. Do you have a biological feature of interest? Is a functional biological mark relevant to your question? What kinds of repeats are you referring to (there are many many different kinds, with different characteristics)? Is the data in the figure above just an example of a heatmap, or is this figure specifically relevant to your analysis? If so, what paper is it from? When you say "compare my chromatin data" I would ask: how many questions do you have? State each one in a sentence, and go from there.

ADD REPLY • link 14 months ago by seidel 11k

Do you have a biological feature of interest?

Yes, we are looking at this from an enhancer perspective.

Is a functional biological mark relevant to your question?

Yes, we see enrichment of the enhancer mark H3K27ac at ERV loci.

What kinds of repeats are you referring to (there are many many different kinds, with different characteristics)?

LINEs, SINEs, ERVs.

Is the data in the figure above just an example of a heatmap, or is this figure specifically relevant to your analysis? If so, what paper is it from?

You can see the paper here: https://academic.oup.com/nar/article/51/10/4745/7067945

When you say "compare my chromatin data" I would ask how many questions do you have?

I have a lot of questions :P.

ADD REPLY • link updated 14 months ago by Ram 44k • written 14 months ago by pb11 ▴ 30

Yeah, I get that. I was wondering how to make these plots. I have made similar plots for genes, but I am not sure how to make them for repeat elements, considering they are scattered across the genome.

ADD REPLY • link 14 months ago by pb11 ▴ 30

Use the ChIP-seq (or ATAC-seq) peaks that are enriched for the TEs/repeats you are interested in.

A basic way to do that is to take your peaks and use bedtools to keep the ones that overlap the RepeatMasker annotations. You could use other "enriched" thresholds as well, but either way you will be left with the subset of your peaks that is enriched for repeat elements. Repeat element analysis is usually RepeatMasker based.
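To make the overlap step concrete, here is a minimal pure-Python sketch of the filtering logic. In practice you would normally do this with bedtools on BED files; the function name, the tuple layout and the `families` argument below are assumptions made for illustration, not part of any tool.

```python
from collections import defaultdict


def peaks_overlapping_repeats(peaks, repeats, families):
    """Keep peaks that overlap at least one repeat from the chosen families.

    peaks:    list of (chrom, start, end), 0-based half-open (BED-style)
    repeats:  list of (chrom, start, end, family), e.g. parsed from a
              RepeatMasker table (layout assumed for this sketch)
    families: set of repeat family names to keep, e.g. {"MER50"}
    """
    # Index the selected repeats by chromosome.
    by_chrom = defaultdict(list)
    for chrom, start, end, family in repeats:
        if family in families:
            by_chrom[chrom].append((start, end))

    kept = []
    for chrom, p_start, p_end in peaks:
        # Two half-open intervals overlap if each starts before the other ends.
        # A linear scan is enough for a sketch; real data needs a proper index.
        if any(r_start < p_end and p_start < r_end
               for r_start, r_end in by_chrom.get(chrom, [])):
            kept.append((chrom, p_start, p_end))
    return kept
```

Calling it with something like `peaks_overlapping_repeats(peaks, repeats, {"MER50"})` returns only the peaks touching a MER50 annotation; those are the regions you would then feed into the heatmap step.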

I'm not sure if there was something else you thought was different about this analysis from what you've done before?

ADD REPLY • link 14 months ago by rfran010 ★ 1.3k

I tried that method and I am unable to capture that. I see in my RNA-seq that repeat elements are enriched, and when I look at the ChIP-seq marks over them I can see they are bound by H3K27ac. I just want to build aggregate plots and heatmaps to show this.

ADD REPLY • link 14 months ago by pb11 ▴ 30

Unable to capture what?

If you see H3K27ac over repeats, then once you select the H3K27ac peaks that overlap the repeats you want, the heatmap/metaplot will show enrichment. This must be true, since you are pre-selecting H3K27ac peaks.
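For what it's worth, the heatmap/metaplot itself is just a matrix of binned signal centered on each selected peak. Below is a rough numpy sketch of that idea. It assumes per-base coverage is already loaded into one array per chromosome (for example from a bigWig); `signal_matrix`, `flank` and `bin_size` are hypothetical names chosen for this example.

```python
import numpy as np


def signal_matrix(coverage, regions, flank=2000, bin_size=100):
    """Binned signal around the midpoint of each region.

    coverage: dict mapping chrom -> 1D numpy array of per-base signal
              (assumed to be loaded beforehand)
    regions:  list of (chrom, start, end); each window is centered on
              the region midpoint and spans +/- flank bases
    Returns an array of shape (n_regions_kept, 2 * flank // bin_size).
    """
    if (2 * flank) % bin_size != 0:
        raise ValueError("2 * flank must be a multiple of bin_size")
    n_bins = (2 * flank) // bin_size

    rows = []
    for chrom, start, end in regions:
        center = (start + end) // 2
        lo, hi = center - flank, center + flank
        track = coverage[chrom]
        if lo < 0 or hi > len(track):
            continue  # skip windows that run off the chromosome
        window = track[lo:hi]
        # Average the per-base signal inside each bin.
        rows.append(window.reshape(n_bins, bin_size).mean(axis=1))

    return np.vstack(rows) if rows else np.empty((0, n_bins))
```

Each row is one heatmap line, and `matrix.mean(axis=0)` gives the aggregate (metaplot) profile. Sorting the rows by their total signal before plotting usually reproduces the familiar heatmap look.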

ADD REPLY • link 14 months ago by rfran010 ★ 1.3k

You should check the methods. It looks like they may have centered on peaks that overlap a repeat element, or on the LTR regions. I'm guessing the former, since they show an internal region, and I can't imagine you would get such nice K27ac enhancer peaks by taking the summit from any sort of annotation rather than from the peaks. I wouldn't be surprised if they even specifically selected certain subsets of peaks.

When doing your own repeat analysis, I think it's important to question your assumptions. The obvious one: if you keep multimappers and you see signal over an element, it may or may not be real. Likewise, if you only keep unique mappers, then signal will be depleted. What does that mean for your comparisons?

I like the idea of keeping one alignment per multimapper at random (default bowtie2 behavior, I believe) for TE analysis. Just be careful: if you subset a family of TEs to compare them, the subset may not be representative. It's also hard to compare between different TEs. For example, LTR8 and MER50 show different K9me3 signatures, but that might be due to the nature of the repeats rather than a real event. Comparing the same TE family between conditions should be okay, though. Just keep your analysis more global, since individual loci may be falsely mapped, but those reads would be expected to stay within the family, if that makes any sense....

Another consideration would be to analyze the flanks of your repetitive regions, since chromatin marks may spread out into uniquely mappable regions.
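As a sketch of the flank idea: for each repeat, take a fixed window on either side and measure signal there instead of (or in addition to) over the repeat body. The helper below is hypothetical and ignores strand; `chrom_sizes` is assumed to be a dict of chromosome lengths that you supply.

```python
def flanking_regions(repeats, chrom_sizes, flank=1000):
    """Return the left and right flanks of each repeat interval.

    repeats:     list of (chrom, start, end), 0-based half-open
    chrom_sizes: dict mapping chrom -> chromosome length
    flank:       flank width in bases (arbitrary default for this sketch)
    Flanks are clipped at chromosome ends; empty flanks are dropped.
    """
    flanks = []
    for chrom, start, end in repeats:
        left_start = max(0, start - flank)
        if left_start < start:
            flanks.append((chrom, left_start, start, "left"))
        right_end = min(chrom_sizes[chrom], end + flank)
        if end < right_end:
            flanks.append((chrom, end, right_end, "right"))
    return flanks
```

Note that flanks of neighbouring repeats can overlap each other or other repeats, so you may still want to filter them against the repeat annotation before trusting them as uniquely mappable.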

Also, it's not clear what regions are represented in G, but it looks like they identified MER50 regions that are shared between or specific to cell types. What's the enrichment in E and F?

As one more consideration, what if in G the signal over those hTSC-specific regions actually comes from just one highly enriched element, and then gets falsely distributed to all the other identical elements (there could be thousands!)?

ADD REPLY • link 14 months ago by rfran010 ★ 1.3k

From the figure captions, it is clear that they center their data on their enhancer regions:

(C) Alteration of H3K27ac and H3K9me3 occupancy between hTSC and STB for the STB enhancers overlapping the top 5 ERV families enriched surrounding STB-specific genes.

(G) Heatmaps showing the intensity of various histone modifications flanking different groups of MER50-associated enhancers.

ADD REPLY • link 14 months ago by rfran010 ★ 1.3k