Associate peak score to gene
9.2 years ago
RT ▴ 10

Hi all,

I need some suggestions on how to assign a score to the nearest gene from a peak score. I have ChIP-seq data for the elongating form of Pol2, and almost all the peaks (called with macs2) fall inside gene bodies. I use bedtools intersect to annotate the peaks to genes. However, I am not sure how to transfer the peak scores onto the genes.

We want to estimate the transcription rate from the peaks, which in our data will always be within the gene body due to Ser2/5 phosphorylation. Should I still use the distance to the peak center to calculate the gene-wise score? It would also be helpful if you could point me to any protocol papers.

Thanks,
Aarthi

Pol2 ChIP-Seq • 3.6k views
9.1 years ago
Fidel ★ 2.0k

If I understand correctly, you want to associate transcription rate with the read counts from a ChIP-seq of Ser5/2-phosphorylated Pol II. I think this will not give you the desired results because:

  1. Elongating Pol II does not move at a constant speed and may even stop at some positions, which in turn translates into higher read counts at those positions.
  2. Gene length inversely correlates with the amount of elongating Pol II: longer genes have, on average, less Pol II over the gene body.
  3. Regions where genes overlap carry a mixture of Pol II signals, one for each gene.
  4. Elongating Pol II creates broad peaks over the gene body that require a higher sequencing depth to identify accurately. MACS can miss many elongating Pol II broad peaks when few reads are used.

I don't think you need to call peaks in this case because, as you say, any enrichment will be found in gene bodies. Instead, you can compute the average Pol II signal (preferably the log ratio of ChIP vs. input) over the gene body of every annotated gene and try to cluster those values. At the very least, you should be able to distinguish active from inactive genes. To measure transcription rates, the commonly used method is GRO-seq.
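As a rough illustration of the gene-body averaging idea, here is a minimal sketch. It assumes you already have per-base coverage for the ChIP and the input as NumPy arrays for one chromosome; the function name `gene_body_scores`, the pseudocount, and the simple library-size scaling are my own choices for the example, not part of any particular tool.

```python
import numpy as np

def gene_body_scores(chip, control, genes, pseudocount=1.0):
    """Return {gene: log2(mean ChIP / mean input)} over each gene body.

    chip, control: 1-D per-base coverage arrays for one chromosome (assumed input).
    genes: iterable of (name, start, end), 0-based half-open coordinates.
    """
    chip = np.asarray(chip, dtype=float)
    control = np.asarray(control, dtype=float)
    # Scale the input to the ChIP total so sequencing depth does not dominate.
    scale = chip.sum() / control.sum()
    scores = {}
    for name, start, end in genes:
        # Using the mean (not the sum) makes the score independent of gene length.
        chip_mean = chip[start:end].mean()
        ctrl_mean = control[start:end].mean() * scale
        # The pseudocount avoids log(0) and damps ratios in low-coverage genes.
        scores[name] = float(np.log2((chip_mean + pseudocount) / (ctrl_mean + pseudocount)))
    return scores

# Toy data: gene A is enriched over input, gene B is not.
toy_chip = np.array([0, 0, 8, 8, 8, 8, 0, 0, 0, 0])
toy_input = np.array([4, 4, 4, 4, 4, 4, 2, 2, 2, 2])
toy_genes = [("A", 2, 6), ("B", 6, 10)]
toy_scores = gene_body_scores(toy_chip, toy_input, toy_genes)
```

The resulting per-gene values can then be clustered or simply thresholded to separate active from inactive genes. Overlapping genes would still share signal, so this does not address point 3 above.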


Thanks for the insights and ideas, Fidel. I will definitely try to make a heatmap instead of just profile plots from now on.

Most of our profile plots are biased towards the 3' end rather than the 5' end. I did try stratifying by gene length to see the difference, but in my case PolII is still preferentially higher at the 3' end regardless of gene length.
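One way to check whether this 3' bias holds gene by gene, rather than only in the averaged profile, is to compare the signal at the two ends of each gene in transcript orientation. This is only a sketch under assumptions: `three_prime_bias` is a hypothetical helper, the coverage is a per-base NumPy array, and the 25% window and the pseudocount are arbitrary choices.

```python
import numpy as np

def three_prime_bias(signal, start, end, strand, fraction=0.25, pseudocount=1.0):
    """log2 ratio of mean signal in the 3' window over the 5' window of a gene.

    signal: 1-D per-base coverage array for the chromosome (assumed input).
    start, end: 0-based half-open gene coordinates; strand is "+" or "-".
    """
    body = np.asarray(signal[start:end], dtype=float)
    if strand == "-":
        # Reverse minus-strand genes so index 0 is always the 5' end.
        body = body[::-1]
    n = max(1, int(len(body) * fraction))
    five_prime = body[:n].mean()
    three_prime = body[-n:].mean()
    # Positive values mean more signal towards the 3' end.
    return float(np.log2((three_prime + pseudocount) / (five_prime + pseudocount)))

# Toy coverage that rises along the plus strand.
toy_signal = np.array([1, 1, 1, 1, 7, 7, 7, 7])
plus_bias = three_prime_bias(toy_signal, 0, 8, "+")
minus_bias = three_prime_bias(toy_signal, 0, 8, "-")
```

Plotting the distribution of this value within each gene-length bin would show whether the bias comes from most genes or from a subset with strong 3' signal.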

We did use MACS2 (since none of the other broad peak callers were fruitful) for both narrow and broad peaks, and surprisingly we find a lot more narrow than broad peaks in this data. I am not sure if this is due to depth, but we have an average of 20X coverage for a 23 Mb genome.

Thanks,
Aarthi


Thanks Fidel!
