Question

CLIP enrichment analysis

0

4.2 years ago

noahhelton98 ▴ 80

Hi there, I am analyzing a dataset from another paper and am trying to recreate one of their methods for CLIP-seq. I have the BAM files, and in their methods they state:

'For each RNA, we enumerated 100 nucleotide windows across the entire RNA. For each window, we calculated the enrichment by computing the number of reads overlapping the window in the protein elution sample divided by the total number of reads within the protein elution sample. We normalized this ratio by the number of reads in the input sample divided by the total number of reads in the input sample.'

I have two questions about this:

First, when they say "the total number of reads within the protein elution sample", should I be taking the total number of reads in the entire BAM file, or just the reads within the RNA I am looking at?

Second, what methods are used for normalization of CLIP data? Is there an efficient way to do this in Python? I created my sliding window and plots of the reads in Python, so I would like to stick with it to try to recreate this completely. Unfortunately, the paper didn't link any GitHub repository or source code, so this is all I really have.

Thanks so much.

ChIP-Seq • 861 views

updated 4.2 years ago by i.sudbery 21k • written 4.2 years ago by noahhelton98 ▴ 80

score 1 · Answer 1 · 2021-03-03

4.2 years ago

i.sudbery 21k

It's not possible to tell from their description, but I would guess the normalization is applied separately for each RNA, i.e. the totals are the reads within that RNA rather than in the whole BAM file. That's what I have done when I've done iCLIP.

There is, as yet, no standard or consensus on the analysis of PAR-CLIP/iCLIP/eCLIP data, and I suspect the ideal approach depends on the situation and the goal of the analysis.

If it's any help, I have written a few Python functions for dealing with CLIP-like data. It's not the best documented code in the world, and it's not published, but if you find it helpful, you can look at it here: www.github.com/sudlab/iCLIPlib
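
For illustration, the calculation in the quoted methods could be sketched roughly as below. This is not taken from the paper or from iCLIPlib. The function names, the tiled (non-overlapping) 100 nt windows, the half-open read intervals and the toy reads are all assumptions made for the example. By default the totals are per RNA, following the guess above; passing `clip_total` and `input_total` explicitly would give the whole-library interpretation instead. In practice the read intervals would come from the BAM files (for example via pysam), which is left out here.

```python
def count_overlaps(reads, start, end):
    """Count reads, given as half-open (r_start, r_end) intervals,
    that overlap the half-open window [start, end)."""
    return sum(1 for r_start, r_end in reads if r_start < end and r_end > start)


def window_enrichment(clip_reads, input_reads, rna_length,
                      window=100, step=100,
                      clip_total=None, input_total=None):
    """Return a list of (start, end, enrichment) tuples for one RNA.

    enrichment = (CLIP reads in window / CLIP total)
               / (input reads in window / input total)

    ASSUMPTION: if no totals are given, they are the number of reads
    on this RNA (per-RNA normalization), not the whole library.
    """
    if clip_total is None:
        clip_total = len(clip_reads)
    if input_total is None:
        input_total = len(input_reads)

    results = []
    for start in range(0, rna_length, step):
        end = min(start + window, rna_length)
        clip_n = count_overlaps(clip_reads, start, end)
        input_n = count_overlaps(input_reads, start, end)
        if clip_total == 0 or input_total == 0 or input_n == 0:
            # Ratio is undefined without input coverage; report NaN
            # rather than silently inventing a value.
            enrichment = float("nan")
        else:
            enrichment = (clip_n / clip_total) / (input_n / input_total)
        results.append((start, end, enrichment))
    return results


# Toy data (hypothetical): read intervals on a 300 nt RNA.
clip = [(0, 50), (10, 60), (150, 200)]
inp = [(0, 50), (120, 170), (130, 180), (250, 300)]

for start, end, enrichment in window_enrichment(clip, inp, rna_length=300):
    print(start, end, round(enrichment, 3))
```

Windows with no input reads come out as NaN here. Whether to skip them, or to add a pseudocount, is a separate choice that the quoted methods do not specify. Setting `step` smaller than `window` gives sliding rather than tiled windows.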