Input vs IgG control for RIP-seq
4.5 years ago
bosiarek ▴ 10

Hi, I've never done any bioinformatic analysis before and this is my first attempt at analysing some RIP-seq data (just doing differential analysis using Salmon + tximport + DESeq).

I'm a bit confused about how to normalize my samples using input & IgG controls. I've asked two people that I know who have done similar experiments before and they have different suggestions:

1) Subtract input read counts from IP read counts before feeding them into DESeq & do the enrichment analysis of samples vs IgG

2) Subtract IgG read counts from IP samples and then compare with input to get the enrichment information

Right now I'm just analysing one set of data to check whether there is preferential binding between my protein of interest and a specific group of transcripts. In the future I would like to include a drug treatment as well, so I will then have 4 groups of samples (IP-Ctrl, IP-Treated, IP-IgG and inputs), and I'm even more confused about how to normalize the data properly if I want to understand the difference between Ctrl & Treated. (I was also told once that the IgG control is not really needed in that case, but I really want to make sure that I'm doing the analysis correctly.)

I would be very grateful for your advice!

RNA-Seq • 4.3k views

Thank you so much for your answer! I apologize for such a late reply, but I was finally able to go back to the lab, so the analysis part of the project got a bit delayed.

That's very helpful and confirms what I thought about possibly dropping the IgG when comparing my treated vs untreated samples! I assumed that any non-specific binding would be comparable across the samples so I thought it could be omitted from the analysis. However, I was told by a fellow DIY bioinformatician that I should still include it and that's where my confusion came from.

Once again, thank you for your input (pun intended).


Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. This comment belongs under @Ian's answer.

SUBMIT ANSWER is for new answers to original question.

4.5 years ago

Input and IgG control for different things in RIP-seq.

Input controls for how highly expressed each RNA is (you expect to see more signal from a highly expressed RNA than a lowly expressed one), as well as various sequencing biases (GC content, % uniqueness etc.), whereas IgG controls mainly for the specificity of your antibody and the effectiveness of the pull-down.

Subtraction on a linear scale is unlikely to be the correct answer as this sort of data is usually ratiometric in nature, so you want to be dividing (or subtracting on a log scale).
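To make the linear vs log point concrete, here is a minimal sketch in plain Python. The counts are invented for illustration and are assumed to be already normalized for library size and non-zero; the function names are hypothetical and this is not what DESeq does internally, just the arithmetic behind the argument.

```python
import math

# Hypothetical, library-size-normalized counts (made up for illustration).
# Both transcripts are enriched 4-fold in IP over input, but one is
# expressed 100x more highly than the other.
counts = {
    "high_expr": {"ip": 4000.0, "input": 1000.0},
    "low_expr": {"ip": 40.0, "input": 10.0},
}


def linear_difference(ip, inp):
    """Subtraction on the linear scale: dominated by expression level."""
    return ip - inp


def log2_enrichment(ip, inp):
    """Subtraction on the log scale, i.e. the log2 of the IP/input ratio."""
    return math.log2(ip) - math.log2(inp)


for name, c in counts.items():
    diff = linear_difference(c["ip"], c["input"])
    lfc = log2_enrichment(c["ip"], c["input"])
    print(f"{name}: linear difference = {diff}, log2 enrichment = {lfc:.2f}")
```

The linear differences come out 100-fold apart even though the enrichment is identical, while the log2 enrichment is the same for both transcripts.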

For most purposes input is probably the best control, but you will see some sequences that interact with IgG, and the suspicion will be that if those sequences also come down in your IP, it is because of non-specific effects.

For a simple analysis I would put all of the counts as raw counts into DESeq. I would then use DESeq to find transcripts where there is an enrichment in IP over input by effectively doing a DE analysis between IP and input. You could deal with the IgG in two ways: either do the same DE for IgG vs input and remove any genes that are significant in that analysis from the IP results, or go the whole hog and do a differences-in-differences test. I'm usually a fan of differences-in-differences, but given the likely big difference in read counts between IP and IgG, I think I'd go for the first option here.
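As a rough sketch of the first option (filter out anything that is also enriched in IgG over input), assuming you have exported the two DE result tables as gene -> (log2 fold change, adjusted p-value). The gene names, numbers, thresholds and the `enriched` helper are all hypothetical; pick cut-offs that suit your own data.

```python
# Hypothetical DE results: gene -> (log2 fold change vs input, adjusted p-value).
ip_vs_input = {
    "geneA": (2.1, 0.001),
    "geneB": (1.8, 0.010),
    "geneC": (0.2, 0.600),
    "geneD": (1.5, 0.200),
}
igg_vs_input = {
    "geneA": (0.1, 0.900),
    "geneB": (1.6, 0.020),
    "geneC": (0.0, 0.950),
    "geneD": (0.3, 0.700),
}


def enriched(results, min_lfc=1.0, alpha=0.05):
    """Return the genes significantly enriched over input."""
    return {
        gene
        for gene, (lfc, padj) in results.items()
        if lfc >= min_lfc and padj < alpha
    }


ip_hits = enriched(ip_vs_input)
igg_hits = enriched(igg_vs_input)

# Keep only IP hits that are not also pulled down by the IgG control.
specific_hits = ip_hits - igg_hits
print(sorted(specific_hits))
```

Here geneB passes the IP vs input test but is also enriched in IgG, so it is dropped as likely non-specific.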

When you are doing your Ctrl vs Treated analysis, I would go for differences-in-differences though. Because the non-specific binding should be the same in both treated and ctrl, rather than doing an IgG control here, I would do IP-ctrl, IP-treated, input-ctrl, input-treated.

I would then use DESeq to look at the interaction term (or code the contrast (IP.treated - input.treated) - (IP.ctrl - input.ctrl)). If you don't do this, you risk just identifying genes where the expression has increased and the binding of the protein has increased because of this, not because the protein actually binds tighter.
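The contrast above is just arithmetic on log ratios, which a small Python sketch can illustrate. The counts are invented, assumed normalized and non-zero, and the function names are hypothetical; in a real analysis the interaction term of a model with assay (IP/input) and condition (ctrl/treated) as factors estimates this quantity with proper statistics, which this toy example does not attempt.

```python
import math


def log2_ratio(ip, inp):
    """log2 enrichment of IP over input."""
    return math.log2(ip / inp)


def diff_in_diff(ip_treated, input_treated, ip_ctrl, input_ctrl):
    """(IP.treated - input.treated) - (IP.ctrl - input.ctrl) on the log2 scale."""
    return log2_ratio(ip_treated, input_treated) - log2_ratio(ip_ctrl, input_ctrl)


# Gene 1: expression doubles on treatment and the IP signal doubles with it.
# IP-treated vs IP-ctrl alone would call this a 2-fold "increase in binding".
expression_driven = diff_in_diff(
    ip_treated=400.0, input_treated=200.0, ip_ctrl=200.0, input_ctrl=100.0
)

# Gene 2: expression is unchanged but the IP signal doubles.
binding_driven = diff_in_diff(
    ip_treated=400.0, input_treated=100.0, ip_ctrl=200.0, input_ctrl=100.0
)

print(f"expression-driven gene: {expression_driven:.2f}")
print(f"binding-driven gene:    {binding_driven:.2f}")
```

Both genes show the same doubling of IP signal, but only the second has a non-zero differences-in-differences value, which is the case you actually care about.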


Thank you for your answer, it's been very useful. I was hoping to ask you for some additional help. I also have a RIP-seq experiment with three assays - IP, input and IgG non-specific antibody, and two conditions - A and B.

So far I have compared input-A vs IP-A (filtering q<0.05 and FC 1.2, let's call this real-IP-A) and input-A vs IgG-A (filtering q<0.05 and FC 1.2, let's call this real-IgG-A). Then I compare real-IP-A vs. real-IP-B (filtering q<0.05 and FC 1.5), and I compare real-IgG-A vs. real-IgG-B. Any genes in the IgG list I remove from the IP list (there's quite a few to remove).

Do you think this is the best way to handle the data, or should I be doing a differences-in-differences as you mentioned above? In my case, the non-specific binding is not the same in A and B. In B, I appear to pick up more non-specific RNA.

Also, do you think that a FC cut off of 1.2 is appropriate when comparing input vs. IP or would you go higher?

I really look forward to your reply, and appreciate your help in advance.

Best wishes,

Jo
