Question

Input vs IgG control for RIP-seq

1

4.5 years ago

bosiarek ▴ 10

Hi, I've never done any bioinformatic analysis before and this is my first attempt at analysing some RIP-seq data (just doing differential analysis using Salmon + tximport + DESeq).

I'm a bit confused about how to normalize my samples using input & IgG controls. I've asked two people that I know who have done similar experiments before and they have different suggestions:

1) Subtract input read counts from IP read counts before feeding them into DESeq, then do the enrichment analysis of samples vs IgG.

2) Subtract IgG read counts from IP samples, then compare with input to get the enrichment information.

Right now I'm analysing one set of data just to check whether there is preferential binding between my protein of interest and a specific group of transcripts. In the future I would like to include a drug treatment as well, so I will then have 4 groups of samples (IP-Ctrl, IP-Treated, IP-IgG and inputs), and I'm even more confused about how to normalize the data properly if I want to understand the difference between Ctrl & Treated. (I was also told once that the IgG control is not really needed in such a case, but I really want to make sure that I'm doing the analysis correctly.)

I would be very grateful for your advice!

RNA-Seq • 4.3k views

ADD COMMENT • link updated 2.2 years ago by jwil14 • 0 • written 4.5 years ago by bosiarek ▴ 10

0

Thank you so much for your answer! I apologize for such a late reply, but I was finally able to go back to the lab, so the analysis part of the project got a bit delayed.

That's very helpful and confirms what I thought about possibly dropping the IgG when comparing my treated vs untreated samples! I assumed that any non-specific binding would be comparable across the samples so I thought it could be omitted from the analysis. However, I was told by a fellow DIY bioinformatician that I should still include it and that's where my confusion came from.

Once again, thank you for your input (pun intended).

ADD REPLY • link 4.5 years ago by bosiarek ▴ 10

0

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. This comment belongs under @Ian's answer.

SUBMIT ANSWER is for new answers to original question.

ADD REPLY • link 4.5 years ago by GenoMax 148k

score 3 · Answer 1 · 2020-06-16

Input and IgG control for different things in RIP-seq.

Input controls for how highly expressed each RNA is (you expect to see more signal from a highly expressed RNA than from a lowly expressed one), as well as various sequencing biases (GC content, % uniqueness, etc.), whereas IgG controls mainly for the specificity of your antibody and the effectiveness of the pull-down.

Subtraction on a linear scale is unlikely to be the correct answer as this sort of data is usually ratiometric in nature, so you want to be dividing (or subtracting on a log scale).
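To illustrate why a ratio (or a log-scale difference) behaves better than linear subtraction, here is a minimal Python sketch. The counts are invented for illustration and are assumed to be already normalized for library size; `log2_enrichment` is a hypothetical helper, not part of DESeq or any other package.

```python
import math

def log2_enrichment(ip, inp, pseudocount=1.0):
    """log2(IP / input); the pseudocount guards against division by zero."""
    return math.log2((ip + pseudocount) / (inp + pseudocount))

# Hypothetical normalized counts: both transcripts are 4x enriched in the IP,
# but one is expressed 100x more highly than the other.
low = {"ip": 40.0, "input": 10.0}
high = {"ip": 4000.0, "input": 1000.0}

for name, t in (("low", low), ("high", high)):
    diff = t["ip"] - t["input"]  # linear subtraction
    lfc = log2_enrichment(t["ip"], t["input"], pseudocount=0.0)  # log ratio
    print(f"{name}: linear difference = {diff:.0f}, log2 ratio = {lfc:.2f}")
```

The linear difference scales with expression level (30 vs 3000), while the log2 ratio is the same for both transcripts, which is the sense in which the data are ratiometric.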

For most purposes input is probably the best control, but you will see some sequences that interact with IgG, and the suspicion will be that if those sequences also interact with your IP, it is because of non-specific effects.

For a simple analysis I would put all of the counts as raw counts into DESeq. I would then use DESeq to find transcripts where there is an enrichment in IP over input by effectively doing a DE analysis between IP and input. You could deal with the IgG in two ways: either do the same DE for IgG vs input and remove any genes found in that analysis from the IP results, or go whole hog and do a differences-in-differences test. I'm usually a fan of differences-in-differences, but given the likely big difference in read counts between IP and IgG, I think I'd go for the first option here.
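The first option (filtering the IP hits by the IgG hits) could be sketched roughly as below, assuming the two DE results have been exported as transcript -> (log2 fold change, adjusted p-value). The transcript names, values, cutoffs and the `significant` helper are all hypothetical.

```python
def significant(results, padj_cutoff=0.05, lfc_cutoff=0.0):
    """Return the transcripts significantly enriched over input.

    `results` maps transcript ID -> (log2FoldChange, padj); a padj of None
    stands in for an NA value and is treated as not significant.
    """
    return {
        tx
        for tx, (lfc, padj) in results.items()
        if padj is not None and padj < padj_cutoff and lfc > lfc_cutoff
    }

# Hypothetical results from two separate comparisons against input.
ip_vs_input = {"txA": (2.5, 0.001), "txB": (1.8, 0.01), "txC": (0.2, 0.60)}
igg_vs_input = {"txA": (0.1, 0.80), "txB": (1.5, 0.02), "txC": (0.0, None)}

# Keep IP-enriched transcripts that are not also enriched in the IgG pull-down.
specific = significant(ip_vs_input) - significant(igg_vs_input)
print(sorted(specific))
```

Here `txB` is dropped because it is also enriched in IgG over input, which suggests non-specific binding.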

When you are doing your Ctrl vs Treated analysis, I would go for differences-in-differences though. Because the non-specific binding should be the same in both treated and ctrl, rather than including an IgG control here, I would use IP-ctrl, IP-treated, input-ctrl and input-treated.

I would then use DESeq to look at the interaction term (or code the contrast (IP.treated - input.treated) - (IP.ctrl - input.ctrl)). If you don't do this, you risk just identifying genes whose expression has increased, so that more of the RNA is pulled down simply because there is more of it, not because the protein actually binds more tightly.
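The arithmetic behind the contrast above can be sketched in a few lines of Python. This only illustrates the log-scale differences-in-differences idea on invented, already normalized counts; it is not how DESeq fits the model, and the helper names are hypothetical. In DESeq2 itself this would typically be expressed through a design with an interaction term, something like `~ assay + condition + assay:condition`, where the column names `assay` and `condition` are assumptions.

```python
import math

def log2_ratio(a, b, pseudocount=1.0):
    """log2(a / b) with a pseudocount to avoid division by zero."""
    return math.log2((a + pseudocount) / (b + pseudocount))

def diff_in_diff(c):
    """(IP.treated - input.treated) - (IP.ctrl - input.ctrl) on the log2 scale."""
    treated = log2_ratio(c["IP.treated"], c["input.treated"])
    ctrl = log2_ratio(c["IP.ctrl"], c["input.ctrl"])
    return treated - ctrl

def naive_ip_change(c):
    """IP.treated vs IP.ctrl only, ignoring the input samples."""
    return log2_ratio(c["IP.treated"], c["IP.ctrl"])

# Hypothetical gene 1: expression doubles on treatment, binding is unchanged.
expression_only = {"IP.ctrl": 400, "input.ctrl": 100,
                   "IP.treated": 800, "input.treated": 200}
# Hypothetical gene 2: expression is unchanged, binding doubles.
binding_change = {"IP.ctrl": 400, "input.ctrl": 100,
                  "IP.treated": 800, "input.treated": 100}

for name, c in (("expression_only", expression_only),
                ("binding_change", binding_change)):
    print(f"{name}: naive IP change = {naive_ip_change(c):.2f}, "
          f"diff-in-diff = {diff_in_diff(c):.2f}")
```

Comparing IP samples alone gives roughly the same change for both genes, whereas the differences-in-differences value is close to zero for the gene that only changed in expression and close to one for the gene whose binding actually changed.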