Question

What Could Be The Reason For Spliced Alignments In ChIP-Seq Data?

11.9 years ago

Mikael Huss 4.8k

I am looking at a ChIP-seq data set where, for one of the suspected target genes, we see a coverage profile that looks suspiciously like RNA-seq data: the reads line up very regularly along the exons instead of forming the peaky profile one would expect in ChIP-seq. On further inspection with TopHat, we also find a handful of spliced alignments joining the same two exons in the gene. (We had initially used a different aligner; the TopHat run was only to check for the potential artifact I am describing.)

Now, I have heard of genomic DNA contamination in RNA-seq libraries, but I have a harder time figuring out how one can get RNA (or rather cDNA, I suppose) contamination in a ChIP-seq library. Any ideas where this might come from?

chip-seq splicing • 3.7k views

updated 3.5 years ago by Ram 45k • written 11.9 years ago by Mikael Huss 4.8k

I have had the same problem, but predominantly in the input rather than the ChIP-seq data. I have been told that the Taq polymerase used for deep-sequencing library preparation may be able to synthesize a small amount of DNA from an RNA template, and that RNase treatment of the ChIP input DNA is therefore needed. We haven't yet tested whether this is the case.

11.0 years ago by diane.krause ▴ 10

Interesting, thanks for the comment!

11.0 years ago by Mikael Huss 4.8k

Do you have control channel data? What do these regions look like in those experiments? There are a fair number of edge cases where repetitive sequences might generate such patterns, or nonspecific binding over an interval could occur.

The splice junctions are more interesting / worrying, but maybe you'd start thinking about viral integration events or other transposon-like events. It's not clear what would cause the ChIP enrichment though, at least to me.

11.9 years ago by matted 7.8k

There are IgG controls, but I haven't looked at these regions in them yet. Thanks for the suggestion. Yes, I was considering viral integration events, but I am not sure what conclusions to draw from that.

11.9 years ago by Mikael Huss 4.8k

Did you ever manage to figure out a solution to this? I see very similar behaviour in the Arabidopsis ChIP-seq data that I am currently looking at. The genes that show it are transcription factors with known important functions in the tissue we are studying.

I see this in the sample and the anti-HA control, but not in the Input. The rows in the image are sample, Input, and anti-HA.

I'm also noticing that the reads in these regions don't seem to carry the SNPs that are present in the Input.

updated 3.5 years ago by Ram 45k • written 10.9 years ago by simon.pearce ▴ 20

Not really. We have just assumed that we are dealing with some sort of artifact and disregarded this particular locus. In the meantime, I have read this paper, which might be relevant: "Highly expressed loci are vulnerable to misleading ChIP localization of multiple unrelated proteins". I don't think that would explain your "missing SNPs", though. That is an interesting observation, and one I did not notice in my data (though I can't say whether it is there or not).

10.9 years ago by Mikael Huss 4.8k

score 0 · Answer 1 · 2013-06-18

11.9 years ago

black_hoodies • 0

I don't know what you mean by the spliced alignments joining the same two exons in the gene. However, have you considered that the "regular" alignments might in fact be PCR duplicates?

11.9 years ago by black_hoodies • 0

I don't think PCR duplication is the problem, as the picture is close to identical after deduplication. It also wouldn't explain the split-read alignments, which, by the way, are not PCR duplicates either: they have distinct starting positions, although the spliced-out part (i.e. the intron) is the same in each case.
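For anyone wanting to run the same check on their own data, here is a minimal sketch (not the procedure actually used for this data set). It assumes the alignments are available as SAM text, for example from `samtools view`, and the function names are hypothetical. It groups spliced reads by the intron they skip (the `N` operation in the CIGAR string) and collects the distinct leftmost start positions per junction. It ignores strand and mate information, so it is only a rough stand-in for a proper duplicate check.

```python
import re
from collections import defaultdict

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def introns_from_sam_line(line):
    """Return (chrom, start, introns) for one SAM record (hypothetical helper).

    Coordinates are 1-based; each intron is (first_base, last_base), inclusive.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, cigar = fields[2], int(fields[3]), fields[5]
    introns = []
    ref = pos  # next reference position to be consumed
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":  # skipped region, i.e. a spliced-out intron
            introns.append((ref, ref + length - 1))
        if op in "MDN=X":  # operations that consume the reference
            ref += length
    return chrom, pos, introns


def starts_per_junction(sam_lines):
    """Map (chrom, intron_start, intron_end) -> set of read start positions."""
    junctions = defaultdict(set)
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        chrom, pos, introns = introns_from_sam_line(line)
        for intron in introns:
            junctions[(chrom, *intron)].add(pos)
    return junctions


# Toy records (made up for illustration, not real data).
toy_records = [
    "r1\t0\tchr1\t100\t50\t20M500N30M\t*\t0\t0\t*\t*",
    "r2\t0\tchr1\t105\t50\t15M500N35M\t*\t0\t0\t*\t*",
    "r3\t0\tchr1\t110\t50\t10M500N40M\t*\t0\t0\t*\t*",
    "r4\t0\tchr1\t300\t50\t50M\t*\t0\t0\t*\t*",
]

for junction, starts in starts_per_junction(toy_records).items():
    print(junction, sorted(starts))
```

A junction supported by several distinct start positions, as in the toy records, is the pattern described above: the same intron is spliced out, but the reads are not copies of one another.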

11.9 years ago by Mikael Huss 4.8k