5.4 years ago
zeeshan.fazal1986
Hi, I have H3K27 ChIP-seq data for our cancer cell lines. I am using Bowtie2 to align to the human genome. My FastQC results look very good. I used the global end-to-end alignment mode, but unique alignments are around 55% in the IP samples and 69% in the Input samples. I reran Bowtie2 in local mode with the sensitive and very-sensitive presets, but unique alignments dropped from 55% to 44% and multimappers increased.
Is there any way to increase the number of unique alignments by changing some parameters, or is 55% unique alignments fine for the H3K27 mark?
Thanks
It is typically not the protein you ChIP that determines the mapping percentage. It could be contamination or, much more likely, insufficient read length. How long are your reads? If they are shorter than 50 bp, you might give bowtie (aka bowtie1) or bwa aln a try.
Read length is 100 bp.
Did you try blasting a couple of unmapped reads?
How can I do this? I have never done it before. I have aligned .sam files. Should I extract the unaligned reads from the aligned .sam file and BLAST them?
Yes, manually blasting some random reads might give you an idea if contamination is the source of the low(er) alignment rate. Still, I would probably not be overly worried about the mapping result. Better continue with quality control by calling peaks and then inspect the samples on a genome browser. See if you got good separation of peaks from background.
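For anyone unsure how to pull reads out for BLAST, here is a minimal sketch, assuming a plain-text SAM file in which unmapped reads carry the standard 0x4 flag bit. The function name `sample_unmapped_as_fasta` and its parameters are made up for illustration. It picks a few random unmapped reads and formats them as FASTA, which can be pasted into the BLAST web form.

```python
import random

UNMAPPED_FLAG = 0x4  # SAM flag bit: read is unmapped


def sample_unmapped_as_fasta(sam_lines, n=10, seed=0):
    """Return up to n randomly chosen unmapped reads as a FASTA string.

    sam_lines: any iterable of SAM text lines (e.g. an open file handle).
    n and seed are illustrative parameters, not part of any tool.
    """
    unmapped = []
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        name, flag, seq = fields[0], int(fields[1]), fields[9]
        if flag & UNMAPPED_FLAG and seq != "*":
            unmapped.append((name, seq))
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    picked = rng.sample(unmapped, min(n, len(unmapped)))
    return "".join(f">{name}\n{seq}\n" for name, seq in picked)


# Tiny in-memory example; with a real file, pass open("aligned.sam") instead
# (the file name is hypothetical).
demo = [
    "@HD\tVN:1.0\n",
    "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGT\tIIIIIIII\n",
    "read2\t0\tchr1\t100\t42\t8M\t*\t0\t0\tTTTTCCCC\tIIIIIIII\n",
]
print(sample_unmapped_as_fasta(demo, n=5), end="")
```

A handful of reads is usually enough to see whether the hits point to another organism, adapters, or nothing at all.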
When you say "unique", you mean uniquely mapping as opposed to multi-mapping, correct? If that's the case, your 55-69% sounds pretty reasonable. What's your total percentage of mapped reads? There's no reason to exclude multi-mappers from the analysis.
On the other hand, if you're talking about redundant reads (reads that are exactly identical), then you would want to think about optimizing the wet lab procedures: fragmentation time, different antibodies, etc. But again, 55-69% of reads remaining after collapsing redundant reads sounds reasonable to me.
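To make the unique vs. multi-mapping vs. total-mapped distinction concrete, here is a rough sketch that tallies the three classes from a single-end Bowtie2 SAM file. It relies on a common heuristic, stated here as an assumption: a mapped read is treated as multi-mapping if its record carries an `XS:i:` tag (Bowtie2 writes that tag when it found a second alignment for the read). The function names are made up for illustration.

```python
from collections import Counter

UNMAPPED = 0x4
SECONDARY_OR_SUPPLEMENTARY = 0x100 | 0x800


def tally_mapping_classes(sam_lines):
    """Count unmapped, unique, and multi-mapping reads in SAM text lines.

    Heuristic (assumption): a mapped read with an XS:i: tag is 'multi',
    a mapped read without one is 'unique'.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # header
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue
        flag = int(fields[1])
        if flag & SECONDARY_OR_SUPPLEMENTARY:
            continue  # count each read once, via its primary record
        if flag & UNMAPPED:
            counts["unmapped"] += 1
        elif any(tag.startswith("XS:i:") for tag in fields[11:]):
            counts["multi"] += 1
        else:
            counts["unique"] += 1
    return counts


def as_percentages(counts):
    """Convert raw counts to percentages of all counted reads."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {k: round(100.0 * v / total, 1) for k, v in counts.items()}
```

The overall alignment rate is then the unique and multi percentages added together. For paired-end data this counts each mate separately, so the numbers will not match Bowtie2's pair-level summary exactly.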
Yes, unique means mapped only once, as opposed to multimappers, which map more than once. In the IP samples, overall alignment is >75%, and in the Input samples it is >90%.
Why should I not remove multimappers?
Thanks
Why would you remove them? Unless you're following a specific pipeline or protocol that recommends removing them there's no need to. It's a lot of information that you're losing by artificially lowering your sequencing depth.
I've done side by side comparisons for RNA-seq with only uniquely mapping reads and with all mapping reads and have produced almost identical results. If you're concerned I'd recommend doing the same comparison to your ChIP experiment.
It is very standard practice to remove multi-mapped reads in ChIP-seq analysis. The ENCODE and Roadmap pipelines do it, for example.
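In practice, removing multi-mappers is often done with a mapping-quality cutoff rather than by inspecting tags. Below is a minimal sketch of that idea on SAM text lines. The function name and the default threshold of 30 are assumptions for illustration, not values taken from this thread, so check the documentation of whichever pipeline you follow.

```python
def filter_by_mapq(sam_lines, min_mapq=30):
    """Yield header lines plus mapped records with MAPQ >= min_mapq.

    min_mapq=30 is an illustrative default; low MAPQ values are what
    aligners assign to reads that fit several places about equally well.
    """
    for line in sam_lines:
        if line.startswith("@"):  # keep the header intact
            yield line
            continue
        fields = line.split("\t")
        if len(fields) < 11:
            continue
        flag, mapq = int(fields[1]), int(fields[4])
        if flag & 0x4:  # drop unmapped reads
            continue
        if mapq >= min_mapq:
            yield line


demo = [
    "@HD\tVN:1.0\n",
    "read1\t0\tchr1\t100\t42\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII\n",
    "read2\t0\tchr1\t200\t1\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII\n",
]
kept = list(filter_by_mapq(demo))
print(len(kept))
```

Running the same downstream peak calling on the filtered and unfiltered files is one way to do the side-by-side comparison suggested above.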
H3K27 is not a mark, H3K27me3 and H3K27ac are marks, and have very different properties. So you should be specific about what your ChIP-seq is targeting.
Sorry, it's H3K27me3.