Question

Remove uncharacterized chromosomes before alignment in ChIP-seq

11 months ago

Chironex ▴ 50

Hi, I have a question. I am processing a ChIP-seq experiment on the mm10 genome. I did quality checks, trimming, alignment, and duplicate removal. The "problem" is that I did not remove the uncharacterized chromosomes from the reference FASTA genome; I was planning to remove them after peak calling. The question is: should I repeat the analysis after removing them from the reference FASTA file used as input to bowtie2, or can I move forward with the analysis because it doesn't affect the results much? What do you think?

fasta Bowtie2 • 803 views

updated 11 months ago by ATpoint 86k • written 11 months ago by Chironex ▴ 50

It's fine, and actually better, not to remove them before alignment. Reads can genuinely originate from these contigs, so removing them from the reference takes away the true origin of those reads and can force them to align spuriously to other contigs. You can remove them from the called peaks, or from the BAM files before calling peaks. Either is fine.
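For the "remove from called peaks" option, here is a minimal sketch that keeps only peaks on canonical chromosomes from BED-like lines (for example a narrowPeak file). It assumes UCSC-style mm10 names (chr1 to chr19, chrX, chrY) and drops chrM; the function name `filter_peaks` and the pattern are illustrative, not part of any tool, so adjust them to your reference.

```python
import re

# Assumption: UCSC-style mm10 names; chrM, *_random and chrUn_* are excluded.
CANONICAL = re.compile(r"^chr(?:[1-9]|1[0-9]|X|Y)$")


def filter_peaks(bed_lines):
    """Return the BED-like lines whose first column is a canonical chromosome."""
    kept = []
    for line in bed_lines:
        # Skip blank lines and optional track/browser/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom = line.split("\t", 1)[0]
        if CANONICAL.match(chrom):
            kept.append(line)
    return kept


peaks = [
    "chr1\t100\t200\tpeak_1\n",
    "chr1_GL456210_random\t10\t90\tpeak_2\n",
    "chrUn_GL456239\t5\t50\tpeak_3\n",
    "chrX\t300\t450\tpeak_4\n",
]
for line in filter_peaks(peaks):
    print(line, end="")
```

Filtering at the peak level leaves the alignments untouched, so the reads still had the chance to map to their true contig.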

11 months ago by ATpoint 86k

Thank you very much! I was concerned about losing reads that can map to both a canonical and a non-canonical chromosome, since they will be discarded in downstream analysis (Picard) because bowtie2 flags them as multi-mapped. So I could potentially lose some reads (maybe not a significant number). But, as you say, those reads may be ambiguous anyway. I tried removing the non-canonical chromosomes from the BAM, but then the read1 and read2 counts reported by 'samtools flagstat' are no longer identical. (I would also like to understand whether that is normal. I suppose it is, because the two mates of a pair do not always both fall on non-canonical chromosomes.) For this reason I planned to remove them after peak calling, with the blacklist regions, but people suggest doing it before, so I am a 'little' confused about the best and right way to do it.
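The read1/read2 mismatch is what you would expect when records are filtered one by one on their own chromosome: if one mate sits on chr2 and the other on an unplaced contig, only one of them survives. Below is a rough sketch of a pair-aware filter on SAM text lines that keeps a record only when both its own reference (RNAME, column 3) and its mate's reference (RNEXT, column 7, where `=` means "same as RNAME") are canonical. The name `keep_pair_aware` and the chromosome pattern are assumptions for illustration, and header lines are passed through unchanged.

```python
import re

# Assumption: UCSC-style mm10 names; chrM, *_random and chrUn_* are excluded.
CANONICAL = re.compile(r"^chr(?:[1-9]|1[0-9]|X|Y)$")


def keep_pair_aware(sam_lines):
    """Yield header lines, plus records whose read and mate are both canonical."""
    for line in sam_lines:
        if line.startswith("@"):  # header lines are passed through as-is
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        rname, rnext = fields[2], fields[6]
        # In SAM, RNEXT "=" means the mate is on the same reference as the read.
        mate_ref = rname if rnext == "=" else rnext
        if CANONICAL.match(rname) and CANONICAL.match(mate_ref):
            yield line


records = [
    "r1\t99\tchr1\t100\t42\t50M\t=\t300\t250\t*\t*\n",
    "r1\t147\tchr1\t300\t42\t50M\t=\t100\t-250\t*\t*\n",
    "r2\t97\tchr2\t100\t42\t50M\tchrUn_GL456239\t50\t0\t*\t*\n",
    "r2\t145\tchrUn_GL456239\t50\t42\t50M\tchr2\t100\t0\t*\t*\n",
]
kept = list(keep_pair_aware(records))
print(len(kept))  # both r1 mates are kept, both r2 mates are dropped
```

Because a pair is either kept whole or dropped whole, the read1 and read2 counts stay equal after filtering. The sketch could be fed the text output of `samtools view -h`.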

11 months ago by Chironex ▴ 50

I was concerned about losing reads that can map to both a canonical and a non-canonical chromosome, since they will be discarded in downstream analysis (Picard) because bowtie2 flags them as multi-mapped

That's a multimapper, and multimappers are usually discarded. I don't see a problem with that.
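In its default mode bowtie2 reports one alignment per read and signals ambiguity through a low mapping quality, so discarding multimappers usually comes down to a MAPQ cutoff (column 5 of a SAM record). A minimal sketch on SAM text lines follows; the function name `filter_by_mapq` and the default threshold of 10 are assumptions for illustration, not a recommended value.

```python
def filter_by_mapq(sam_lines, min_mapq=10):
    """Yield header lines, plus records with MAPQ >= min_mapq (hypothetical cutoff)."""
    for line in sam_lines:
        if line.startswith("@"):  # keep header lines untouched
            yield line
            continue
        mapq = int(line.split("\t")[4])  # MAPQ is the fifth SAM column
        if mapq >= min_mapq:
            yield line


records = [
    "r1\t0\tchr1\t100\t42\t50M\t*\t0\t0\t*\t*\n",
    "r2\t0\tchr1_GL456210_random\t10\t1\t50M\t*\t0\t0\t*\t*\n",
]
for line in filter_by_mapq(records):
    print(line.split("\t")[0])
```

The same cutoff is available directly as `samtools view -q`. For paired-end data the mates are scored separately, so a per-record cutoff can again leave read1 and read2 counts unequal.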

11 months ago by ATpoint 86k