Question

Paired-end samples fail to map to the genome

4.8 years ago

SayHey • 0

Hey there,

I'm trying to map paired-end samples to the genome. In paired-end mode they barely map, but when I map the same reads in single-end mode they align fine.

Both approaches produce BAM files. Paired-end mode gives a very low mapping rate, whereas single-end mode gives a very high one, approximately 97%.

I use bowtie2 for mapping.

What is the problem?

ChIP-Seq • 1.2k views

ADD COMMENT • link 4.8 years ago by SayHey • 0

These are the arguments I used

SE:

opts='-p 10 --very-sensitive --end-to-end --no-unal --phred33 --trim3 8 -I 10 -X 700'

bowtie2 $opts -U ${inp} -x mm10 | grep -v 'chrEBV' | grep -v 'chrM' | grep -v 'chrUn_' | grep -v 'random' | samtools view -@ 10 -b > ${out}.bam

PE:

opts='-p 10 --very-sensitive --end-to-end --no-discordant --phred33 --no-mixed -I 10 -X 700'

bowtie2 $opts -1 $p1 -2 $p2 -x mm10 | grep -v 'chrM' | grep -v 'chrUn_' | grep -v 'random' | samtools view -@ 10 -b -f 2 -q 2 >${out}.bam

Bowtie2 version 2.4.1

ADD REPLY • link updated 4.8 years ago by rpolicastro 13k • written 4.8 years ago by SayHey • 0

The only immediate reason I can see for PE doing worse than SE is library overamplification. If you run too many PCR cycles, at some point the primers are used up and the amplicons start annealing to each other, causing what is called a "PCR bubble". You would see that as a "shoulder" or "bump" in a Bioanalyzer trace during library QC, that is, a second peak about twice the size of the main PCR product. Do you have access to the wet-lab QC data? I would remove the --no-unal option and all the other flags you used and then look at the raw alignment output. Check whether PE alignment gives you lots of discordant alignments, which might indicate the overamplification I mentioned. That should manifest as lots of alignments with R1 on one chromosome and R2 on another or, if they are on the same chromosome, very large insert sizes (basically random across the chromosome). I do not have a reference on whether this phenomenon is realistic; I am just thinking aloud. Others might have better ideas.
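To make the "look at the raw alignment output" step concrete, here is a minimal sketch that tallies pair categories from unfiltered SAM on stdin. The function name `pair_summary` is made up for illustration, and it only relies on the standard SAM FLAG bits and the RNEXT column, so it approximates rather than reproduces the concordant/discordant summary that bowtie2 itself prints to stderr.

```shell
# Tally paired-end alignment categories from SAM records on stdin.
# Hypothetical helper; counts each pair once, via its read 1 record.
# Possible usage (paths and variables as in the question):
#   bowtie2 -p 10 --very-sensitive -I 10 -X 700 -x mm10 -1 "$p1" -2 "$p2" | pair_summary
pair_summary() {
    awk -F'\t' '
        # Return 1 if the FLAG bit with the given value is set (POSIX awk has no and()).
        function bit(flag, value) { return int(flag / value) % 2 }
        /^@/ { next }                                   # skip header lines
        bit($2, 256) || bit($2, 2048) { next }          # skip secondary/supplementary records
        !bit($2, 64) { next }                           # keep read 1 only
        { total++ }
        bit($2, 4) || bit($2, 8) { unaligned++; next }  # this read or its mate is unaligned
        bit($2, 2) { proper++; next }                   # flagged as a proper pair
        $7 != "=" { interchrom++; next }                # mates on different chromosomes
        { other++ }                                     # same chromosome, but not a proper pair
        END {
            printf "pairs=%d proper=%d inter_chrom=%d same_chrom_not_proper=%d unaligned_mate=%d\n",
                   total, proper, interchrom, other, unaligned
        }
    '
}
```

A large `inter_chrom` or `same_chrom_not_proper` share would point towards the discordant-pair explanation; a large `unaligned_mate` share would point elsewhere, for example at read order or adapters.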

Edit: Ok, maybe I am overcomplicating. Check read order first :-P

ADD REPLY • link 4.8 years ago by ATpoint 90k

Do you know the fragment size distribution of your library? -X 700 caps the insert size, and if most of your fragments are >700 nt this could explain the lack of mapping. -I is probably fine. Also, why do you trim in the single-end alignment command but not in the paired-end one? If you have adapters you need to trim them, and that could also explain things.
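If no Bioanalyzer trace is at hand, the fragment sizes can be estimated from the alignments themselves. The sketch below is only an illustration (the function name `tlen_over` is hypothetical): it reads SAM on stdin and reports how many same-chromosome pairs have a template length (TLEN, column 9) above a limit. It assumes the alignment was run without --no-discordant and without the `-f 2` filter, or with a much larger -X, because otherwise the long fragments have already been discarded.

```shell
# Count same-chromosome pairs whose template length exceeds a limit.
# Hypothetical helper; the limit defaults to 700 to match the -X value in the question.
# Possible usage:
#   bowtie2 -p 10 -X 2000 -x mm10 -1 "$p1" -2 "$p2" | tlen_over 700
tlen_over() {
    awk -F'\t' -v limit="${1:-700}" '
        /^@/ { next }                     # skip header lines
        # Keep one record per pair: the leftmost mate carries the positive TLEN.
        $7 != "=" || $9 <= 0 { next }
        { n++; if ($9 > limit) over++ }
        END { printf "pairs=%d over_%d=%d\n", n, limit, over }
    '
}
```

If `over_700` is a large fraction of `pairs`, raising -X is worth trying before anything else.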

ADD REPLY • link 4.8 years ago by benformatics 4.2k

I wonder if your reads have been mis-sorted, so that they are not in the same order in the read 1 FASTQ as they are in the read 2 FASTQ?
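A quick way to test this is to compare the read names record by record. The following is a rough sketch, not a validated tool: the function name `check_pair_order` is made up, it assumes uncompressed four-line FASTQ records, and it assumes mate names differ at most by a trailing /1 or /2 or by the comment after the first space.

```shell
# Compare read names of two FASTQ files record by record.
# Prints the number of mismatching records; exit status is non-zero if any mismatch.
# Hypothetical helper; for gzipped input, decompress first.
# Possible usage: check_pair_order "$p1" "$p2"
check_pair_order() {
    awk -v r2="$2" '
        # Strip the comment after the first whitespace and a trailing /1 or /2.
        function clean(h) { sub(/[ \t].*/, "", h); sub(/\/[12]$/, "", h); return h }
        NR % 4 == 1 {
            # Read the matching header from the R2 file; count it as bad if R2 ran out.
            if ((getline other < r2) <= 0) { bad++; next }
            # Consume the sequence, separator and quality lines of the R2 record.
            getline skip < r2; getline skip < r2; getline skip < r2
            if (clean($0) != clean(other)) bad++
        }
        END {
            # Any records left over in R2 have no partner in R1.
            while ((getline other < r2) > 0) extra++
            bad += int((extra + 3) / 4)
            print bad + 0
            exit (bad > 0)
        }
    ' "$1"
}
```

A result of 0 means the two files are in the same order; anything else suggests re-pairing the files before aligning again.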

ADD REPLY • link 4.8 years ago by i.sudbery 22k