About the reads that aligned 0 times (around 10%) in Bowtie2
7.4 years ago

Hi, everyone. I have some questions about the reads that aligned 0 times in Bowtie2 (about 10% in my experiment). What are these reads, really? Why can some reads not be aligned? Do these reads affect the height of the peaks we get in the downstream analysis? Furthermore, I have learned that SNP calling sometimes uses Bowtie2 as well. Will these unaligned reads affect the SNP calling, perhaps by losing some information? I would really appreciate an answer.

ChIP-Seq SNP • 2.3k views
7.4 years ago

These unaligned reads could be:

  • low quality reads (perhaps even containing Ns)
  • lab contamination from some other species (bacteria or fungi living on your species, the human preparing your samples, or carry-over from previous runs because the sequencing machine wasn't cleaned properly, which is what I've seen), paper
  • contamination from the sample prep (reagents), paper
  • stuff not present in the reference genome
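The first item in the list above can be checked with a rough sketch like the following. It assumes the unaligned reads have been written to a FASTQ file (Bowtie2's `--un` option does this for unpaired reads) and that qualities are Phred+33 encoded. The function names and the thresholds (`max_n_fraction`, `min_mean_q`) are illustrative assumptions, not recommended values.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) from well-formed 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # skip the '+' separator line
        qual = next(it).rstrip("\n")
        yield header[1:], seq, qual


def mean_phred(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 assumed)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def flag_low_quality(
    lines: Iterable[str],
    max_n_fraction: float = 0.1,  # hypothetical threshold
    min_mean_q: float = 20.0,     # hypothetical threshold
) -> List[str]:
    """Return names of reads with many Ns or a low mean base quality."""
    flagged = []
    for name, seq, qual in read_fastq(lines):
        n_fraction = seq.upper().count("N") / len(seq) if seq else 1.0
        if n_fraction > max_n_fraction or mean_phred(qual) < min_mean_q:
            flagged.append(name)
    return flagged
```

Reads that pass this filter but still do not align are better candidates for the contamination or missing-reference explanations.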

If you're bored you can run metagenomics software like Kraken or MEGAN to see where your unaligned reads come from.

What do you mean by peak height in the following analysis? Which analysis, and which peaks?

These reads shouldn't have an effect on SNP calling, as the software just analyses the alignments. The unaligned reads could harbour some SNPs if they are from your species, so you could assemble your unaligned reads and see whether you can find SNPs, but is the extra work worth it?
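To make the point about alignments concrete, here is a minimal sketch that splits SAM records by the "segment unmapped" bit (0x4) of the FLAG field. It works on plain-text SAM lines only. The function name is made up for illustration, and real callers apply further filters (mapping quality, duplicates) that are not shown.

```python
from typing import Iterable, List, Tuple

UNMAPPED_FLAG = 0x4  # SAM FLAG bit: segment unmapped


def split_by_mapping(sam_lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (mapped_read_names, unmapped_read_names) from SAM text lines."""
    mapped, unmapped = [], []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & UNMAPPED_FLAG:
            unmapped.append(fields[0])
        else:
            mapped.append(fields[0])
    return mapped, unmapped
```

Only the records in the first list have a reference position, so only those can contribute to a pileup or to a peak.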


To figure out where unexpected reads come from, this approach might be interesting: Read Origin Protocol. The paper has an amazing title: "Dumpster diving in RNA-sequencing to find the source of every last read", although to my disappointment the more recent version of that preprint has a different title.


A great answer, much appreciated. I am doing ChIP-seq analysis these days, and those unaligned reads do bother me, since I don't quite understand why they can't be aligned and whether they may carry some information, such as a peak's location or height.


In WGS, you always get 5-15% unaligned reads, so I usually don't worry too much there. Not sure about the numbers for ChIP-seq...
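To compare your own run against a range like that, a small sketch such as the one below can pull the unaligned percentage out of the summary Bowtie2 prints to stderr. It assumes the summary ends with the usual "overall alignment rate" line. The function name is hypothetical.

```python
import re


def unaligned_percent(summary: str) -> float:
    """Return 100 minus the overall alignment rate from a Bowtie2 summary."""
    match = re.search(r"([\d.]+)% overall alignment rate", summary)
    if match is None:
        raise ValueError("no 'overall alignment rate' line found")
    return round(100.0 - float(match.group(1)), 2)
```

For a single-end run this should agree with the "aligned 0 times" line of the same summary.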
