Bowtie2 Reads that aligned concordantly exactly 1 time
2
1
Entering edit mode
9.6 years ago
AW ▴ 350

Hi,

I have mapped some Illumina reads onto my reference genome using Bowtie2. I only specified --nounal and left all other parameters at default.

111247311 reads; of these:
  111247311 (100.00%) were paired; of these:
    19249548 (17.30%) aligned concordantly 0 times
    81185571 (72.98%) aligned concordantly exactly 1 time
    10812192 (9.72%) aligned concordantly >1 times

I wish to extract the 81185571 reads that aligned concordantly 1 time.

I can extract concordant reads with samtools view -f 0x2, but I wonder how Bowtie2 defines reads that map concordantly exactly 1 time?

Can I extract these using the absence of XS:i or is it based on a MAPQ score threshold?

Thanks!

Bowtie2 SAMtools • 6.7k views
ADD COMMENT
0
Entering edit mode
ADD COMMENT
0
Entering edit mode

Thanks, that's a very helpful link!

My question was slightly different though. I want to know how Bowtie2 defines reads that aligned concordantly exactly 1 time?

Does it use the absence of XS:i or is it based on a MAPQ score threshold?

Thanks

ADD REPLY
0
Entering edit mode
9.6 years ago

I vaguely recall that it looks for alignments where AS:i == XS:i, which will have a MAPQ of 0 or 1. Note that an alignment can also have a MAPQ of 1 if this is not the case, so you can't just samtools view -cq 2 blah.bam and get the same number.

ADD COMMENT
0
Entering edit mode

Thanks for your answer.

I have tried the following to get the 81185571 read pairs that aligned concordantly 1 time.

1. Extract concordant pairs where both reads map only once (i.e. unireads). This is based on absence of XS:i: flag.

Number of pairs remaining (absence of XS) = 74507750

2. Extract concordant pairs where AS:i == XS:i for both reads.

Number of pairs remaining (AS != XS) = 88621585

3. Extract concordant pairs where AS:i > XS:i for both reads.

Number of pairs remaining (AS > XS) = 83088150

Still cannot get the same number reported?

Any other suggestions would be greatly appreciated!

ADD REPLY
1
Entering edit mode

You'll have to go through the source code then. There's no documentation on this.

ADD REPLY
0
Entering edit mode

Yes, I found that the aln_sink.cpp file contains the instructions for printing number of concordant reads etc and that repetitive mapping appears to be specified by the pairMax value, but I have no idea where or how the pairMax value is set.

Thanks for your help!

ADD REPLY
0
Entering edit mode

Welcome to the joys of figuring out undocumented parts of packages. At some point you have to ask yourself if getting the answer is really worth the hassle.

ADD REPLY

Login before adding your answer.

Traffic: 2572 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6