What are the semantics of containing bwa split read alignments?
7.5 years ago
d-cameron ★ 2.9k

I'm parsing the output of bwa (0.7.15) and I'm getting some split read alignments that seem very strange. For example, take the following redacted record:

ReadName 2115 chrUn_JTFH01001478v1_decoy 12 1 250M chr21 10325621 0 * * NM:i:13 SA:Z:chr21,10325808,-,240M10S,9,19;

The read is aligned to both chr21 and the decoy contig. What I find strange is that although the read is reported as a split alignment with two SAM records, the entire read aligns to the supplementary alignment position.
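For reference, the flag value of 2115 on the record above can be decoded with a short script. This is only a sketch: the bit names in `SAM_FLAGS` are my own labels for the flag bits listed in the SAM specification, and `decode_flag` is a hypothetical helper, not part of bwa or any library.

```python
# SAM flag bits as listed in the SAM specification.
# The string labels are illustrative names chosen for this sketch.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM flag, lowest bit first."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


print(decode_flag(2115))
```

The supplementary bit (0x800) is set and the reverse-strand bit (0x10) is not, so this record is the forward-strand supplementary line of the split alignment.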

What is the meaning of such a record? How can a genuinely split read contain an alignment that is essentially not split? Is this a bug in bwa? An artefact of alt contig mapping?

Edit: the SAM specifications have the following definition for the SA tag:

SA:Z:(rname,pos,strand,CIGAR,mapQ,NM;)+ Other canonical alignments in a chimeric alignment, formatted as a semicolon-delimited list. Each element in the list represents a part of the chimeric alignment. Conventionally, at a supplementary line, the first element points to the primary line.
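To make that definition concrete, here is a minimal sketch of splitting an SA value into its fields. `SAEntry` and `parse_sa` are hypothetical names for illustration, and the sketch assumes reference names contain no commas or semicolons.

```python
from typing import List, NamedTuple


class SAEntry(NamedTuple):
    rname: str
    pos: int
    strand: str
    cigar: str
    mapq: int
    nm: int


def parse_sa(value: str) -> List[SAEntry]:
    """Parse the value of an SA tag (the part after 'SA:Z:')."""
    entries = []
    for element in value.split(";"):
        if not element:
            continue  # the list ends with a trailing ';'
        rname, pos, strand, cigar, mapq, nm = element.split(",")
        entries.append(SAEntry(rname, int(pos), strand, cigar, int(mapq), int(nm)))
    return entries


# SA value taken from the record in the question
for entry in parse_sa("chr21,10325808,-,240M10S,9,19;"):
    print(entry)
```

For the record in the question this yields a single entry: the primary line on chr21, reverse strand, CIGAR 240M10S, mapq 9, NM 19.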

bwa SA split read • 2.6k views
7.5 years ago

IMHO this is another alignment with a poor mapping quality and a large number of mismatches (NM=13). I would imagine that the primary alignment has a better mapping quality and a lower number of mismatches.


The primary alignment is given in the SA tag: 19 mismatches (10 bases soft clipped), nominal mapq of 9.

The issue is that bwa is reporting it as a split read alignment using the SA tag, not as an alternate alignment using the XA tag. There are also 6 alternate alignments in the XA tag, which I removed for clarity: they are different alignment possibilities and do not form part of a single split alignment.

One would expect split reads to have CIGARs something like "25M75S" on one record, and "25S75M" on the other because the aligner is reporting a split read alignment. In this case, the aligner is reporting a split read in which one of the alignments is not split at all.
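The comparison can be checked by converting each CIGAR into the interval of the read it covers, in the original read orientation. This is a rough sketch: `read_interval` is a hypothetical helper, and it assumes that the CIGAR of a reverse-strand alignment must be reversed to get back to read coordinates.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def read_interval(cigar, strand):
    """Return the half-open [start, end) interval of read bases aligned,
    in the orientation the read was sequenced in."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    if strand == "-":
        ops = ops[::-1]  # reverse-strand CIGARs run from the read's end
    start = 0
    for n, op in ops:
        if op in "SH":
            start += n  # leading clipped bases are not aligned
        else:
            break
    # M, I, = and X consume read bases; D, N and P do not
    aligned = sum(n for n, op in ops if op in "MI=X")
    return start, start + aligned


# The expected shape of a split read: two complementary pieces
print(read_interval("25M75S", "+"), read_interval("25S75M", "+"))
# The record in question: supplementary line vs. primary line from the SA tag
print(read_interval("250M", "+"), read_interval("240M10S", "-"))
```

With these assumptions the expected pair covers [0, 25) and [25, 100), whereas the record here covers [0, 250) and [10, 250): the supplementary alignment contains the whole of the primary one.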
