TLEN sam format
6.3 years ago
hpapoli ▴ 150

Hi,

My question is about the TLEN field in the SAM format. Here is an example of the first 9 columns of a BAM file. Is TLEN (320) calculated as follows: (351 + 62) - 93 = 320?

SRR1144953.182159       117     NW_008793873.1  2       0       *       =       2       0       
SRR1144953.182159       185     NW_008793873.1  2       60      75M     =       2       0       
SRR1144953.8051227      117     NW_008793873.1  14      0       *       =       14      0       
SRR1144953.8051227      185     NW_008793873.1  14      60      75M     =       14      0       
SRR1144953.1496220      163     NW_008793873.1  93      60      75M     =       351     320     
SRR1144953.1496220      83      NW_008793873.1  351     60      62M13S  =       93      -320

Why is TLEN zero for the second and fourth lines?

Thanks!

samtools alignment • 4.0k views

Also, this document is very helpful: http://samtools.github.io/hts-specs/SAMv1.pdf
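To make the arithmetic in the question concrete, here is a minimal Python sketch of the TLEN definition as I read it in the SAM specification: the number of bases from the leftmost mapped base to the rightmost mapped base, positive for the leftmost segment and negative for the rightmost. The function names `reference_end` and `observed_tlen` are my own, not part of samtools, and it assumes both mates are mapped to the same reference. Some aligners compute TLEN slightly differently, so treat this as an illustration rather than a reference implementation.

```python
import re

# CIGAR operations that consume reference bases (M, D, N, =, X).
REF_CONSUMING = set("MDN=X")


def reference_end(pos, cigar):
    """Return the 1-based position of the last reference base covered."""
    span = sum(
        int(length)
        for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)
        if op in REF_CONSUMING
    )
    return pos + span - 1


def observed_tlen(pos, cigar, mate_pos, mate_cigar):
    """Signed TLEN for the read at `pos` (hypothetical helper).

    Assumes both mates are mapped to the same reference sequence.
    """
    start = min(pos, mate_pos)
    end = max(reference_end(pos, cigar), reference_end(mate_pos, mate_cigar))
    length = end - start + 1
    # Leftmost segment gets a plus sign, rightmost a minus sign.
    # When both start at the same position the sign is arbitrary; we use plus.
    return length if pos <= mate_pos else -length


# The proper pair from the question: 75M at 93 and 62M13S at 351.
print(observed_tlen(93, "75M", 351, "62M13S"))   # 320
print(observed_tlen(351, "62M13S", 93, "75M"))   # -320
```

The soft-clipped bases (13S) do not consume the reference, so the rightmost mapped base is 351 + 62 - 1 = 412, and 412 - 93 + 1 = 320, which matches (351 + 62) - 93.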

6.3 years ago
Amitm ★ 2.3k

Hi, you can check the meaning of the flags (column 2) here: https://broadinstitute.github.io/picard/explain-flags.html

That's paired-end data, and there are 3 pairs of IDs in column 1. The first two pairs have their R1 unmapped (flag 117 has the "read unmapped" bit set), and hence the TLEN value is 0. The third pair has flag values 163 and 83, which indicate reads mapped in a proper pair.
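If you would rather decode the flags yourself than use the web page, a small sketch like the following works. The bit values are the ones defined in the SAM specification; the short labels and the `decode_flag` name are just my own choices for illustration.

```python
# SAM FLAG bits as defined in the SAM specification; labels are informal.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


for flag in (117, 185, 163, 83):
    print(flag, decode_flag(flag))
# 117 ['paired', 'unmapped', 'reverse', 'mate_reverse', 'first_in_pair']
# 185 ['paired', 'mate_unmapped', 'reverse', 'mate_reverse', 'second_in_pair']
# 163 ['paired', 'proper_pair', 'mate_reverse', 'second_in_pair']
# 83 ['paired', 'proper_pair', 'reverse', 'first_in_pair']
```

So 117 is an unmapped R1 and 185 is its mapped R2 with the "mate unmapped" bit set. With only one mate mapped there is no template span to measure, which is why TLEN is 0 for those records.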


This is very clear, thank you! One more question: does a given pair always appear together? That is, in the example above we have 3 pairs, each sharing one QNAME. However, the next 3 lines of the BAM are the following:

SRR1144953.1448091      99      NW_008793874.1  31      60      75M     =       315     359     
SRR1144953.21253738     99      NW_008793874.1  74      60      75M     =       366     367     
SRR1144953.10190936     99      NW_008793874.1  86      60      75M     =       361     350

From flag 99, I see that the read is paired and mapped in a proper pair. However, why don't I see two records with the same QNAME, as above? Sorry if my questions are too simple, but I couldn't figure out these details from the document.


I think I see the pattern. The file is sorted by position coordinate, so other reads come in between; I find their mates in the following lines, as shown below:

SRR1144953.1448091      99      NW_008793874.1  31      60      75M     =       315     359     
SRR1144953.21253738     99      NW_008793874.1  74      60      75M     =       366     367     
SRR1144953.10190936     99      NW_008793874.1  86      60      75M     =       361     350     
SRR1144953.4004472      69      NW_008793874.1  230     0       *       =       230     0       
SRR1144953.4004472      137     NW_008793874.1  230     60      75M     =       230     0       
SRR1144953.3440089      69      NW_008793874.1  241     0       *       =       241     0       
SRR1144953.3440089      137     NW_008793874.1  241     60      75M     =       241     0       
SRR1144953.1448091      147     NW_008793874.1  315     60      75M     =       31      -359    
SRR1144953.10190936     147     NW_008793874.1  361     60      75M     =       86      -350    
SRR1144953.21253738     147     NW_008793874.1  366     60      75M     =       74      -367

Hi, some of the parameters of the aligner being used can alter the way the alignments are reported. One is, of course, coordinate-sorted output, as you noted. Others that come to mind are: 1) if secondary alignments (or all valid alignments) are being reported, you might see one unique read ID more than twice; 2) if only mapped reads have been reported and only one mate of a read pair was mapped, you would see only one entry for the ID (instead of two).

The flag values should be helpful in inferring what is going on.
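As a rough illustration of finding mates in a coordinate-sorted file, the sketch below groups records by QNAME and skips secondary and supplementary alignments (flag bits 0x100 and 0x800) so that each ID keeps at most its two primary records. The `records` list holds only the first four fields (QNAME, FLAG, RNAME, POS) of some lines quoted earlier in this thread, and `group_by_qname` is a hypothetical helper, not a samtools function.

```python
from collections import defaultdict

# (QNAME, FLAG, RNAME, POS) for records quoted earlier in this thread.
records = [
    ("SRR1144953.1448091", 99, "NW_008793874.1", 31),
    ("SRR1144953.21253738", 99, "NW_008793874.1", 74),
    ("SRR1144953.10190936", 99, "NW_008793874.1", 86),
    ("SRR1144953.1448091", 147, "NW_008793874.1", 315),
    ("SRR1144953.10190936", 147, "NW_008793874.1", 361),
    ("SRR1144953.21253738", 147, "NW_008793874.1", 366),
]

SECONDARY_OR_SUPPLEMENTARY = 0x100 | 0x800


def group_by_qname(records):
    """Collect primary records per read ID, preserving file order."""
    groups = defaultdict(list)
    for rec in records:
        qname, flag = rec[0], rec[1]
        if flag & SECONDARY_OR_SUPPLEMENTARY:
            continue  # keep only primary alignments
        groups[qname].append(rec)
    return dict(groups)


pairs = group_by_qname(records)
for qname, recs in pairs.items():
    print(qname, [(flag, pos) for _, flag, _, pos in recs])
# SRR1144953.1448091 [(99, 31), (147, 315)]
# SRR1144953.21253738 [(99, 74), (147, 366)]
# SRR1144953.10190936 [(99, 86), (147, 361)]
```

For a real BAM this keeps every read in memory, so for large files it may be simpler to sort by read name first and read the mates as adjacent records.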
