Get length of the alignment from bam file with shell
0
0
Entering edit mode
22 months ago
octpus616 ▴ 120

HI,

I want to get following fields from a bam file:

  • Read_length
  • Alignment_length
  • Edit_distance

I known I can get this fields from pysam, But I want to do this step with shell, I have the following scheme

  • Read_length use awk calculate length of reads
  • Alignment_length use awk calculate MD fields?
  • Edit_distance get from NM fields

I'm not sure if my understanding of Alignment length is correct, how should I get this information correctly?

for example, if I have a MD field:

10A3T0T10

So is the alignment length is

10+1(A)+3+1(T)+0+1(T)+10

but if we have

10A^3T0T10

here is also alignment length is 10+1(A)+3+1(T)+0+1(T)+10 or 10+1(A)+1(^)+3+1(T)+0+1(T)+10, The read.query_alignment_length in pysam output as same as previous

update:

Can I use read.query_alignment_length in pysam to get alignment length?

NGS sam pysam samtools bam • 1.5k views
ADD COMMENT
0
Entering edit mode

I don't think you can get these fields from a bam with the shell. Bam files are in binary format and therefore not readable for humans :D

ADD REPLY
1
Entering edit mode

I think may samtools view can easily convert bam to readable sam format to stdout and use pipeline to awk or samtools is ok

ADD REPLY
0
Entering edit mode

True you can just use the --header-only flag and then you have all the info you need in a human readable format.

ADD REPLY
0
Entering edit mode

why reinventing the wheel with bash when you have numerous API (pysam, htsjdk, htslib... ) available.

ADD REPLY
0
Entering edit mode

Because I find a little difference between pysam read.query_alignment_length and I calculate from MD filed, see my updated case for 10A^3T0T10

ADD REPLY
0
Entering edit mode

MD is not needed to calculate length of alignment, only CIGAR (which tells you length of the aligned segment for example)

ADD REPLY

Login before adding your answer.

Traffic: 2531 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6