Question

Extract base information from aligned reads in a sam file

0

Entering edit mode

4.3 years ago

yliueagle ▴ 290

I generated a sam (bam) file by aligning fastq to reference using BWA. Now I would like to retrieve the the basepair at a given chromosome position, for example, chr12:11010100:11010100, for all reads aligned to this region, how can I do that? Thanks! (My expected workflow would be: (1) extract reads from sam that cover the querying position; (2) get the basepair from these reads at this position)

sam basepair reads • 1.7k views

ADD COMMENT • link 4.2 years ago by yliueagle ▴ 290

0

Entering edit mode

Look into using igvtools which is part of IGV. I assume you are looking to get counts of basepairs (alleles) at a particular location.

ADD REPLY • link 4.3 years ago by GenoMax 147k

0

Entering edit mode

Not only counts, but the basepair at that position for each read, with read name included in the output. (Of course the counts can be generated from this table as a downstream step)

ADD REPLY • link 4.3 years ago by yliueagle ▴ 290

1

Entering edit mode

Due to the fact that NGS data is very 'messy' (désordonnée), every base position will have multiple bases. You may want to output the frequencies of these via: A: Calculate The Frequency Of Nucleotides At Each Position In An Mpileup File

ADD REPLY • link 4.3 years ago by Kevin Blighe 88k

0

Entering edit mode

Is this good enough? samtools view <your bam> chr:start-end > reads.sam

ADD REPLY • link 4.3 years ago by JC 13k

0

Entering edit mode

That will get the reads but not specific base at that position. Not sure what is the best way to do that.

ADD REPLY • link 4.3 years ago by GenoMax 147k

0

Entering edit mode

Thank you. This do obtain the reads but I also need to retrieve the basepair information from reads.sam at chr:start-end

ADD REPLY • link 4.3 years ago by yliueagle ▴ 290

score 2 · Accepted Answer · 2020-08-23

2

Entering edit mode

4.2 years ago

yliueagle ▴ 290

I found http://lindenb.github.io/jvarkit/Sam2Tsv.html can do the job. Thanks all!

ADD COMMENT • link 4.2 years ago by yliueagle ▴ 290