Entering edit mode
4.3 years ago
yliueagle
▴
290
I generated a sam (bam) file by aligning fastq to reference using BWA. Now I would like to retrieve the the basepair at a given chromosome position, for example, chr12:11010100:11010100, for all reads aligned to this region, how can I do that? Thanks! (My expected workflow would be: (1) extract reads from sam that cover the querying position; (2) get the basepair from these reads at this position)
Look into using
igvtools
which is part of IGV. I assume you are looking to get counts of basepairs (alleles) at a particular location.Not only counts, but the basepair at that position for each read, with read name included in the output. (Of course the counts can be generated from this table as a downstream step)
Due to the fact that NGS data is very 'messy' (désordonnée), every base position will have multiple bases. You may want to output the frequencies of these via: A: Calculate The Frequency Of Nucleotides At Each Position In An Mpileup File
Is this good enough?
samtools view <your bam> chr:start-end > reads.sam
That will get the reads but not specific base at that position. Not sure what is the best way to do that.
Thank you. This do obtain the reads but I also need to retrieve the basepair information from reads.sam at chr:start-end