I used samtools and varscan for variant calling like this:
samtools mpileup \
-f genome.fa \
tku.sorted.rmdup.bam \
| java -jar VarScan.v2.3.5.jar pileup2snp \
--min-coverage 8 \
--min-reads2 2 \
--min-avg-qual 15 \
--min-var-freq 0.01 \
--p-value 0.001 \
>tku.snp.vcf
now I want to retrieve all the reads supporting a specific SNP/indel and calculate the distribution of SNP/indel on the sequenced reads, is there any tools or proper way to achieve this?
I always first check what is actually happening with IGV (http://www.broadinstitute.org/igv/). Here you can nicely see the mismatches/SNPs with the reference genome and get a general ID if the SNP is real.
Later you can extract the reads with this command: samtools view BAMFILE.BAM chr1:1000000-2000000 > NEWFILE.bam
Thanks for your advice! But it is still difficult to acquire the coordinate of a SNP on reads, what's more, there are thousands of SNPs to process. So, an automatic workflow is prefered.
i use python and pysam to read and manipulate reads but it requires some programming skills http://www.cgat.org/~andreas/documentation/pysam/api.html