9.2 years ago
bdeonovic
I have read alignments in SAM format, the reads as FASTQ files, the genome as FASTA, and a VCF file with the locations of SNPs. Is there an easy way to pull out the sequenced nucleotide of each read at the SNP positions?
I'm not interested in calling SNPs (that is already done; I have my VCF), but I would like to know which nucleotide each read has at particular SNPs.
The following posts are relevant. The first post also includes Python code that works with the pileup format.
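SAM stores the read sequence for each alignment as well as the alignment position, so you can calculate the offset into the read that corresponds to your SNP's location and look at the base there. The offset has to take the CIGAR string into account, since soft clips, insertions, and deletions shift read coordinates relative to the reference.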
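As a rough illustration of that offset calculation, here is a minimal pure-Python sketch. The function name `base_at_ref_pos` and its arguments are hypothetical, not from any library; it assumes you have already parsed POS, CIGAR, and SEQ out of a SAM line (columns 4, 6, and 10) and that the SNP position is 1-based, as in VCF.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def base_at_ref_pos(pos, cigar, seq, snp_pos):
    """Return the read base aligned to reference position snp_pos.

    pos     -- 1-based leftmost mapping position (SAM POS field)
    cigar   -- CIGAR string (SAM CIGAR field)
    seq     -- read sequence as stored in the SAM record (SEQ field)
    snp_pos -- 1-based reference position of the SNP (VCF POS)

    Returns None if the read does not cover the position, or covers it
    only with a deletion or skipped region.
    """
    ref = pos   # reference coordinate of the next aligned base
    idx = 0     # 0-based offset into seq
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "M=X":            # consumes both reference and read
            if ref <= snp_pos < ref + length:
                return seq[idx + (snp_pos - ref)]
            ref += length
            idx += length
        elif op in "IS":           # consumes read only
            idx += length
        elif op in "DN":           # consumes reference only
            if ref <= snp_pos < ref + length:
                return None        # no read base at this position
            ref += length
        # H and P consume neither the read nor the reference
    return None

# Made-up example record: POS=100, 2 soft-clipped bases, 1 insertion, 2-base deletion
print(base_at_ref_pos(100, "2S3M1I2M2D3M", "NNACGTTAGGC", 103))
```

For a real SAM line, something like `fields = line.rstrip("\n").split("\t")` followed by `base_at_ref_pos(int(fields[3]), fields[5], fields[9], snp_pos)` should work for mapped reads on the same chromosome as the SNP. SEQ in SAM is already given on the forward reference strand, so no reverse complementing is needed; unmapped reads (CIGAR `*`) simply yield None.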
I am trying to do the same. Did you find a solution?