Hi
I am relatively new to NGS analysis. We have SNP results from RNA-seq data, called with Samtools. I am trying to identify the reads that do and don't support a SNP at a given locus. I can get the number of reads supporting the SNP or the reference allele, but I don't know if there's a way to get the exact IDs of those reads. I was able to extract the reads spanning a SNP, but I don't know how to tell which of them carry the SNP and which don't (assuming a heterozygous site).
Any help would be really appreciated.
Thanks.
Just to make sure I understand: do you want the reads that do not overlap any SNP?
Personally, I don't think there is a direct way to get the read IDs of the reads supporting a SNP if you are using Samtools for SNP calling. The read ID information is lost when the BAM is converted to mpileup format. If you have a very small number of SNPs, you can try doing it manually: first extract the reads overlapping the given SNP from the BAM, then align them against the corresponding region of the reference genome. Otherwise, you can write some code of your own that does this automatically. An alternative would be to look for a variant caller that reports the read IDs of the supporting reads.
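As a rough idea of what "some code of your own" could look like, here is a minimal sketch in plain Python. It assumes the overlapping reads have already been extracted as SAM text (for example with `samtools view` on the SNP region) and walks each read's CIGAR string to find the base aligned to the SNP position. The function names `base_at` and `group_reads_by_base` are made up for this illustration, and the sketch does not filter duplicates, secondary alignments, or low-quality bases, so treat it as a starting point rather than a finished tool.

```python
import re
from collections import defaultdict

# One CIGAR element: a length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def base_at(pos, cigar, seq, ref_pos):
    """Return the read base aligned to 1-based reference position ref_pos.

    pos is the 1-based leftmost aligned position from the SAM record.
    Returns None if the read does not cover ref_pos with an aligned base
    (e.g. the position falls in a deletion or a spliced-out intron).
    """
    read_i = 0   # 0-based index into the read sequence
    ref_i = pos  # reference position at the start of the current element
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=X":            # consumes both read and reference
            if ref_i <= ref_pos < ref_i + length:
                return seq[read_i + (ref_pos - ref_i)]
            read_i += length
            ref_i += length
        elif op in "IS":           # consumes read only
            read_i += length
        elif op in "DN":           # consumes reference only
            ref_i += length
        # H and P consume neither read nor reference
    return None


def group_reads_by_base(sam_lines, chrom, ref_pos):
    """Map each observed base at chrom:ref_pos to the list of read IDs showing it."""
    groups = defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):   # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:       # not a complete SAM record
            continue
        qname, flag, rname = fields[0], int(fields[1]), fields[2]
        pos, cigar, seq = int(fields[3]), fields[5], fields[9]
        # Skip unmapped reads, other chromosomes, and records without CIGAR/SEQ.
        if flag & 0x4 or rname != chrom or cigar == "*" or seq == "*":
            continue
        base = base_at(pos, cigar, seq, ref_pos)
        if base is not None:
            groups[base.upper()].append(qname)
    return dict(groups)
```

To use it, you could pipe the SAM records for the SNP region into a small script that calls `group_reads_by_base(sys.stdin, chrom, position)`; the reads listed under the alternate base support the SNP and those under the reference base do not. Because SAM stores the read sequence on the reference strand, the bases can be compared to the reference and alternate alleles directly.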