Question

Identifying Reads That Do And Don'T Have Snp At Any Given Loci

0

Entering edit mode

11.6 years ago

pm2013 ▴ 50

Hi

I am relatively new to NGS analysis. We have SNP results from RNA-seq data using Samtools. I am trying to identify the reads that do and don't support a SNP at any given loci. I am able to get the number of reads supporting the SNP or ref but I don't know if there's a way to identify the exact ID of reads. I was able to get the reads spanning a SNP but don't know how to identify which have the SNP and which don't (considering a heterozygous situation).

Any help would be really appreciated.

Thanks.

rna-seq snp ngs samtools • 3.1k views

ADD COMMENT • link 11.6 years ago by pm2013 ▴ 50

0

Entering edit mode

I am trying to understand, you want the reads that do not overlap any SNP ?

ADD REPLY • link 11.6 years ago by Gabriel R. ★ 2.9k

0

Entering edit mode

Personally, I don't think there is a direct way to get the read id information of the reads supporting a SNP if you are using Samtools for SNP calling. The read ID information is lost when BAM is converted to mpileup format. If you have very small number of SNPs then you can try doing it manually by first extracting reads overlapping the given SNP from the BAM and then perform sequence alignment with the corresponding region of the reference genome OR you can write some code of your own that will do it automatically. Alternative would be looking for a variant caller that provides the Read IDs of the supporting reads.

ADD REPLY • link 11.6 years ago by Ashutosh Pandey 12k

score 0 · Answer 1 · 2013-04-27

You should be able to query for reads in the BAM that overlap with the snp you are interested, after that you have the positions where the reads mapped and compute where in the reads falls the putative snp, if the base at that position on the read is the wild type then you save the read, if not you continue onto the next read.

You can use bamtools to code this up:

https://github.com/pezmaster31/bamtools

score 0 · Answer 2 · 2013-04-29

0

Entering edit mode

11.6 years ago

pm2013 ▴ 50

Thanks all for your replies. Do you know of any variant caller that retains the read id info?

ADD COMMENT • link 11.6 years ago by pm2013 ▴ 50

0

Entering edit mode

At least in my knowledge none of the variant caller provides the read ids of the reads supporting a variant. I think it would be fun to write a small script as Raygozak suggested. It wont be that difficult I think.

ADD REPLY • link 11.6 years ago by Ashutosh Pandey 12k

0

Entering edit mode

Q&A site is different from forums. If you have new questions, please modify your original question or add a comment. Please don't create a new answer in this case.

ADD REPLY • link 11.6 years ago by lh3 33k

score 0 · Answer 3 · 2013-04-29

0

Entering edit mode

11.6 years ago

pm2013 ▴ 50

Thanks for your reply. I think I do have to resort to writing a script eventually. I am a biologist in training and my bioinfo/scripting is a bit rusty. I guess its time to polish those skills. I had thought of a similar approach as Raygozak had suggested. Will try to see if I can get it to work. Thanks a bunch guys!

ADD COMMENT • link 11.6 years ago by pm2013 ▴ 50

0

Entering edit mode

Sure let me know in case you need any advise. Make sure that you compare the complementary bases when you are looking into a read that aligns on the reverse strand.

ADD REPLY • link 11.6 years ago by Ashutosh Pandey 12k