I have bam files from single cell sequencing of a patient. I want to see if using the rna sequencing data I can see if that patient has the TREM2 R47H mutation. Is there a way to get this from the bam file, and if so, how?. Appreciate any help!
I have bam files from single cell sequencing of a patient. I want to see if using the rna sequencing data I can see if that patient has the TREM2 R47H mutation. Is there a way to get this from the bam file, and if so, how?. Appreciate any help!
I have a tutorial for this on 10X data using vartrix posted here that more or less does what you want. Note that position of the mutation in the gene matters (5' vs 3' bias), as does expression. Still, assuming the gene is expressed at a reasonable level, you can usually find evidence of the mutation at a population level, e.g:
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
If you have a bam file containing sequence reads derived from patient RNA, you could query if any reads cross the locus of the mutation using something like: samtools view in.bam chr3:1500-1510 and if you have reads across the region you could pile them up to see the sequence using samtools mpileup -f genome.fasta in.bam -r chr3:1500-1510 but given that it's RNA you'd have to do some deciphering.
Thanks for the reply. When I run this command, samtools view in.bam chr3:1500-1510, it does not output anything. Am I doing something wrong. Or could it be that there are no reads in that region?
I hope you realize that chr3:1500-1510 are simply example coordinates that I made up, and that you are actually using the coordinates of your known mutation. The samtools view command, with region coordinates, extracts reads that overlap the region. If there are no reads covering the region, the command will return no output (which would match your second guess). To test this, you should make sure you are able to return some reads from a region which you know to be covered (maybe the 3' end of a highly expressed gene). If you are able to return reads from a positive control region, but not the region containing your mutation, I would interpret that as no reads covering your region of interest. By the way, can't you simply look at the BAM file in IGV to see if your region is covered, and what the sequence of the reads is?
Thank you for your reply. And I did figure out that that example sequence you gave did not have any reads. Checked other known loci and got reads mapped to them. Also, did think of viewing it IGV, but I wanted to look at the exact sequence of the reads in that region to see if it had the mutation a snp in the sequence at the position the mutation is found. Your advice was really helpful and helped me solve this issue. Truly appreciate the help.