If a deleterious variant, such as a stop gain, leads to degradation of the mRNA via nonsense-mediated decay, would it be impossible to call this variant using RNA-Seq data?
Related to this, would the quality of variants called from lowly expressed genes be lower than that of variants called from highly expressed genes? This seems logical, but as I have never performed variant calling on RNA-Seq data, it may not be the case in practice.
I was just thinking about why we bother doing exome sequencing at all if we can call variants from RNA-Seq data, since from RNA-Seq, we also get expression and splicing information.
Thanks.
Yes, as Devon rightly pointed out, calling variants from RNA-Seq is not ideal; it mainly makes sense when you have no genetic data for your samples. In that case you can do it, but splicing and coverage are the bottleneck for the variants that can be called from RNA-Seq, so you end up with false positive variants. The GATK pipeline with STAR 2-pass alignment makes the calls somewhat less spurious, although there are still many false positive and false negative calls, and you still end up with large call sets. If you are interested in specific variants, you can load the final realigned/recalibrated BAM that was passed to HaplotypeCaller, or the final VCF file, in any genome browser to see whether they truly exist. You can also score them with tools that assign a functional/structural consequence to each variant and rank them by impact. Having said that, the problem of low coverage will still persist.
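To make the coverage point concrete, here is a minimal sketch of flagging low-depth calls in a VCF before looking at them in a browser. It assumes the caller wrote a `DP` value in the INFO column; the function names, the toy records, and the `min_depth=10` cutoff are all made up for illustration and are not a recommended threshold.

```python
def parse_info(info):
    """Turn a VCF INFO string such as 'DP=12;DB' into a dict of strings."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value  # flag-only entries (no '=') get an empty string
    return fields


def flag_low_depth(vcf_lines, min_depth=10):
    """Yield (chrom, pos, ref, alt, depth, passes) for each VCF record.

    min_depth is an arbitrary illustrative cutoff, not a recommendation.
    Records without a DP value are treated as depth 0.
    """
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        cols = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt, info = cols[0], int(cols[1]), cols[3], cols[4], cols[7]
        dp = parse_info(info).get("DP")
        depth = int(dp) if dp else 0
        yield chrom, pos, ref, alt, depth, depth >= min_depth


# Toy records, invented for this example only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t50\t.\tDP=35",
    "chr1\t2000\t.\tC\tT\t22\t.\tDP=4",
]

for record in flag_low_depth(example):
    print(record)
```

Calls that fail the depth check are not necessarily wrong, but they are the ones most worth inspecting by eye in the BAM, or confirming with DNA data if you have any.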
Thanks Devon. I just needed some confirmation.
It does get interesting, though, if you can compare variant calls from exome sequencing and RNA-Seq.
Here is a case report comparing them in tumor:
http://dx.doi.org/10.1016/j.ymeth.2015.04.016
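If you do have both call sets for the same sample, a rough way to compare them is to key each variant by chromosome, position, reference allele and alternate allele, then take set intersections and differences. The sketch below assumes plain-text VCF lines with consistent chromosome naming and variant representation in both files; the function names and toy records are hypothetical, and it is not the method used in the linked paper.

```python
def variant_keys(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys from VCF lines, one key per ALT allele."""
    keys = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        cols = line.rstrip("\n").split("\t")
        for alt in cols[4].split(","):  # multi-allelic sites list ALTs comma-separated
            keys.add((cols[0], int(cols[1]), cols[3], alt))
    return keys


def compare_calls(exome_lines, rna_lines):
    """Split variants into shared, exome-only and RNA-Seq-only sets."""
    exome = variant_keys(exome_lines)
    rna = variant_keys(rna_lines)
    return {
        "shared": exome & rna,
        "exome_only": exome - rna,
        "rna_only": rna - exome,
    }


# Toy records, invented for this example only.
exome_vcf = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t60\t.\tDP=80",
    "chr1\t2000\t.\tC\tT\t55\t.\tDP=70",
]
rna_vcf = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t40\t.\tDP=25",
    "chr2\t500\t.\tG\tA\t30\t.\tDP=12",
]

result = compare_calls(exome_vcf, rna_vcf)
for label, variants in result.items():
    print(label, sorted(variants))
```

Exome-only variants are where you would look for the effects raised in the question (low expression, nonsense-mediated decay), while RNA-Seq-only variants are candidates for alignment artifacts near splice junctions, RNA editing, or sites outside the exome capture regions.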