Question

RNA-Seq variant calling dependent on expression?

1

9.4 years ago

Dave Tang ▴ 210

If a deleterious variant, such as a stop gain, leads to degradation of the mRNA via nonsense-mediated decay, would it be impossible to call this variant using RNA-Seq data?

Related to this, would the quality of variants called from lowly expressed genes be lower than that of variants from highly expressed genes? This seems logical, but as I have never performed variant calling on RNA-Seq data, it may not be the case in practice.

I was just thinking about why we bother doing exome sequencing at all if we can call variants from RNA-Seq data, since from RNA-Seq, we also get expression and splicing information.

Thanks.

RNA-Seq variant calling • 2.9k views

ADD COMMENT • link updated 9.4 years ago by Devon Ryan 105k • written 9.4 years ago by Dave Tang ▴ 210

score 3 · Answer 1 · 2016-05-10

3

9.4 years ago

Devon Ryan 105k

As you surmised, coverage (required for accurate variant calling) is highly variable in RNA-Seq. Anything that causes nonsense-mediated decay, and any case of allele-specific expression, will give wrong results if you call variants at those sites. Avoid calling variants on RNA-Seq data unless you absolutely need to.
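To make the coverage argument concrete, here is a minimal sketch that treats the number of reads supporting the alternate allele as a binomial draw. It is only an illustration under stated assumptions: no sequencing error, a hypothetical requirement of at least 3 alt-supporting reads (`min_alt_reads`), and an `alt_fraction` of 0.5 for a balanced heterozygous site versus 0.1 as an arbitrary stand-in for an allele depleted by nonsense-mediated decay or allele-specific expression. Real callers use more elaborate models.

```python
from math import comb


def detection_probability(depth, alt_fraction=0.5, min_alt_reads=3):
    """Probability of seeing at least `min_alt_reads` alt-supporting reads.

    Assumes reads are independent Binomial(depth, alt_fraction) draws and
    ignores sequencing error; `min_alt_reads` is a hypothetical threshold.
    """
    if depth < min_alt_reads:
        return 0.0
    # Probability of observing fewer alt reads than the threshold.
    p_below = sum(
        comb(depth, k) * alt_fraction**k * (1 - alt_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_below


if __name__ == "__main__":
    # Compare a balanced heterozygous site with one whose alt allele is depleted.
    for depth in (5, 10, 30, 100):
        balanced = detection_probability(depth)
        depleted = detection_probability(depth, alt_fraction=0.1)
        print(f"depth={depth:>3}  balanced={balanced:.3f}  depleted={depleted:.3f}")
```

Under these assumptions, both low depth (a lowly expressed gene) and a skewed allelic fraction reduce the chance that the variant is seen at all, which is the effect described above.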

ADD COMMENT • link 9.4 years ago by Devon Ryan 105k

1

Yes, as Devon rightly pointed out, calling variants from RNA-Seq is not ideal unless you have no genetic data for your samples. In that case you can do it, but splicing and uneven coverage limit which variants can be called, so you end up with false-positive variants. The GATK pipeline with STAR 2-pass alignment makes the calls somewhat less spurious, but false-positive and false-negative rates remain high and you still end up with large call sets. If you are interested in specific variants, you can load the final realigned/recalibrated BAM that was passed to HaplotypeCaller, or the final VCF file, into a genome browser to check whether they truly exist. You can also score them with tools that assign a functional or structural annotation to each variant, so that you can rank them by impact. Having said that, the problem of low coverage will still persist.

ADD REPLY • link 9.4 years ago by ivivek_ngs ★ 5.2k

0

Thanks Devon. I just needed some confirmation.

ADD REPLY • link 9.4 years ago by Dave Tang ▴ 210

0

It does get interesting, though, if you can compare variant calls from exome sequencing and RNA-Seq.
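As a rough sketch of what such a comparison could look like, the snippet below keys each call by (chromosome, position, ref, alt) and splits two call sets into shared and platform-specific variants. The function names and the toy records are hypothetical, and the parsing assumes plain tab-separated VCF text with variants already normalized the same way in both files. A dedicated VCF comparison tool would be the safer choice for real data.

```python
def load_variant_keys(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys from tab-separated VCF lines."""
    keys = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
        for alt in alts.split(","):  # one key per alternate allele
            keys.add((chrom, int(pos), ref, alt))
    return keys


def compare_call_sets(exome_lines, rnaseq_lines):
    """Split two call sets into shared, exome-only and RNA-Seq-only variants."""
    exome = load_variant_keys(exome_lines)
    rnaseq = load_variant_keys(rnaseq_lines)
    return {
        "shared": exome & rnaseq,
        "exome_only": exome - rnaseq,
        "rnaseq_only": rnaseq - exome,
    }


if __name__ == "__main__":
    # Toy records for illustration only; not real calls.
    exome_vcf = ["#CHROM\tPOS\tID\tREF\tALT", "chr1\t100\t.\tA\tG", "chr2\t200\t.\tC\tT"]
    rnaseq_vcf = ["#CHROM\tPOS\tID\tREF\tALT", "chr1\t100\t.\tA\tG", "chr3\t300\t.\tG\tA"]
    for label, variants in compare_call_sets(exome_vcf, rnaseq_vcf).items():
        print(label, sorted(variants))
```

Exome-only calls in expressed genes are candidates for allele-specific expression or decay of the variant transcript, while RNA-Seq-only calls may be alignment artifacts, RNA editing sites, or variants outside the exome capture targets.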

ADD REPLY • link 9.4 years ago by WouterDeCoster 48k

0

Here is a case report comparing the two in a tumor:

http://dx.doi.org/10.1016/j.ymeth.2015.04.016

ADD REPLY • link 8.6 years ago by shuomou • 0