Question

Using RNA-Seq to identify non-synonymous mutational load

8.7 years ago

G4G • 0

Very basic question:

Why can you not use RNA-seq data to identify non-synonymous mutational load (ML) in a tissue specimen, such as a surgically obtained tumor?

Instead, matched normal (blood) and tumor samples are used to identify ML in tumor, but I want to understand why RNA-seq data cannot perform the same function by sequencing the mutant RNAs that result from DNA mutations.

I am fairly certain this is not possible, but would be grateful if someone could put the reasoning behind the non-feasibility of such a process in plain language.

Thank you!

RNA-Seq genome • 4.6k views

ADD COMMENT • link 8.7 years ago by G4G • 0

Alternative allele expression might be one of the reasons, I think.

ADD REPLY • link 8.7 years ago by Sam ★ 4.8k

Hi All

Are you aware whether anyone has published data on the calculation of TMB from RNAseq (and compared to the TMB obtained from WES)?

Currently, there are multiple panels that can provide a good estimate of the real TMB by sequencing a limited gene set (DNA) on tumor only; see this 2019 review: https://esmoopen.bmj.com/content/4/1/e000442

Is anyone aware of data where RNAseq was used in a similar manner, with RNAseq on tumor samples only?

Thank you all in advance

ADD REPLY • link 4.9 years ago by MD • 0

Ram · Answer 1 · 2016-02-25

If you are looking at the mutational load of the tumour, then you want to disregard the individual's germline variants, which is why matched tumour-normal sequencing is done. While we can eliminate many of an individual's polymorphisms using resources like dbSNP, ExAC, 1000 Genomes, UK10K, etc., this won't remove all of them, so you would significantly overestimate mutational load. In addition, as @Sam pointed out in the comment, RNA-Seq will be affected by expression effects as well. Up- and down-regulation of genes means you can't estimate the allele frequency of particular variants the way you can when sequencing DNA. This is important for quality control and filtering of your data, and may be important for interpreting mutational load and clonal evolution/tumour heterogeneity. Further, again as @Sam mentioned, sometimes particular alleles are silenced, so heterozygous mutations in the genome can look like homozygous mutations in RNA-Seq.
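To make the allele-silencing point concrete, here is a minimal Python sketch of the expected variant allele fraction (VAF) at a single site. The function name `expected_vaf` and the relative-expression weights are hypothetical, chosen only for illustration. The model deliberately ignores tumour purity, copy-number changes and sequencing noise.

```python
def expected_vaf(mut_copies, wt_copies, mut_expr=1.0, wt_expr=1.0):
    """Expected fraction of reads carrying the variant at one site.

    For DNA, leave both expression weights at 1.0 (every copy is sampled
    equally). For RNA, the weights are the assumed relative expression of
    each allele. Illustrative toy model, not a variant caller.
    """
    mut_signal = mut_copies * mut_expr
    wt_signal = wt_copies * wt_expr
    total = mut_signal + wt_signal
    if total == 0:
        return None  # neither allele is expressed: the site is invisible
    return mut_signal / total


# A heterozygous mutation: one mutant copy, one wild-type copy.
dna_vaf = expected_vaf(1, 1)                           # DNA sequencing
rna_wt_silenced = expected_vaf(1, 1, wt_expr=0.0)      # looks homozygous
rna_mut_silenced = expected_vaf(1, 1, mut_expr=0.0)    # mutation is missed
rna_unexpressed = expected_vaf(1, 1, mut_expr=0.0, wt_expr=0.0)

print(dna_vaf, rna_wt_silenced, rna_mut_silenced, rna_unexpressed)
```

Under these assumptions the same heterozygous DNA variant can appear in RNA-Seq as homozygous, absent, or not observable at all, depending only on which allele is expressed.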

Basically, RNA-Seq is good for a lot of things, but for the level of precise evaluation of the genome you want, it really isn't appropriate. That said, it might be important data in addition to your genomic sequencing if you are also looking at structural variants, gene fusions, etc.

Ram · Answer 2 · 2016-02-25

Hi Sam,

Thanks for your response.

Can you elaborate?

I looked up alternate allele expression and found the following paper:

Identification of allele-specific alternative mRNA processing via transcriptome sequencing

That appears to describe a tool with which you can investigate, on RNA-seq data, the underlying mechanisms responsible for alternative allele expression.

Putting my original question in another form:

Can I use RNA-seq to comprehensively identify the non-synonymous mutational load in a tumor specimen via an overlapping, annotation-type approach, with long transcripts and paired-end reads? This would necessarily be based on a comparison to a reference genome. Is that the issue: that the reference genome is too generic, and the comparison needs to be individual-specific to know exactly which non-synonymous somatic mutations have occurred in the cancer cells?

Any thoughts are greatly appreciated!

Thanks!

Ram · Answer 3 · 2016-02-25

8.7 years ago

G4G • 0

Thank you Dan!

Let me think about your answer for a little while and let you know if I have any other questions.

Extremely helpful!!!

ADD COMMENT • link updated 6.2 years ago by Ram 44k • written 8.7 years ago by G4G • 0

Ram · Answer 4 · 2016-02-26

So I have thought about your answer, Dan, and I have the following follow-up questions and experimental design to put forward.

First of all, I understand why RNA-seq analysis would overestimate non-synonymous mutations: germline variants are counted along with somatic mutations, since the reference is not the individual's own DNA but a generic genome.

My first followup question regarding this issue is the following:

Is the occurrence of germline mutations evenly distributed throughout a population? In other words, could we assume that each sample's non-synonymous mutational load estimate will be falsely elevated to a similar degree, so that inter-sample comparison is still meaningful?
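One way to reason about this assumption is with rank correlation. The sketch below is pure Python with made-up numbers: the somatic counts, germline offsets and gene X expression values are all hypothetical. It computes Spearman's rho between mutational load and expression under two scenarios: every sample inflated by the same germline count, and samples inflated unevenly.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data for five tumours.
somatic = [50, 120, 300, 80, 210]        # true somatic mutation counts
gene_x = [2.1, 3.5, 6.0, 2.0, 5.2]       # expression of gene X
even_offset = [s + 400 for s in somatic]             # same germline excess
uneven = [s + g for s, g in zip(somatic, [400, 150, 100, 500, 120])]

rho_true = spearman(somatic, gene_x)
rho_even = spearman(even_offset, gene_x)
rho_uneven = spearman(uneven, gene_x)
print(rho_true, rho_even, rho_uneven)
```

With a constant offset the ranks are unchanged, so the correlation is identical to the one from true somatic counts. Once the residual germline count differs between samples by amounts comparable to the somatic differences, the ranking can be scrambled. Whether real cohorts behave like the first or second case is exactly the open question here.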

Secondly, I understand that the analysis of non-synonymous mutations via RNA-seq would also be confounded by the loss of lowly expressed genes. Again, could we assume that this loss is evenly distributed throughout the population, allowing for inter-sample comparisons?
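The effect of expression on detection can be sketched with a simple binomial model. The assumptions are that reads at a site are independent draws, the mutant allele makes up a fixed fraction `vaf` of the transcripts, and a variant is only called when at least `min_alt_reads` reads support it. The threshold of 3 is an arbitrary illustrative value, not a recommendation from any particular caller.

```python
from math import comb


def prob_detect(depth, vaf, min_alt_reads=3):
    """Probability of seeing at least `min_alt_reads` variant reads
    among `depth` reads, under a binomial sampling model."""
    p_too_few = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1 - p_too_few


# Same heterozygous variant (vaf = 0.5) at different expression levels.
for depth in (0, 5, 20, 100):
    print(depth, round(prob_detect(depth, 0.5), 4))
```

In this toy model, detection depends on read depth at the site, which in RNA-seq follows the gene's expression in that particular sample. The loss is therefore even across samples only if expression of the affected genes is similar across samples.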

What are people's thoughts about RNA-seq's ability to answer the following question, given the confounding factors listed above:

Experimental question:

Tumor non-synonymous mutational load assessed via RNA-seq (with inherent limitations discussed above) is correlated with the expression of gene X.

Background information:

Essentially, I am trying to recapitulate via RNA-seq a type of analysis that is already found in the literature, where it was done with WES. We do not have matched normal control blood, so we cannot do a WES tumor-to-normal DNA comparison.

Scenario:

Tumor non-synonymous mutational load may predict an inflammatory tumor microenvironment, as a result of immune recognition of a mutant peptide (created as a result of DNA somatic mutation) presented via MHC class I molecule, that can lead to targetable elements, such as immune checkpoints, in cancer immunotherapy. Thus, tumor non-synonymous mutational load can act as a predictor of response to these therapies.

See reference below:

Mutational landscape determines sensitivity to PD-1 blockade in non-small cell lung cancer

Conclusion

Ultimately, precisely quantifying how many non-synonymous mutations a tumor has is not required for such an analysis; what is necessary is the relative mutational load and its ability to predict an inflammatory microenvironment, via upregulation of distinct genes such as PD-L1. Therefore, if the limitation of RNA-seq for this type of question lies in its over- or under-quantification of mutations, but inter-sample comparisons would still be valid, this may be a viable avenue to explore. Further, the inability to identify mutations due to the loss of lowly expressed mRNAs is probably not an issue at all, because if their expression is low, they are likely not the source of the peptide triggering an immune response.

Are there other limitations of RNA-seq that people can point to that would make such an evaluation uninterpretable?