So I have thought about your answer, Dan, and I have the following follow-up questions and an experimental design to propose.
First of all, I understand the rationale behind the overestimation of non-synonymous mutations in RNA-seq analysis: germline variants are counted in addition to somatic mutations, since the reference is a generic genome rather than the individual's own DNA.
My first follow-up question regarding this issue is the following:
Is the occurrence of germline mutations evenly distributed throughout a population? In other words, could we assume that each sample's non-synonymous mutational load estimate will be falsely elevated to a similar degree, so that inter-sample comparison is still meaningful?
Secondly, I understand that the analysis of non-synonymous mutations via RNA-seq would also be confounded by the loss of lowly expressed genes.
Again, could we assume that this loss is evenly distributed throughout the population, allowing for inter-sample comparisons?
What are people's thoughts about RNA-seq's ability to answer the following question, given the confounding factors listed above:
Experimental question:
Is tumor non-synonymous mutational load, assessed via RNA-seq (with the inherent limitations discussed above), correlated with the expression of gene X?
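Since the question is about relative rather than absolute mutational load, one way to frame the test is a rank-based (Spearman) correlation between per-sample mutation counts and gene X expression. Below is a minimal sketch in plain Python. The sample values and the variable names `mutation_counts` and `gene_x_expression` are hypothetical and only there for illustration; the choice of Spearman is my assumption, not something taken from the cited paper.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sample values, for illustration only.
mutation_counts = [120, 340, 95, 410, 220]      # RNA-seq variant counts
gene_x_expression = [3.1, 7.8, 2.4, 6.9, 5.0]   # e.g. log-scale expression

rho = spearman(mutation_counts, gene_x_expression)
print(f"Spearman rho = {rho:.2f}")
```

Because only the ranks enter the calculation, any bias that shifts every sample's count by a similar amount leaves the result unchanged; a bias that differs between samples does not have that property.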
Background information:
Essentially, I am trying to recapitulate via RNA-seq a type of analysis that is already found in the literature but was done there with WES. We do not have matched normal control blood, so we cannot do a WES tumor-to-normal DNA comparison.
Scenario:
Tumor non-synonymous mutational load may predict an inflammatory tumor microenvironment, as a result of immune recognition of mutant peptides (created by somatic DNA mutations) presented via MHC class I molecules. This can give rise to targetable elements, such as immune checkpoints, in cancer immunotherapy. Thus, tumor non-synonymous mutational load can act as a predictor of response to these therapies.
See reference below:
Mutational landscape determines sensitivity to PD-1 blockade in non-small cell lung cancer
Conclusion
Ultimately, precision in quantifying exactly how many non-synonymous mutations a tumor has is not required for such an analysis. What is necessary is the relative mutational load and its ability to predict an inflammatory microenvironment via upregulation of distinct genes such as PD-L1. Therefore, if the limitation of RNA-seq for this type of question lies in its over- or under-quantification of mutations, but inter-sample comparisons would still be valid, this may be a viable avenue to explore. Further, the inability to identify mutations due to the loss of lowly expressed mRNAs is probably not an issue at all, because if their expression is low, they are likely not the source of the peptide triggering an immune response.
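To make the premise concrete: the argument above holds only if the germline inflation is similar across samples. Here is a small sketch with made-up numbers (the lists `somatic`, `uniform_germline` and `uneven_germline` are hypothetical) showing that a uniform offset keeps the sample ordering intact, while an uneven offset can scramble it.

```python
def rank_order(values):
    """Return sample indices sorted from lowest to highest value."""
    return sorted(range(len(values)), key=lambda i: values[i])


# Hypothetical numbers, for illustration only.
somatic = [50, 200, 120, 300]             # true somatic counts per tumor
uniform_germline = [400, 400, 400, 400]   # similar inflation in every sample
uneven_germline = [400, 150, 450, 100]    # inflation that differs per sample

# What RNA-seq against a generic reference would report in each scenario.
observed_uniform = [s + g for s, g in zip(somatic, uniform_germline)]
observed_uneven = [s + g for s, g in zip(somatic, uneven_germline)]

same_order_uniform = rank_order(observed_uniform) == rank_order(somatic)
same_order_uneven = rank_order(observed_uneven) == rank_order(somatic)

print("ordering preserved with uniform inflation:", same_order_uniform)
print("ordering preserved with uneven inflation:", same_order_uneven)
```

So the open question is really an empirical one: how much the germline contribution varies between samples compared with the spread in somatic load.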
Are there other limitations of RNA-seq that people can point to that would make such an evaluation uninterpretable?
Alternative allele expression might be one of the reasons, I think.
Hi All
Are you aware of whether anyone has published data on calculating TMB from RNAseq (and comparing it to the TMB obtained from WES)?
Currently, there are multiple panels that, by sequencing limited gene sets (DNA) on tumor only, can provide a good estimate of the real TMB; a 2019 review is published here: https://esmoopen.bmj.com/content/4/1/e000442
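For context, TMB is commonly reported as mutations per megabase of sequenced territory, which is what makes a small panel comparable to an exome. A rough sketch of that normalization follows; the function name `tmb_per_mb`, the mutation counts and the territory sizes are all hypothetical and not taken from the review.

```python
def tmb_per_mb(mutation_count, covered_bases):
    """Mutations per megabase of covered sequence."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    return mutation_count * 1_000_000 / covered_bases


# Hypothetical example: a 1.2 Mb panel versus a 30 Mb exome.
panel_tmb = tmb_per_mb(12, 1_200_000)
exome_tmb = tmb_per_mb(300, 30_000_000)

print(panel_tmb, exome_tmb)
```

For RNAseq the equivalent denominator would presumably have to be the expressed, adequately covered territory of each sample, which is an assumption on my part and varies from sample to sample.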
Is anyone aware of data where RNAseq was used in a similar manner, with RNAseq on tumor samples only?
Thank you all in advance