Hi All,
What is the difference between "Tumor Mutation Burden (TMB)" and "mutational load"? I have a mutation matrix for TCGA samples (generated with maftools, see below). I tried maftools and GenVisR but couldn't find any option to calculate a TMB score. How can I calculate a TMB score from the mutation matrix?
Tumor_Sample_Barcode Frame_Shift_Del Frame_Shift_Ins In_Frame_Del In_Frame_Ins Missense_Mutation Nonsense_Mutation Nonstop_Mutation Splice_Site Translation_Start_Site total
TCGA-FW-A3R5 12 5 1 0 14725 789 6 514 44 16096
TCGA-FR-A726 12 5 2 1 5839 356 1 255 31 6502
TCGA-EE-A2MR 6 0 1 0 3621 216 1 112 10 3967
TCGA-D9-A6EC 9 1 2 0 3570 168 4 146 9 3909
Thanks
Thanks bruce.moran, I have TCGA MAF files (Mutation_Packager_Raw_Calls) downloaded from Broad Institute Firehose.
I guess that there is no defined standard for calculating this. I saw a citation from THIS paper that simply referred to THIS other original paper in NEJM, where it is stated:
So, they literally just tallied the missense mutations.
Thank you so much, Kevin. I also found a paper where they mentioned that:
Truncating mutations included nonsense, frame-shift deletion, frame-shift insertion, and splice-site, while non-truncating mutations included missense, in-frame deletion, in-frame insertion, and nonstop. Silent mutations were excluded from these analyses since they do not result in an amino acid change. Truncating mutations were given a higher weight considering their higher deleterious effects on gene function than non-truncating mutations. Based on the TMB score, we classified all the TNBCs into the higher-TMB and lower-TMB classes. If the TMB score in a TNBC was higher than the median value of TMB scores, the TNBC was classified as higher-TMB; otherwise it was classified as lower-TMB.
https://www.sciencedirect.com/science/article/pii/S1936523317303972?via%3Dihub
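A minimal Python sketch of the weighted score described in that passage, applied to the four samples from the matrix in the question. The weights of 1.5 for truncating and 1.0 for non-truncating mutations are an assumption taken from the reply further down the thread, since the equation itself is not quoted here. Translation_Start_Site is not assigned to either class in the passage, so it is left out of the score (also an assumption).

```python
import statistics

# Column order and counts copied from the mutation matrix in the question.
COLUMNS = ["Frame_Shift_Del", "Frame_Shift_Ins", "In_Frame_Del", "In_Frame_Ins",
           "Missense_Mutation", "Nonsense_Mutation", "Nonstop_Mutation",
           "Splice_Site", "Translation_Start_Site"]
COUNTS = {
    "TCGA-FW-A3R5": [12, 5, 1, 0, 14725, 789, 6, 514, 44],
    "TCGA-FR-A726": [12, 5, 2, 1, 5839, 356, 1, 255, 31],
    "TCGA-EE-A2MR": [6, 0, 1, 0, 3621, 216, 1, 112, 10],
    "TCGA-D9-A6EC": [9, 1, 2, 0, 3570, 168, 4, 146, 9],
}

# Classes as listed in the quoted passage.
TRUNCATING = {"Nonsense_Mutation", "Frame_Shift_Del", "Frame_Shift_Ins", "Splice_Site"}
NON_TRUNCATING = {"Missense_Mutation", "In_Frame_Del", "In_Frame_Ins", "Nonstop_Mutation"}

# Assumed weights (the thread mentions 1.5 and 1.0; the equation is not shown).
W_TRUNCATING = 1.5
W_NON_TRUNCATING = 1.0


def weighted_tmb(row):
    """Weighted sum of truncating and non-truncating counts for one sample."""
    by_type = dict(zip(COLUMNS, row))
    truncating = sum(by_type[c] for c in TRUNCATING)
    non_truncating = sum(by_type[c] for c in NON_TRUNCATING)
    return W_TRUNCATING * truncating + W_NON_TRUNCATING * non_truncating


scores = {sample: weighted_tmb(row) for sample, row in COUNTS.items()}

# Median split: strictly above the median is "higher-TMB", otherwise "lower-TMB".
median_score = statistics.median(scores.values())
classes = {sample: "higher-TMB" if score > median_score else "lower-TMB"
           for sample, score in scores.items()}

for sample in COUNTS:
    print(sample, scores[sample], classes[sample])
```

With only four samples the median split is purely illustrative; on a full cohort the same code applies unchanged.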
I am still not sure which method is correct.
That equation seems entirely random... why 1.5 and 1.0 as the weighting factors?
You could just tabulate them per patient like the NEJM paper, and then divide into 3 groups based on final count.
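As a rough illustration of that suggestion, the sketch below takes the missense counts from the matrix in the question and splits the samples into three groups by rank. How the three groups are cut is not specified in the thread, so the rank-based split and the group labels here are assumptions.

```python
# Missense_Mutation counts copied from the mutation matrix in the question.
missense = {
    "TCGA-FW-A3R5": 14725,
    "TCGA-FR-A726": 5839,
    "TCGA-EE-A2MR": 3621,
    "TCGA-D9-A6EC": 3570,
}

LABELS = ["low", "intermediate", "high"]  # hypothetical group names


def three_groups(counts):
    """Assign each sample to one of three rank-based groups (ascending count)."""
    ordered = sorted(counts, key=counts.get)
    n = len(ordered)
    return {sample: LABELS[min(2, rank * 3 // n)]
            for rank, sample in enumerate(ordered)}


groups = three_groups(missense)
for sample, label in groups.items():
    print(sample, missense[sample], label)
```

Four samples are far too few for meaningful groups; the point is only the mechanics of tally, rank, and split.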
Another idea is to count per gene in any given patient, and then scale by dividing by the gene length. I got gene lengths previously by obtaining GENCODE's reference FASTA transcriptome and using AWK to simply count the number of fields per gene (`NF`), with "" as delimiter.
No right or wrong, really. Everybody seems to calculate it differently.
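The gene-length scaling mentioned above could be sketched in Python instead of AWK, along these lines. Several things here are assumptions: that the gene name is the sixth `|`-separated field of the FASTA header (as in GENCODE transcript FASTA files, but worth checking against your download), that the longest transcript stands in for the gene length, and the toy sequences and per-gene mutation counts, which are made up for illustration.

```python
import io
from collections import defaultdict


def gene_lengths(fasta_handle, gene_field=5):
    """Return {gene: length of its longest transcript} from a FASTA handle.

    gene_field is the 0-based index of the gene name in the '|'-separated
    header (an assumption about the header layout).
    """
    lengths = defaultdict(int)
    gene, seq_len = None, 0
    for line in fasta_handle:
        line = line.strip()
        if line.startswith(">"):
            if gene is not None:
                lengths[gene] = max(lengths[gene], seq_len)
            gene = line[1:].split("|")[gene_field]
            seq_len = 0
        elif line:
            seq_len += len(line)  # sequences may wrap over several lines
    if gene is not None:
        lengths[gene] = max(lengths[gene], seq_len)
    return dict(lengths)


# Toy FASTA with made-up identifiers and sequences.
TOY_FASTA = """>T1|G1|-|-|GENEA-201|GENEA|12|protein_coding|
ACGTACGTACGT
>T2|G1|-|-|GENEA-202|GENEA|8|protein_coding|
ACGTACGT
>T3|G2|-|-|GENEB-201|GENEB|20|protein_coding|
ACGTACGTAC
GTACGTACGT
"""

# Hypothetical mutation counts per gene for one patient.
toy_counts = {"GENEA": 3, "GENEB": 1}

lengths = gene_lengths(io.StringIO(TOY_FASTA))
mutations_per_kb = {gene: 1000 * n / lengths[gene] for gene, n in toy_counts.items()}
print(lengths)
print(mutations_per_kb)
```

For a real file, pass an open file handle instead of the `io.StringIO` wrapper.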
For a conservative estimate, I use non-synonymous mutations, but the recent MSK-IMPACT paper used all somatic mutations:
As Kevin says, no standard exists yet.
I don't see the relevance of weighting mutations by type. The idea of TMB analysis is to assess whether a process that corrects DNA damage is not functioning, so the type of mutation is largely irrelevant, aside from the obvious translational importance.
Thanks Kevin and bruce.moran for your help