Calculating TMB for TCGA data
5 weeks ago
Jaber ▴ 30

Hi community, I am struggling to find the size of the genomic region (in Mb) to use for TMB calculations.

I am using the TCGA-READ data. Where can I get the Mb value?

I want to calculate TMB as follows: TMB = total number of mutations / Mb.

I searched previous Biostars posts and couldn't find an answer to my question. I also looked through the TCGA documentation and didn't find any information about the Mb (megabase) value.

Is there any way I can infer the Mb from my data, i.e. the TCGA-READ MAF file?

I really appreciate any help you can provide.

mutations maftools TMB • 614 views
4 weeks ago
Zhenyu Zhang ★ 1.2k

Go to the GDC workflow overview page, https://github.com/NCI-GDC/gdc-workflow-overview, and look for the target capture kit information at the bottom.

As I recall, GDC treats all TCGA WXS data as "Nextera Rapid Capture Exome v1.2", but you should probably query the read group metadata through the GDC API to confirm.
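
A rough sketch of what such a read group query could look like in Python. It only builds the request URL for the GDC files endpoint and does not send it. The `analysis.metadata.read_groups.*` field names are my assumption about how the capture kit is exposed, so check them against the GDC API field mapping before relying on them. The function name `build_kit_query_url` is hypothetical.

```python
import json
import urllib.parse

GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_kit_query_url(project_id="TCGA-READ", size=10):
    """Build a GDC /files query URL asking for read group capture kit fields."""
    # Restrict to WXS BAM files from one project.
    filters = {
        "op": "and",
        "content": [
            {"op": "in", "content": {"field": "cases.project.project_id",
                                     "value": [project_id]}},
            {"op": "in", "content": {"field": "experimental_strategy",
                                     "value": ["WXS"]}},
            {"op": "in", "content": {"field": "data_format",
                                     "value": ["BAM"]}},
        ],
    }
    # Assumed field names for the capture kit; verify against the API mapping.
    fields = [
        "file_name",
        "analysis.metadata.read_groups.target_capture_kit_name",
        "analysis.metadata.read_groups.target_capture_kit_target_region",
    ]
    params = {
        "filters": json.dumps(filters),
        "fields": ",".join(fields),
        "format": "JSON",
        "size": str(size),
    }
    return GDC_FILES_ENDPOINT + "?" + urllib.parse.urlencode(params)


url = build_kit_query_url()
print(url)
```

Opening the printed URL in a browser, or fetching it with `urllib.request.urlopen`, should return JSON with one entry per file. Raise `size` to cover the whole project.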


Thank you, I really appreciate it.


Thank you for your help.

However, I could not identify the Mb for TCGA-READ. Where should I run the API queries to identify which kit was used for this dataset?


https://github.com/NCI-GDC/gdc-workflow-overview/blob/04b73036022ff1f53921dff5ef3b1b638b8ecfcd/gdc_target_capture_kit_size.tsv#L4

You can use this value. To clarify, this is not the exact capture kit TCGA used; it is just the one GDC uses as the default. TCGA used 40+ different kits and kit combinations, and sometimes a single file was generated from read groups sequenced with different kits.
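
For the calculation itself, here is a minimal Python sketch, assuming you take the kit size from the file above and express it in Mb. It counts rows per `Tumor_Sample_Barcode` in a MAF and divides by that size. The function name `tmb_per_sample`, the sample names, and the 2.0 Mb size are made up for illustration; they are not the real TCGA-READ values.

```python
import csv
import io
from collections import Counter


def tmb_per_sample(maf_lines, capture_size_mb):
    """Return {Tumor_Sample_Barcode: mutations per Mb} for a MAF-like input."""
    if capture_size_mb <= 0:
        raise ValueError("capture_size_mb must be positive")
    # MAF files may start with comment lines such as '#version ...'; skip them.
    rows = (line for line in maf_lines if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    # Every remaining row is counted as one mutation for its sample.
    counts = Counter(row["Tumor_Sample_Barcode"] for row in reader)
    return {sample: n / capture_size_mb for sample, n in counts.items()}


# Toy in-memory MAF with hypothetical samples; not real TCGA data.
example_maf = "\n".join([
    "#version 2.4",
    "Hugo_Symbol\tTumor_Sample_Barcode\tVariant_Classification",
    "GENE1\tSAMPLE_A\tMissense_Mutation",
    "GENE2\tSAMPLE_A\tNonsense_Mutation",
    "GENE3\tSAMPLE_A\tSilent",
    "GENE1\tSAMPLE_B\tMissense_Mutation",
])

# 2.0 Mb is a placeholder; substitute the capture kit size in Mb.
result = tmb_per_sample(io.StringIO(example_maf), capture_size_mb=2.0)
print(result)
```

For a real file, pass an open file handle instead of the `io.StringIO` object. This version counts every row, including silent variants. Whether to restrict the count to non-synonymous classes via `Variant_Classification` is a choice to make for your own analysis.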
