TCGA barcode documentation? TCGA wiki?
6.6 years ago

Is there a publicly accessible website with detailed TCGA documentation, especially the TCGA barcode specification? A while ago there was https://wiki.nci.nih.gov/display/TCGA/TCGA+Barcode (still indexed in Google and linked in a few Biostars questions and answers), but it now just shows an authentication request along with lines like:

"This system is provided for Government-authorized use only." "Unauthorized or improper use of this system is prohibited and may result in disciplinary action and/or civil and criminal penalties."

so it seems that this extremely useful resource is no longer available to the general public. Are there any alternatives? I tried https://cancergenome.nih.gov/abouttcga/aboutdata but it shows only general, high-level information. https://gdc.cancer.gov/resources-tcga-users/tcga-code-tables provides some of the important information, but the code tables are of little use without a description of the barcode format.

TCGA barcode • 5.9k views
6.6 years ago

I managed to save the original page (as PDF on my GitHub account), but some images appear to have vanished: https://github.com/kevinblighe/TCGAbarcode/blob/master/README.md

This is the image that I recall seeing:

[Image: barcode]


Here is a cloned page on GDC website: https://docs.gdc.cancer.gov/Encyclopedia/pages/TCGA_Barcode/
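
For anyone who only needs the field layout, here is a minimal Python sketch that splits a full aliquot barcode into its parts. It assumes the seven-field layout described on the GDC page above (project, tissue source site, participant, sample plus vial, portion plus analyte, plate, centre). The names `parse_barcode` and `sample_class` are made up for this illustration and do not come from any TCGA or GDC tool, and the regular expression is deliberately strict, so shorter barcodes (participant or sample level) will be rejected.

```python
import re
from typing import Dict

# Assumed layout of a full aliquot barcode, e.g. TCGA-02-0001-01C-01D-0182-01:
# project - TSS - participant - sample+vial - portion+analyte - plate - centre
BARCODE_RE = re.compile(
    r"^(?P<project>TCGA)-"
    r"(?P<tss>[A-Z0-9]{2})-"
    r"(?P<participant>[A-Z0-9]{4})-"
    r"(?P<sample>\d{2})(?P<vial>[A-Z])-"
    r"(?P<portion>\d{2})(?P<analyte>[A-Z])-"
    r"(?P<plate>[A-Z0-9]{4})-"
    r"(?P<center>\d{2})$"
)


def parse_barcode(barcode: str) -> Dict[str, str]:
    """Split a full TCGA aliquot barcode into named fields (hypothetical helper)."""
    match = BARCODE_RE.match(barcode.strip().upper())
    if match is None:
        raise ValueError(f"not a full TCGA aliquot barcode: {barcode!r}")
    return match.groupdict()


def sample_class(sample_code: str) -> str:
    """Map the two-digit sample code to its broad class (01-09, 10-19, 20-29)."""
    code = int(sample_code)
    if 1 <= code <= 9:
        return "tumor"
    if 10 <= code <= 19:
        return "normal"
    if 20 <= code <= 29:
        return "control"
    return "unknown"


if __name__ == "__main__":
    fields = parse_barcode("TCGA-02-0001-01C-01D-0182-01")
    print(fields)
    print(sample_class(fields["sample"]))
```

The two-character `tss` field and the two-digit `sample` field are the ones to look up in the GDC code tables linked in the question.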


Thank you very much for adding.


Dear Dr. Blighe, I would like to have your comments on my two questions about handling TCGA batch effects. I found that the batch ID in TCGA is a combination of "PlateId", "ShipDate", and "Tissue Source Site". "DX - Memorial Sloan Kettering" is an example of a Tissue Source Site. Am I right?

The second question is about when to remove batch effects. I have downloaded FPKM data from the GDC portal. Can I remove batch effects from the FPKM expression data, or do I have to download the HTSeq - Counts data, remove the batch effects, and then normalise with the FPKM method?

P.S.: I posted my question in another thread, but that thread is closed, so I have copied it here.


What do you plan to do with the data? It has been stated a lot in various posts that FPKM expression units are not suitable for differential expression analysis.


Dear Dr. Blighe,

For DEG analysis, yes, but I need the TCGA cancer data to construct a network with WGCNA. For that purpose, FPKM is fine.


Okay, but correcting for batch with FPKM data will be difficult. No cross-sample normalisation is performed when deriving FPKM expression units. By attempting to correct for batch, you may distort the dataset even more.

I would simply recommend proceeding to network construction without any batch adjustment, or obtaining the HTSeq raw counts.
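
To illustrate the point about normalisation, here is a minimal Python sketch of the textbook FPKM formula, in which each sample is scaled only by gene length and by its own total read count. The function name `fpkm`, the gene names, and the numbers are invented for this example, and the GDC pipeline may define the denominator differently (for instance, by counting only a subset of reads), so treat this as an assumption rather than a description of how the GDC files were produced.

```python
from typing import Dict


def fpkm(counts: Dict[str, int], lengths_bp: Dict[str, int]) -> Dict[str, float]:
    """Textbook FPKM for ONE sample: count * 1e9 / (gene length in bp * total counts).

    Only this sample's own total is used, so nothing in the formula
    aligns the distribution of one sample with that of another.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {
        gene: count * 1e9 / (lengths_bp[gene] * total)
        for gene, count in counts.items()
    }


if __name__ == "__main__":
    lengths = {"geneA": 1000, "geneB": 3000}      # made-up gene lengths (bp)
    sample_1 = {"geneA": 500, "geneB": 1500}      # made-up raw counts
    sample_2 = {"geneA": 1000, "geneB": 3000}     # same library, twice the depth

    print(fpkm(sample_1, lengths))
    print(fpkm(sample_2, lengths))
```

Sequencing depth cancels out within each sample, but differences in library composition between samples do not, which is why between-sample methods are normally applied to raw counts instead.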


"I found that the batch ID in TCGA is a combination of 'PlateId', 'ShipDate', and 'Tissue Source Site'."

I am not certain that this is accurate. I believe there is more to the batch ID than the ship date and plate ID. The following is a tabulation of batch, plate, TSS, and ship date for TCGA-KIRC methylation data, acquired from https://bioinformatics.mdanderson.org/BatchEffectsViewer/ (run date: 2019-07-30-1200, current GDC). See batches 274.47.0 and 274.48.0, for example: they share plate A264 and ship date 2013-01-30, and both contain samples from TSS B8, DV, and MM, yet they have different batch IDs. Note that the processing centre code is 05 for all samples.

Plate ID and ship date also vary within batch 90.69.0 (rows 55 to 57).

     BatchId PlateId TSS   ShipDate  N
 1: 105.71.0    1670  B0 2011-04-13 10
 2: 105.71.0    1670  B4 2011-04-13  7
 3: 105.71.0    1670  B8 2011-04-13  4
 4: 105.71.0    1670  CJ 2011-04-13  7
 5: 105.71.0    1670  CW 2011-04-13  6
 6: 105.71.0    1670  CZ 2011-04-13  7
 7: 105.71.0    1670  EU 2011-04-13  4
 8: 274.47.0    A264  B8 2013-01-30  1
 9: 274.47.0    A264  DV 2013-01-30  3
10: 274.47.0    A264  MM 2013-01-30  1
11: 274.47.0    A264  MW 2013-01-30  1
12: 274.48.0    A264  B2 2013-01-30  1
13: 274.48.0    A264  B8 2013-01-30  3
14: 274.48.0    A264  DV 2013-01-30  1
15: 274.48.0    A264  MM 2013-01-30  1
16:  32.77.0    1275  AK 2010-09-27  2
17: 340.41.0    A33L  A3 2013-10-16  4
18: 340.41.0    A33L  B8 2013-10-16  1
19: 340.41.0    A33L  G6 2013-10-16  1
20: 340.42.0    A33L  B8 2013-10-16  3
21: 340.42.0    A33L  GK 2013-10-16  1
22: 387.37.0    A36Y  A3 2014-02-26  5
23: 387.37.0    A36Y  B8 2014-02-26  1
24: 387.37.0    A36Y  G6 2014-02-26  3
25: 387.37.0    A36Y  MM 2014-02-26  1
26: 387.37.0    A36Y  T7 2014-02-26  1
27: 387.38.0    A36Y  3Z 2014-02-26  1
28: 387.38.0    A36Y  6D 2014-02-26  1
29: 404.36.0    A39G  B8 2014-04-30  1
30:  50.78.0    A27A  B2 2013-03-13  2
31:  63.79.0    1275  AK 2010-09-27 10
32:  63.79.0    1275  B0 2010-09-27 58
33:  63.79.0    1275  B2 2010-09-27  1
34:  63.79.0    1275  B8 2010-09-27  1
35:  68.74.0    1418  A3 2010-11-22 12
36:  68.74.0    1418  B0 2010-11-22 54
37:  68.74.0    1418  B8 2010-11-22  6
38:  68.74.0    1418  BP 2010-11-22 16
39:  70.72.0    1424  BP 2010-11-22 56
40:  70.72.0    1424  CJ 2010-11-22 32
41:  70.72.0    1424  CZ 2010-11-22  6
42:  82.75.0    1500  AK 2011-01-12  1
43:  82.75.0    1500  B0 2011-01-12 29
44:  82.75.0    1500  B4 2011-01-12  2
45:  82.75.0    1500  B8 2011-01-12  1
46:  82.75.0    1500  BP 2011-01-12  2
47:  82.75.0    1500  CZ 2011-01-12 48
48:  90.68.0    1536  A3 2011-02-09  2
49:  90.68.0    1536  B0 2011-02-09 18
50:  90.68.0    1536  B2 2011-02-09  3
51:  90.68.0    1536  B8 2011-02-09  5
52:  90.68.0    1536  CJ 2011-02-09 10
53:  90.68.0    1536  CW 2011-02-09  9
54:  90.68.0    1536  DV 2011-02-09  9
55:  90.69.0    1536  B2 2011-02-09  2
56:  90.69.0    1536  CJ 2011-02-09  3
57:  90.69.0    A27A  B2 2013-03-13  4
     BatchId PlateId TSS   ShipDate  N
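
The same check can be done programmatically. Below is a minimal Python sketch that groups a handful of rows copied from the table above by (plate, TSS, ship date) and reports every combination that maps to more than one batch ID. The function name `ambiguous_keys` is made up for this illustration, and only a subset of the rows is included to keep it short.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (BatchId, PlateId, TSS, ShipDate) rows copied from the table above
ROWS: List[Tuple[str, str, str, str]] = [
    ("105.71.0", "1670", "B0", "2011-04-13"),
    ("274.47.0", "A264", "B8", "2013-01-30"),
    ("274.47.0", "A264", "DV", "2013-01-30"),
    ("274.48.0", "A264", "B8", "2013-01-30"),
    ("274.48.0", "A264", "DV", "2013-01-30"),
    ("50.78.0", "A27A", "B2", "2013-03-13"),
    ("90.69.0", "1536", "B2", "2011-02-09"),
    ("90.69.0", "A27A", "B2", "2013-03-13"),
]


def ambiguous_keys(
    rows: List[Tuple[str, str, str, str]]
) -> Dict[Tuple[str, str, str], List[str]]:
    """Return (plate, TSS, ship date) combinations that occur in more than one batch."""
    batches = defaultdict(set)
    for batch, plate, tss, ship_date in rows:
        batches[(plate, tss, ship_date)].add(batch)
    return {key: sorted(ids) for key, ids in batches.items() if len(ids) > 1}


if __name__ == "__main__":
    for key, ids in ambiguous_keys(ROWS).items():
        print(key, "->", ids)
```

If the batch ID were fully determined by plate, TSS, and ship date, the function would return an empty dictionary; for these rows it does not.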