Question: Identifying normal and tumor samples from rsem.genes.normalized_results file names in the GDC Legacy Archive
8.2 years ago
Peter Chung ▴ 210

I am new to the TCGA GDC Legacy Archive and have a question about it. I downloaded the gene expression files listed below:

unc.edu.a2b8f7d7-3e9b-4e7f-9198-fb1d75a2eb0a.2563597.rsem.genes.normalized_results
unc.edu.f5487583-cdac-42ba-8f7f-b1704ea21cad.2331596.rsem.genes.normalized_results
unc.edu.25e3cf1c-d40b-45ab-b987-2370cc638868.2326687.rsem.genes.normalized_results
unc.edu.6e3a2a2a-40c4-499c-bb8b-5b6a10862eb9.2326707.rsem.genes.normalized_results
unc.edu.3baf5546-99d9-417b-97bb-e00dda7ebcdd.2556104.rsem.genes.normalized_results
unc.edu.d9bb2e43-26d3-49f8-92bb-604b4d3005ec.2350242.rsem.genes.normalized_results
unc.edu.6c7bd630-bd5d-45b6-9936-56ad3e4f72a6.2326671.rsem.genes.normalized_results
unc.edu.9fe4c0e2-bd31-453d-a6a2-3ceac8314fe8.2651589.rsem.genes.normalized_results
unc.edu.4baa577f-afc5-4df6-9307-d73dca8ea70e.2326571.rsem.genes.normalized_results
unc.edu.11dbf166-e0e8-4183-bd7f-465ed95deba2.2548426.rsem.genes.normalized_results

However, I can't tell which files are normal samples and which are tumor samples, because I can't find the corresponding TCGA-xx-xxxx-01A- or TCGA-xx-xxxx-11A- barcodes in the manifest file. How can I match each unc.edu.xxxx file name to its TCGA-xx-xxxx- barcode so that I can identify the sample type?

I have been searching and working on this for about a week and still can't figure it out. Thank you very much for your help!

TCGA RNA-Seq • 2.4k views
8.0 years ago

Hello Peter,

I don't know if this is still of interest to you, but you can identify the corresponding TCGA barcode for a GDC file like this:

Add the files you are interested in to your cart in GDC. Open your cart and download the metadata file for it by clicking the Metadata button right next to the Download button. This file contains a JSON string that, among other things, maps each file name (in the field "file_name") to a TCGA barcode (in the field "entity_submitter_id").

There you can see that, for example, the file unc.edu.a2b8f7d7-3e9b-4e7f-9198-fb1d75a2eb0a.2563597.rsem.genes.normalized_results has the barcode TCGA-3H-AB3U-01A-21R-A40A-07.
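If you would rather not look the barcodes up by hand, the lookup can be sketched in a few lines of Python. This is only a rough sketch under some assumptions: that the metadata JSON is a list of entries, that "entity_submitter_id" sits inside an "associated_entities" list (with a fallback in case it is at the top level), and that the two digits after the third hyphen follow the usual TCGA sample type convention (01 to 09 tumor, 10 to 19 normal). The function names are my own, and the inline sample stands in for the downloaded file.

```python
import json


def barcode_by_file(metadata):
    """Map each file_name to its TCGA barcode (entity_submitter_id)."""
    mapping = {}
    for entry in metadata:
        # Assumption: the barcode is nested under "associated_entities";
        # fall back to a top-level "entity_submitter_id" if it is not.
        entities = entry.get("associated_entities") or []
        if entities:
            barcode = entities[0].get("entity_submitter_id")
        else:
            barcode = entry.get("entity_submitter_id")
        if barcode:
            mapping[entry["file_name"]] = barcode
    return mapping


def sample_type(barcode):
    """Classify a barcode by the two-digit sample code in its 4th field."""
    code = int(barcode.split("-")[3][:2])
    if 1 <= code <= 9:
        return "tumor"
    if 10 <= code <= 19:
        return "normal"
    return "other"


# Inline stand-in for the downloaded metadata file. With a real download
# you would use json.load() on the opened file instead (the file name
# depends on your download).
example = json.loads("""
[
  {
    "file_name": "unc.edu.a2b8f7d7-3e9b-4e7f-9198-fb1d75a2eb0a.2563597.rsem.genes.normalized_results",
    "associated_entities": [
      {"entity_submitter_id": "TCGA-3H-AB3U-01A-21R-A40A-07"}
    ]
  }
]
""")

for name, barcode in barcode_by_file(example).items():
    print(name, barcode, sample_type(barcode), sep="\t")
```

For the file above this reports a tumor sample, since the sample code in TCGA-3H-AB3U-01A-21R-A40A-07 is 01; a barcode of the form TCGA-xx-xxxx-11A would come out as normal.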

Hope this helps!


It's excellent help.

Thanks
