Getting subtypes of cancer from TCGA
8.4 years ago

I am working with renal cancer RNA-Seq data that I downloaded from the GDC portal, and I want to know the subtype associated with each sample. I looked through the clinical metadata, the biospecimen metadata, and even some of the individual XML files, but I did not find any field corresponding to subtype. https://tcga-data.nci.nih.gov/tcgafiles/ftp_auth/distro_ftpusers/anonymous/tumor/ is down; otherwise I might have been able to get the subtypes from there. I also tried the R package TCGAbiolinks, but the total number of cases it retrieves is far smaller than what I downloaded manually from the GDC portal. Where can I retrieve the subtype of each sample? Thanks in advance.

In addition, has the TCGA data website been permanently shut down and replaced by the GDC data portal? The GDC portal seems more extensive, with more samples.

RNA-Seq TCGA gdc • 4.3k views
8.4 years ago
aditi.qamra ▴ 270

Hi,

You can use the cBioPortal R package to extract the clinical data (http://www.cbioportal.org/cgds_r.jsp). The clinical data will also have the subtype information. You can follow the instructions listed on the webpage.


Thanks, but as I mentioned, the link above also gives fewer samples than the GDC portal. cBioPortal contains data from the old TCGA data portal and perhaps has not been updated.

8.4 years ago
Mike ★ 1.9k

You can also try RTCGAToolbox (https://www.bioconductor.org/packages/devel/bioc/html/RTCGAToolbox.html) and TCGA-Assembler (http://www.compgenome.org/TCGA-Assembler/), but I'm not sure whether they are up to date.

Also keep in mind that the clinical data contains only cancer/tumor sample information, whereas the expression data contains both types of samples (normal and cancer), so the total number of samples in the clinical data is lower. Count both types of samples and check where the numbers differ and by how much.
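One way to do that count is to read the sample type from the TCGA barcode itself. Below is a minimal Python sketch, assuming your expression files are labelled with standard sample-level barcodes, where the fourth dash-separated field starts with a two-digit sample-type code (01 to 09 tumor, 10 to 19 normal, 20 to 29 control). The barcodes in the example and the function names `sample_class`, `patient_id` and `summarize` are made up for illustration; they are not part of any TCGA tool.

```python
from collections import Counter


def sample_class(barcode):
    """Classify a sample-level TCGA barcode as tumor, normal, or other."""
    parts = barcode.split("-")
    if len(parts) < 4:
        raise ValueError(f"not a sample-level barcode: {barcode!r}")
    # The fourth field looks like "01A": two-digit sample type plus vial letter.
    code = int(parts[3][:2])
    if 1 <= code <= 9:
        return "tumor"
    if 10 <= code <= 19:
        return "normal"
    return "other"  # e.g. 20-29 are control samples


def patient_id(barcode):
    """Return the patient-level part of the barcode (first three fields)."""
    return "-".join(barcode.split("-")[:3])


def summarize(barcodes):
    """Count samples per class and collect the distinct patients."""
    counts = Counter(sample_class(b) for b in barcodes)
    patients = {patient_id(b) for b in barcodes}
    return counts, patients


# Hypothetical barcodes, only to show the expected shape of the input.
example = [
    "TCGA-XX-0001-01A-11R-A000-07",
    "TCGA-XX-0001-11A-11R-A000-07",
    "TCGA-XX-0002-01A-11R-A000-07",
]
counts, patients = summarize(example)
print(dict(counts), len(patients))
```

Comparing the number of distinct patients here with the number of rows in the clinical table should show whether the gap comes from normal samples or from cases that are genuinely missing in one source.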


Thanks @Mike, but the toolbox is out of service. As for the clinical and expression data, I am aware that the expression profiles contain both normal and tumor data from the same patient. Surprisingly, I looked at the Broad Institute data, and it contains exactly the same number of samples as the GDC portal. But how can I find the subtype information there?

