Cankut CUBUK
Computational Genomics Program - Systems Genomics Lab
Centro de Investigación Príncipe Felipe (CIPF)
C/ Eduardo Primo Yúfera nº3
46012 Valencia, Spain http://bioinfo.cipf.es
According to the description file, these should be Entrez/LocusLink gene IDs.
For instance, the first one is LOC100130426, a hypothetical locus. This may explain why many don't have HGNC names. Check out the description in the workflow.
---snip---
File: *.trimmed.annotated.gene.quantification.txt
gene: This is the Entrez/LocusLink gene symbol followed by the Entrez/LocusLink gene ID.
raw_counts: The number of reads mapping to this gene.
median_length_normalized: This is the total aligned bases to all transcript models associated with this gene divided by the mean transcript length.
RPKM: See the DESCRIPTION.txt file in the mage-tab bundle for information on how this is calculated.
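If it helps, here is a minimal Python sketch for pulling the Entrez ID out of the gene column described above. It assumes the symbol and the ID are joined by a "|" character (the description does not state the separator, so check your own file first). The function name `split_gene_field` and the two sample rows are made up for illustration.

```python
import csv
import io


def split_gene_field(field, sep="|"):
    """Split a 'symbol<sep>entrez_id' value into (symbol, entrez_id).

    Assumption: the symbol and the Entrez ID are joined by '|'.
    If the separator is absent, the whole field is returned as the
    symbol and the ID is None.
    """
    symbol, found, entrez_id = field.rpartition(sep)
    if not found:
        return field, None
    # IDs are kept as strings so nothing is lost if one is not purely numeric.
    return symbol, entrez_id


# Made-up, tab-separated excerpt using the column names from the description.
sample = (
    "gene\traw_counts\tmedian_length_normalized\tRPKM\n"
    "LOC100130426|100130426\t0\t0\t0\n"
    "A1BG|1\t10\t5\t1\n"
)

reader = csv.DictReader(io.StringIO(sample), delimiter="\t")
entrez_ids = [split_gene_field(row["gene"])[1] for row in reader]
print(entrez_ids)
```

For a real file, replace `io.StringIO(sample)` with an open file handle on your *.gene.quantification.txt file.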
Thanks for the solution, Ryan, but the links that you posted are broken now. Could you please update them?
Since the "TCGA Data Portal is no longer operational", where can we find the mapping from TCGA gene IDs to Entrez Gene IDs?
To be specific, I'm working with the BRCA dataset and would like to get the Entrez IDs for my corresponding TCGA IDs.
I want to know this as well. Will find out for you.
What cancer type and which files specifically?
For example: sample TCGA-A6-2683-01
I have another question.
Some of the gene_IDs have the string extension "_calculated". What does it mean?
Example:
Cheers
Please post this as a new question rather than adding it as an answer to a year-old question.
OK, I will do that, thanks.