2.3 years ago
Rob
Hi all,
I am following the Python documentation to download the CPTAC data for ccRCC, but I can't find anything about the type of data. I want to know what kind of normalization was applied to the data; the documentation says nothing about it. This is the workflow I followed for downloading the data. Does anyone know what type of data this returns?
As noted here the transcriptomic data is harmonized using GDC pipelines: https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/Expression_mRNA_Pipeline/
What does "harmonized" mean for this data? I have read all over the portal, and it is still not clear to me what changes they made to the data. Is it a form of normalization?
Did you miss this: https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/Expression_mRNA_Pipeline/#mrna-expression-transformation
Thanks. I did not miss this, but nowhere does it say which type of normalization was applied to the data file we get as a whole. It only explains the types of normalization found in each file separately. The file I get has just one type of normalization, and that is the source of my confusion.
All the normalization methods are in one file for GENCODE v36. They were in separate files in the older v22 version.
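To see which normalizations a given file actually carries, one option is to list its value columns. The sketch below is only illustrative: it assumes the tab-separated layout I recall for the GDC v36 STAR-count files (a leading `#` comment line, summary rows starting with `N_`, and columns such as `unstranded`, `tpm_unstranded`, `fpkm_unstranded`, `fpkm_uq_unstranded`). The sample text, gene IDs, values, and the helper names `read_expression_columns` and `pick` are made up for the example, so check the header of your own file before relying on these names.

```python
import csv

# Made-up excerpt in the layout ASSUMED for GDC v36 STAR-count files.
# Gene IDs and numbers are illustrative only, not real data.
SAMPLE = (
    "# gene-model: GENCODE v36\n"
    "gene_id\tgene_name\tgene_type\tunstranded\tstranded_first\t"
    "stranded_second\ttpm_unstranded\tfpkm_unstranded\tfpkm_uq_unstranded\n"
    "N_unmapped\t\t\t100\t100\t100\t\t\t\n"
    "GENE_A\tA\tprotein_coding\t10\t5\t5\t1.5\t0.5\t0.6\n"
    "GENE_B\tB\tprotein_coding\t20\t10\t10\t3.0\t1.0\t1.2\n"
)

ID_COLUMNS = ("gene_id", "gene_name", "gene_type")


def read_expression_columns(text):
    """Return (value column names, gene rows) from a STAR-count style table."""
    # Drop comment lines such as "# gene-model: ..."
    lines = [ln for ln in text.splitlines() if not ln.startswith("#")]
    reader = csv.DictReader(lines, delimiter="\t")
    # Skip alignment summary rows (N_unmapped, N_multimapping, ...)
    rows = [r for r in reader if not r["gene_id"].startswith("N_")]
    value_columns = [c for c in reader.fieldnames if c not in ID_COLUMNS]
    return value_columns, rows


def pick(rows, column):
    """Extract one measurement (e.g. TPM) as a gene_id -> float mapping."""
    return {r["gene_id"]: float(r[column]) for r in rows}


columns, rows = read_expression_columns(SAMPLE)
print(columns)                       # which count/normalization columns exist
print(pick(rows, "tpm_unstranded"))  # choose the one you want to analyze
```

If the table you downloaded has only a single value per gene, comparing a few genes against the columns of the corresponding GDC file is one way to work out which of these measures it corresponds to.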