Hello,
I have TCGA breast cancer RNAseq V2 data and I would like to find the subtypes using the PAM50 gene set. After reading multiple posts, I'm still confused about the process.
It seems like the following is recommended for my problem.
library(genefu)
data(pam50)
PAM50Preds <- intrinsic.cluster.predict(sbt.model = pam50, data = dataset, annot = dannot, do.mapping = TRUE, verbose = TRUE)
table(PAM50Preds$subtype)
However, I have the following questions.
- Was the pam50 model trained on microarray data, and does it therefore need to be refitted for RNA-seq data?
- Because I have new data, do I need to fit the model with intrinsic.cluster before running the prediction?
Basically, I want to check whether I can simply plug my data into the model from genefu and predict, or whether some step needs to come before that.
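For anyone following along, here is a minimal sketch of getting an RNA-seq table into the shape that call expects: samples in rows, genes in columns, plus an annotation table with an EntrezGene.ID column whose row names match the data columns. The toy values, the "SYMBOL|EntrezID" row-name form, the log2(x + 1) transform and the helper name prepare_genefu_input are all assumptions made for illustration. It does not settle whether the model needs refitting.

```r
# Toy stand-in for a gene-level RNA-seq table: genes in rows, samples in columns.
# Row names are assumed to look like "SYMBOL|EntrezID".
expr <- matrix(
  c(1500,   20, 300,
      40, 2500, 800,
     900,  950, 100),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ESR1|2099", "ERBB2|2064", "MKI67|4288"),
                  c("sampleA", "sampleB", "sampleC"))
)

# Hypothetical helper: builds the data matrix and annotation table.
prepare_genefu_input <- function(expr) {
  ids    <- strsplit(rownames(expr), "|", fixed = TRUE)
  symbol <- vapply(ids, `[`, character(1), 1)
  entrez <- vapply(ids, `[`, character(1), 2)

  # Annotation: one row per gene, row names identical to the data column names.
  annot <- data.frame(
    probe         = rownames(expr),
    Gene.Symbol   = symbol,
    EntrezGene.ID = as.integer(entrez),
    row.names     = rownames(expr),
    stringsAsFactors = FALSE
  )

  # Log-transform (an assumption) and transpose to samples x genes.
  dat <- t(log2(expr + 1))
  list(data = dat, annot = annot)
}

inp <- prepare_genefu_input(expr)
dim(inp$data)

# With the full gene set, the result could then be passed on as:
# intrinsic.cluster.predict(sbt.model = pam50, data = inp$data,
#                           annot = inp$annot, do.mapping = TRUE)
```

The genefu call is left commented out because three genes are only enough to show the layout, not to produce a meaningful subtype call.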
Thanks
I've been thinking about this recently. For your information, here is what I've found:
PAM50 was actually first trained on qRT-PCR data (http://ascopubs.org/doi/full/10.1200/JCO.2008.18.1370). That paper also seems to have used microarray data, together with the qRT-PCR data, to cluster 189 breast tumors across 1,906 “intrinsic” genes.
The 2015 TCGA breast cancer paper (http://www.cell.com/cell/abstract/S0092-8674(15)01195-2) adjusted the RNA-Seq data first and then applied PAM50; this is described in the SI.
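I have not reproduced the exact adjustment from the SI here. Purely as an illustration of what an adjustment before centroid-based subtyping can look like, below is a sketch of gene-wise median centering, which is often applied in this setting. Whether it matches the paper's procedure is an assumption, and the function name and toy values are made up.

```r
# Hypothetical helper: subtract each gene's median across samples.
# dat: numeric matrix with samples in rows and genes in columns.
median_center <- function(dat) {
  gene_medians <- apply(dat, 2, median)
  sweep(dat, 2, gene_medians, FUN = "-")
}

# Toy log-expression values (illustrative only).
toy <- matrix(
  c(1, 2, 3,
    10, 20, 60),
  nrow = 3,
  dimnames = list(c("s1", "s2", "s3"), c("geneA", "geneB"))
)

centered <- median_center(toy)
apply(centered, 2, median)  # every gene now has median 0
```

Note that the result depends on which samples are in the cohort, since the medians are computed from the data at hand.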
The 2012 breast cancer paper (Cancer Genome Atlas, 2012), however, used microarray data, and I found that roughly 10% of the subtype calls are inconsistent between the two. I contacted Dr Perou, and he commented that there are always inconsistencies in subtyping between different platforms, which surprised me.
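If you want to run the same kind of comparison yourself, a rough sketch is below. It assumes you have two named vectors of subtype calls keyed by sample ID. The labels and IDs here are made up and are not the TCGA calls.

```r
# Made-up subtype calls from two platforms, named by sample ID.
calls_array  <- c(s1 = "LumA", s2 = "LumB", s3 = "Basal", s4 = "Her2", s5 = "LumA")
calls_rnaseq <- c(s1 = "LumA", s2 = "LumA", s3 = "Basal", s4 = "Her2",
                  s5 = "LumA", s6 = "Normal")

# Compare only the samples that were typed on both platforms.
shared <- intersect(names(calls_array), names(calls_rnaseq))

# Cross-tabulation shows which subtypes get swapped for which.
confusion <- table(microarray = calls_array[shared],
                   rnaseq     = calls_rnaseq[shared])

# Fraction of shared samples whose calls differ.
discordance <- mean(calls_array[shared] != calls_rnaseq[shared])

print(confusion)
discordance
```

The cross-tabulation is usually more informative than the single percentage, since it shows whether the disagreements cluster between neighbouring subtypes.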
Hey,
Thanks a lot for this information, it's really useful.
I contacted Prof Charles Perou but haven't yet received a response. I was wondering if you could please share the PAM50 classification for the whole BRCA cohort? It would be greatly appreciated! All I'd need is a list of IDs and PAM50 subtypes, so it should be a tiny file.
Many thanks in advance! A
For anyone interested, the PAM50 classification is reported in the Supplementary Data of the 2015 TCGA breast cancer paper. An even more comprehensive list of PAM50 classifications for TCGA breast cancer patients comes from this paper, doi.org/10.1016/j.ccell.2018.03.014, and can be retrieved with the R package TCGAbiolinks.
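A rough sketch of pulling that table, assuming TCGAbiolinks is installed and that TCGAquery_subtype(tumor = "brca") returns a table with a patient ID column and a column whose name contains "PAM50". The helper extract_pam50 and the default "patient" column name are my own assumptions, so check colnames() on the real result.

```r
# Hypothetical helper: keep the ID column (if present) and any PAM50 column.
extract_pam50 <- function(subtypes, id_col = "patient") {
  pam_cols <- grep("PAM50", colnames(subtypes), ignore.case = TRUE, value = TRUE)
  keep <- c(intersect(id_col, colnames(subtypes)), pam_cols)
  subtypes[, keep, drop = FALSE]
}

# Only runs when the package is available.
if (requireNamespace("TCGAbiolinks", quietly = TRUE)) {
  brca <- as.data.frame(TCGAbiolinks::TCGAquery_subtype(tumor = "brca"))
  pam50_calls <- extract_pam50(brca)
  print(head(pam50_calls))
}
```

Matching the column by pattern rather than by an exact name is deliberate, since the exact column names may differ between package versions.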