I have UNC Illumina RNAseqV2 data for about 100 genes and 800 patients, identified by UNC ID. I'd like to find the subtype of each tumor (normal, luminal A, luminal B, basal, HER2) for a classifier. Preferably the subtypes would be listed by UNC ID, but if only the TCGA barcode is provided I believe it's possible to match them up. I can't find this on the TCGA website; maybe I'm just looking in the wrong places.
Does that mean PAM50 is not a stable test when run on RNA-seq data?
@kanwarjag: I assume your answer is meant as a comment on my answer above; if so, please leave it as a comment next time rather than posting it as an answer to the question. What I mean by "somewhat different results for edge case tumors" is that, in practice, when you use genefu to assign PAM50 classes, the assignments are contingent on the centroid values used for the individual subtypes. For most tumors the assignment is robust to small variations in these values, but in my experience about 10% of tumors are sensitive to small variations in these parameters, and it's hard to get agreement on them. This is not in any way related to RNAseq or the choice of technology; it's a function of how the test is performed.
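To make the point about centroid sensitivity concrete, here is a rough toy sketch in Python. It is not genefu and does not use the real PAM50 centroids: the centroids and samples are random numbers, the assignment rule is assumed to be "pick the subtype whose centroid has the highest Spearman correlation with the sample", and the names `assign` and `flipped_fraction` are made up for illustration. It only shows the mechanism: nudge the centroids slightly and count how many samples change class.

```python
import random

SUBTYPES = ["Basal", "Her2", "LumA", "LumB", "Normal"]

def ranks(values):
    # Rank positions (0-based); ties are not handled, which is fine for random floats.
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    for rank, idx in enumerate(order):
        r[idx] = float(rank)
    return r

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    # Spearman correlation is Pearson correlation on the ranks.
    return pearson(ranks(x), ranks(y))

def assign(sample, centroids):
    # Assumed rule: the subtype whose centroid correlates best with the sample.
    return max(centroids, key=lambda s: spearman(sample, centroids[s]))

def flipped_fraction(n_genes=50, n_samples=200, noise=0.1, seed=0):
    rng = random.Random(seed)
    # Toy centroids: random values, NOT the published PAM50 centroids.
    centroids = {s: [rng.gauss(0, 1) for _ in range(n_genes)] for s in SUBTYPES}
    # Toy samples: a random blend of two centroids plus noise, so some sit near a boundary.
    samples = []
    for _ in range(n_samples):
        a, b = rng.sample(SUBTYPES, 2)
        w = rng.random()
        samples.append([w * ca + (1 - w) * cb + rng.gauss(0, 0.3)
                        for ca, cb in zip(centroids[a], centroids[b])])
    # Slightly perturbed centroids, standing in for a different centering or reference set.
    perturbed = {s: [v + rng.gauss(0, noise) for v in c] for s, c in centroids.items()}
    before = [assign(x, centroids) for x in samples]
    after = [assign(x, perturbed) for x in samples]
    return sum(b != a for b, a in zip(before, after)) / n_samples

if __name__ == "__main__":
    print("fraction of samples that changed class:", flipped_fraction())
```

The fraction this prints depends entirely on the toy settings, so it says nothing about the real proportion of unstable tumors. The point is only that samples close to a boundary between two centroids are the ones that flip, whatever platform produced the expression values.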