I want to look at two separate groups of TCGA-BRCA RNA-seq samples: one with a specific mutation and one without. So far I have downloaded the RNA-seq expression data for all the TCGA-BRCA samples and created a gct file. However, I am unsure how to create a cls file that designates each sample as either "wildtype" or "mutation". There are over 3000 samples in the gct file. I have a list of barcodes of the samples with the mutation, but I'm not sure how to use it to generate the cls file. Does anyone have any insight?
Hi Kevin, thanks for the summary. I was curious whether you know a way to assign a phenotype label to each sample automatically. I have a list of samples with the mutation and a list of samples without, but going through 3000 samples and assigning labels one by one seems tedious. Perhaps I could use an R script.
Oh, I see what you mean. Do you have a sample of the input data that you've got?
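In the meantime, here is a rough sketch of how the labelling could be scripted. It is in Python rather than R, but the same logic carries over. It rests on several assumptions, since I have not seen the actual files: the gct file is in the standard GCT 1.2 layout (two header lines, then a tab-separated row starting with Name and Description followed by the sample barcodes), the mutation list is a plain text file with one barcode per line, and barcodes are matched at the patient level (the first three dash-separated fields, e.g. TCGA-XX-XXXX). The function names and file names are made up for illustration.

```python
def read_gct_sample_names(gct_path):
    """Return the sample names from the third line of a GCT 1.2 file."""
    with open(gct_path) as handle:
        handle.readline()  # version line, e.g. "#1.2"
        handle.readline()  # dimensions line: <n_rows> <n_samples>
        header = handle.readline().rstrip("\r\n").split("\t")
    # The first two columns are Name and Description; the rest are samples.
    return header[2:]


def patient_id(barcode):
    """Reduce a TCGA barcode to its patient-level part (assumed match key)."""
    return "-".join(barcode.strip().upper().split("-")[:3])


def build_cls_lines(sample_names, mutated_barcodes,
                    mutant_label="mutation", wildtype_label="wildtype"):
    """Build the three lines of a categorical CLS file, in GCT column order."""
    mutated_patients = {patient_id(b) for b in mutated_barcodes if b.strip()}
    labels = [
        mutant_label if patient_id(name) in mutated_patients else wildtype_label
        for name in sample_names
    ]
    # List class names in order of first appearance so that line 2 and
    # line 3 of the CLS file agree on which class comes first.
    class_names = list(dict.fromkeys(labels))
    return [
        f"{len(labels)} {len(class_names)} 1",
        "# " + " ".join(class_names),
        " ".join(labels),
    ]


def write_cls(gct_path, barcode_list_path, cls_path):
    """Read the GCT header and a barcode list, then write the CLS file."""
    with open(barcode_list_path) as handle:
        mutated_barcodes = handle.read().splitlines()
    lines = build_cls_lines(read_gct_sample_names(gct_path), mutated_barcodes)
    with open(cls_path, "w") as handle:
        handle.write("\n".join(lines) + "\n")


# Example call (hypothetical file names):
# write_cls("brca_expression.gct", "mutated_barcodes.txt", "brca_mutation.cls")
```

One caveat with patient-level matching: if the gct file also contains normal-tissue samples (sample type code 11 in the fourth barcode field), those would be labelled "mutation" along with the tumour sample from the same patient, so it may be worth filtering them out first or matching on a longer stretch of the barcode.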