Hello,
I'm analyzing TCGA breast cancer data to classify the samples into their respective subtypes, and then to check whether the genes in our study have a subtype-specific pattern of expression. To do this, I was advised to use genefu. At the step of classifying the subtypes:
PAM50Base <- molecular.subtyping(sbt.model = "pam50", data = data, annot = annot, do.mapping = FALSE)
I get an error -
Error in intrinsic.cluster.predict(sbt.model = pam50.robust, data = data, :
no probe in common -> annot or mapping parameters are necessary for the mapping process!
In the command, annot is the file used for annotation and is of the format -
probe EntrezGene.ID Gene.ID Gene.Symbol
Data refers to the input file, which is of the format -
Gene.Symbol Sample1 SAMPLE2 ... Sample 1092
Both files are tab delimited. I want to know if anyone has done this before, and if the file formatting is correct?
P.S. I have tried using both Gene.Symbol and probe in the data file, but both give the same error.
Edit: Should my data file also contain the EntrezGene.ID column?
Thank you.
Hi vinayjrao,
Try using do.mapping=TRUE in the molecular.subtyping() function.
Matina
Hi Matina,
Thank you for the suggestion, but do.mapping=T gave me the following error:
Error in data1[, gg.uniq, drop = FALSE] : subscript out of bounds
Could you also explain how do.mapping affects the run?
Thanks.
Hi vinayjrao,
I think the problem is with the data matrix; the molecular.subtyping function expects a matrix of samples (rows) x genes (columns). As far as I can see above, your data matrix is genes x samples, right? Try transposing the matrix. Let me know if that solves your problem!
From the genefu vignette for do.mapping:
TRUE if the mapping through Entrez Gene ids must be performed (in case of ambiguities, the most variant probe is kept for each gene)
Matina
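For anyone unsure what the transposing step looks like, here is a minimal base R sketch. It assumes the input is a genes x samples table with gene symbols in the first column, as described in the question; the toy values and the object names (expr_df, expr_mat, data_t) are made up for illustration and are not part of genefu.

```r
# Toy genes x samples table standing in for the tab-delimited input file.
# The expression values are invented purely for illustration.
expr_df <- data.frame(
  Gene.Symbol = c("ESR1", "ERBB2", "MKI67"),
  Sample1 = c(2.1, 0.4, 1.3),
  Sample2 = c(0.2, 3.5, 2.8),
  stringsAsFactors = FALSE
)

# Move the gene symbols into the row names so the matrix is purely numeric.
expr_mat <- as.matrix(expr_df[, -1])
rownames(expr_mat) <- expr_df$Gene.Symbol

# Transpose: rows become samples, columns become genes.
data_t <- t(expr_mat)

dim(data_t)       # number of samples, number of genes
colnames(data_t)  # gene identifiers, now in the columns
```

With a real file, expr_df would presumably come from something like read.delim() on the tab-delimited table, and data_t is what would be passed as the data argument.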
Dear Matina,
I tried transposing the data matrix. It still gave me the same error as the first time. Would it be helpful if I shared the script with you?
Thanks.
Edit: I was looking into the column names, and there was an error on my part. Transposing the data helped. Thank you very much for all the help :)
Great! Happy to help! I'll post it as an answer then.
@vinayjrao I am having the same error. Can you elaborate on how transposing the data solved your problem? Thanks
In my case, I encountered the same error message when the column names were Entrez gene IDs. When I converted the column names to gene symbols, the molecular.subtyping function worked.
In my case, the problem was resolved when I converted column names into gene symbols (in the transposed matrix). Hope that helps some of you who encounter the same error message after transposing the matrix.
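A quick sanity check along these lines can be run before calling molecular.subtyping. This is only a sketch under assumptions: the toy matrix and annotation table below are invented, the object names (data_t, annot, shared, missing_in_annot) are hypothetical, and the annotation layout simply follows the columns listed in the question.

```r
# Toy samples x genes matrix (values invented for illustration).
data_t <- matrix(
  c(2.1, 0.2, 0.4, 3.5, 1.3, 2.8), nrow = 2,
  dimnames = list(c("Sample1", "Sample2"), c("ESR1", "ERBB2", "MKI67"))
)

# Toy annotation table using the column names from the question.
annot <- data.frame(
  probe = c("ESR1", "ERBB2", "MKI67"),
  EntrezGene.ID = c(2099, 2064, 4288),
  Gene.Symbol = c("ESR1", "ERBB2", "MKI67"),
  stringsAsFactors = FALSE
)
rownames(annot) <- annot$probe

# How many column names of the data are gene symbols known to the annotation?
shared <- intersect(colnames(data_t), annot$Gene.Symbol)
length(shared)

# Columns of the data that have no matching row in the annotation.
missing_in_annot <- setdiff(colnames(data_t), rownames(annot))
missing_in_annot

# With real data, the call from this thread would then look like:
# PAM50Base <- genefu::molecular.subtyping(sbt.model = "pam50", data = data_t,
#                                          annot = annot, do.mapping = FALSE)
```

If length(shared) is zero, the column names are probably still Entrez IDs or sample names, which would be consistent with the "no probe in common" error.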
Hi vinayjrao,
I have exactly the same problem as you, and I have not managed to resolve it by transforming the data. Can you please share the annotation file with me? I don't understand what I am doing wrong.
Best, Linnea