2.9 years ago
bookorg
Hi, I ran into a problem converting the gene symbols in my .csv table to Entrez IDs. My .csv file has the columns SYMBOL, baseMean, log2FoldChange, lfcSE, stat, pvalue, and padj. I analyzed a GEO data set and found 99 DEGs. Now I want to run a functional enrichment analysis on those DEGs. For that, I first have to convert my gene symbols to Entrez IDs, so I wrote the code below:
df$EntrezID <- mapIds(x = org.Hs.eg.db,
keys=row.names(df),
column="ENTREZID",
keytype="SYMBOL",
multiVals="first")
But when I run it, I get this output:
Error in .testForValidKeys(x, keys, keytype, fks) :
None of the keys entered are valid keys for 'SYMBOL'. Please use the keys method to see a listing of valid arguments.
Kindly help me in this regard. Thanks in advance.
When posting questions about ID/Symbol please provide examples.
I am not clear about the example. What type of example do you want me to give? I gave all the details and the error that occurred.
Give us examples of the gene IDs that are generating the error, e.g. BRCA
SYMBOL: ORF1a ORF1b N S ORF6 ORF3a ORF7a M E ORF8 RNU1-28P ZFAND5 IGHV4-59 FNIP1 IGLV3-25 RAMP3 GLUL SNORD116-2 CLEC3B ANKRD36BP2 IGHG4 IGHM IGLV1-47 GADD45A ZNF638-IT1 HSPA1L IGHG3 IGLV3-21 IGLV4-69 RRM2B RNVU1-7 IGHV4-34 NUDT16 ITM2C MIR205HG SNORA38B IGLL5 SNORD17 LAG3 IGLV1-40 IFI27L2 RAMP2 IGHV1-18 GPCPD1 H2AFY ICAM2 IGKV4-1 SNORA13 CLK4 IGHV1-46 MMRN2 HIST1H2BF APLNR KIFAP3 CTA-796E4.5 HSP90AA2P IGHA1 RTKN2 SP140 HELLPAR RPA3 APOL4 BCL2L2 IGHV5-51 LMAN2 TMEM19 IGHV3-74 RP1-309I22.2 AL355075.1 AC068580.6 PDZD8 SELK IGKV1-5 IGLC3 AKAP2 CRIP1 IGHD IGLC2 HSH2D IGJ SCARNA6 C1orf226 MZB1 AL139099.2 BPIFB1 HIST1H2BN ST6GAL1 IGLV1-44 POU2AF1 RP4-671O14.7 UBE2G2 RLIM BANK1 EHBP1L1 GZMA IGLV1-51 PABPC4 PLIN2 PSMB10
These are the 99 genes that give the error.
I am going to give you a solution using EntrezDirect. I assume you are referring to gi numbers when you say EntrezID. If that is not the case then let us know. BTW, gi numbers are deprecated for end-users by NCBI. Put one gene symbol per line in a file. Example below.
It seems to me that there is something wrong with row.names(df). Could you show the top 10 items from that vector? Something like rownames(df)[1:10].
My genes are in the .csv file, in a column I named SYMBOL. There are also 6 other columns: log2fc, p.value, adj.p.val, base mean, lfcSE, and stat.
The function complains about the provided keys. You have indicated that the row names of the data frame contain the gene symbols, which I guess is not true. If you pass df$SYMBOL to the keys argument, you might be good.
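A minimal sketch of that change, assuming the symbols are stored in a column named SYMBOL as described in the thread. The three-row data frame is a hypothetical stand-in, and the requireNamespace guard is only there so the snippet still runs where the Bioconductor packages are not installed.

```r
# Hypothetical stand-in for the poster's table (values are made up).
df <- data.frame(
  SYMBOL = c("ZFAND5", "GLUL", "RAMP3"),
  log2FoldChange = c(1.8, -2.1, 2.4),
  stringsAsFactors = FALSE
)

# Take the keys from the SYMBOL column, not from the row names.
keys <- as.character(df$SYMBOL)

if (requireNamespace("AnnotationDbi", quietly = TRUE) &&
    requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  # Same call as in the question, with only the keys argument changed.
  df$EntrezID <- AnnotationDbi::mapIds(
    x = org.Hs.eg.db::org.Hs.eg.db,
    keys = keys,
    column = "ENTREZID",
    keytype = "SYMBOL",
    multiVals = "first"
  )
} else {
  # Fallback so the sketch runs without the annotation packages.
  df$EntrezID <- NA_character_
}

print(df)
```

As long as at least one key is valid, mapIds returns NA for symbols it cannot find instead of raising the error above, so it is worth checking the result for NA entries before running the enrichment analysis.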