I have performed a differential protein expression analysis using Progenesis QI and Proteome Discoverer. The differentially expressed proteins identified were used as the input for the Ensembl BioMart and clusterProfiler packages in R to identify enriched pathways in my dataset. The final output is a file listing the enriched pathways and the genes in those pathways that are present in my original dataset. An example of the data for 2 of the pathways is below:
ID Description GeneRatio BgRatio pvalue p.adjust qvalue geneID
cge01200 Carbon metabolism 42/633 125/9112 8.83E-19 2.58E-16 2.16E-16 100761944/100767893/100763194/100767060/100736557/100758970/100767929/100765199/103158535/100751753/100689437/100771169/100765605/100771246/100689344/100760749/100774853/100755202/100760751/100762838/100771311/100773774/100750732/100689468/100772205/100774058/100773493/100757947/100764352/100770936/100771661/100754714/100689467/100762127/100762297/100765530/100765413/100774097/100759948/100770052/100754996/100764862
cge00100 Steroid biosynthesis 7/633 20/9112 0.0002 0.003 0.0029 100767580/100771700/100769920/100754784/100764429/100689192/100753498
What I would like to be able to do is list the genes individually so that I can link each geneID with its gene symbol and protein.
The code for the clusterProfiler analysis is:
> library(clusterProfiler)
> enrichedpaths_v7 <- enrichKEGG(gene = v7$NCBI_gene_ID,
> organism = 'cge',
> pvalueCutoff = 0.05)
I tried using the dplyr package and separate() as follows:
> library ("tidyverse")
> library ("dplyr")
> as.character(enrichedpaths_v7$geneID)
> result_KEGG_analysis <- separate(enrichedpaths_v7, 9, into = paste("geneID", 1:50, sep = "/"))
I also tried
> result_KEGG_analysis<- separate (enrichedpaths_v7, 9, into = c("geneID", 50), sep = "9")
In both cases I get the following error
Error in UseMethod("separate_") :
no applicable method for 'separate_' applied to an object of class "enrichResult"
All suggestions/solutions gratefully accepted.
Peter
Can you try explicitly specifying which separate to use (in the style tidyr::separate(...))? Also, there is no need to load dplyr after loading tidyverse; the latter automatically attaches the former. Plus, there is no separate() in dplyr.
I think you'd benefit from using tidyr::separate_rows.
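To make the separate_rows suggestion concrete, here is a minimal sketch on a toy data frame. It assumes the enrichResult object has first been converted to a plain data frame, for example with as.data.frame(enrichedpaths_v7), which works in the clusterProfiler versions I am aware of but is worth checking on your installation. The name kegg_df and the shortened gene lists are made up for illustration; the IDs are taken from the example rows in the question.

```r
library(tidyr)

# Toy stand-in for as.data.frame(enrichedpaths_v7) (assumed conversion step).
# Gene lists are shortened; IDs come from the example rows in the question.
kegg_df <- data.frame(
  ID = c("cge00100", "cge01200"),
  Description = c("Steroid biosynthesis", "Carbon metabolism"),
  geneID = c("100767580/100771700/100769920", "100761944/100767893"),
  stringsAsFactors = FALSE
)

# Split only the geneID column on "/", producing one row per gene.
# The other columns are repeated for every gene of the same pathway.
kegg_long <- separate_rows(kegg_df, geneID, sep = "/")
print(kegg_long)
```

Naming geneID explicitly matters here, because GeneRatio and BgRatio in the real table also contain "/" and should not be split.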
Hi, thanks for that suggestion. It didn't work, but it got me thinking about the error message in a different way. I modified the code as follows
and now I get
Error: All nested columns must have the same number of elements.
My problem is that I can't change the number of elements in each column as this is a direct result of the analysis. Any suggestions?
tidyr::separate_rows() properly. Merely copy-pasting code from one function to another is not how functionality is achieved.
Hi
tidyr::separate_rows(). I am a beginner with R and bioinformatics and am trying to teach myself. However, it seems to me that I need to overcome the fact that one row lists 42 genes while a second row lists only 7.
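On the worry about 42 genes in one row and 7 in another: in a long (one gene per row) layout the counts do not need to match, since each pathway simply contributes as many rows as it has genes. A rough base R sketch of that idea, using a made-up data frame called paths with shortened gene lists from the question:

```r
# Toy pathway table with an unequal number of genes per row (3 vs 2).
paths <- data.frame(
  ID = c("cge01200", "cge00100"),
  geneID = c("100761944/100767893/100763194", "100767580/100771700"),
  stringsAsFactors = FALSE
)

# strsplit() returns a list holding one character vector per pathway.
gene_list <- strsplit(paths$geneID, "/", fixed = TRUE)

# lengths() gives the gene count per pathway; rep() repeats each pathway ID
# that many times so that both columns have the same total length.
long <- data.frame(
  ID = rep(paths$ID, times = lengths(gene_list)),
  geneID = unlist(gene_list),
  stringsAsFactors = FALSE
)
print(long)
```

This is essentially what separate_rows does for a single column; splitting into a fixed set of columns (geneID1 to geneID50) is where unequal counts become awkward.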
as.character(enrichedpaths_v7$geneID) converts the geneID vector (column) to character type and returns it to the caller (the R process, which then prints it on the screen because it has not been asked to do anything else with it). To replace the column with the new vector, you need to assign it to the column like so: enrichedpaths_v7$geneID <- as.character(enrichedpaths_v7$geneID). This is a common paradigm of all pass-by-value architectures, in which the function call does not affect the variable being passed but affects just a copy of it. R follows this paradigm as far as I know.
Please read the documentation on tidyr::separate_rows and look at examples to understand how to use it.
Thanks for that. I didn't realise that about the character format bit.
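For anyone else who trips over the as.character() point, a small sketch of the difference between calling the function and assigning its result. It uses a plain data frame named df (hypothetical); I have not verified whether the same `$<-` assignment works directly on an enrichResult object, so converting to a data frame first is probably the safer route.

```r
# Plain data frame with a numeric geneID column (IDs from the question).
df <- data.frame(geneID = c(100767580, 100771700))

# Called without assignment: the converted vector is printed and discarded.
as.character(df$geneID)
class_before <- class(df$geneID)   # column type is unchanged

# Assigned back: the column itself is replaced by the character vector.
df$geneID <- as.character(df$geneID)
class_after <- class(df$geneID)

print(c(before = class_before, after = class_after))
```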