I'm using the pathfindR package to perform PSEA on a set of DE proteins from a proteomics experiment. If I understand the package correctly, one of the outputs should be a data frame, which I have called PSEA_results.
When I run the package, that data frame doesn't get created and I get the following error message:
Error in (function (filename = "Rplot%03d.png", width = 480, height = 480, : invalid 'filename'
This is a new one on me. My code is below.
library ("KEGGREST")
library ("KEGGgraph")
library ("org.Hs.eg.db")
library ("pathfindR")
library ("tidyverse")
library ("tidyr")
library ("dplyr")
library ("magrittr")
#remove all objects left over from previous runs
rm(list=ls())
#data summary needs pvalues added
#Data Summary of Progenesis Data, Enriched GO Terms and Enriched KEGG Pathways
attached_CHO_all_data <- read.csv("C:/PETER PROJECT/3d. Mass Spec Processing/Results of R Analysis/Data Summary/Effect of Culture Format/WCL Analysis/DE Cutoff equal 1.2/0% Attached v 0% Suspension/DE 1-2 0% Attached v Susp WCL Apr19 (CHO Database) - All Data.csv")
A_v_S_Comp_0rd <- read.csv("C:/PETER PROJECT/3d. Mass Spec Processing/Results of R Analysis/DE Normalised Abundance Data/Effect of Culture Format/WCL Analysis/DE Cutoff equal 1.2/0% Attached v 0% Suspension/DE 1-2 0% Attached v Susp Normalised Abundance Data WCL Apr19.csv")
colnames(A_v_S_Comp_0rd)
#find the average of each row across the attached-cell replicates (only A0b, A0c and A0d are used here)
Aver_A0 <- rowMeans(data.frame(A_v_S_Comp_0rd$A0b, A_v_S_Comp_0rd$A0c, A_v_S_Comp_0rd$A0d))
Aver_A0 <- data.frame(A_v_S_Comp_0rd$description, Aver_A0)
#find the average of each row relating to the 4 replicates of suspension cells
Aver_S0 <- rowMeans(data.frame(A_v_S_Comp_0rd$S0a, A_v_S_Comp_0rd$S0b, A_v_S_Comp_0rd$S0c, A_v_S_Comp_0rd$S0d))
Aver_S0 <- data.frame(A_v_S_Comp_0rd$description, Aver_S0)
#Divide each row of dataframe Aver_S0 by the corresponding row in dataframe Aver_A0 to get the fold change
FC <- cbind(Aver_S0[1],round(Aver_S0[-1]/Aver_A0[-1],3))
names(FC)[2] = "FC"
#Get Log2 of the S0/A0 fold change
Log2FC = log2(FC$FC)
#select desired columns
PSEA_1<- data.frame(A_v_S_Comp_0rd$description, Log2FC, A_v_S_Comp_0rd$Anova..p)
#split column 1 in the dataframe PSEA_1 into 2 separate columns using the phrase Refseq: as the separator and call the result PSEA_2
PSEA_2<- tidyr::separate (PSEA_1, 1, into=c("HGNC_Symbol", "Refseq"), sep = "Refseq:")
#further split column 1 in the dataframe PSEA_2 into 2 separate columns using = as the separator and call the result PSEA_3
PSEA_3<- tidyr::separate (PSEA_2, 1, into=c("gene", "HGNC_Symbol"), sep = "=")
#remove unwanted columns
PSEA_4 = subset(PSEA_3, select = -c(gene,Refseq))
PSEA_rd <- distinct(PSEA_4,HGNC_Symbol, .keep_all= TRUE)
#trim trailing white space from the gene symbols
PSEA_rd$HGNC_Symbol <- trimws(PSEA_rd$HGNC_Symbol, which = "right", whitespace = "[ \t\r\n]")
#rename columns and sort on Gene.Symbol
names(PSEA_rd)[1] <- "Gene.Symbol"
names(PSEA_rd)[2] <- "LogFC"
names(PSEA_rd)[3] <- "adj.P.Val"
PSEA_rd <- arrange(PSEA_rd, Gene.Symbol, .by_group = FALSE)
head(PSEA_rd)
str(PSEA_rd)
# PSEA_rd is the raw data file for PSEA; it is already in the workspace, so no data() call is needed
#set a directory to store the results in. The default directory is pathfindR_Results
getwd()
PSEA_results <- run_pathfindR(PSEA_rd, output_dir = ("C:/PETER PROJECT/3d. Mass Spec Processing/Results of R Analysis/pathfindR/Culture Format/WCL/0% Attach v 0% Susp"))
#write.csv (PSEA_results, file = "C:/PETER PROJECT/3d. Mass Spec Processing/Results of R Analysis/pathfindR/Effect of Culture Format/WCL Analysis/0% Attached v 0% Suspension/DE Cutoff equal 1.2/PSEA_Attached_0.csv")
The data frame PSEA_rd has over 1000 rows, so I don't propose to reproduce it here. However, so that people can see the types of data it contains, the output from str(PSEA_rd) is below.
'data.frame': 1171 obs. of 3 variables:
$ Gene.Symbol: chr "AAAS" "AACS" "AARS" "AATF" ...
$ LogFC : num -0.648 0.82 1.837 -0.492 -1.114 ...
$ adj.P.Val : num 7.15e-05 4.59e-03 4.71e-05 1.38e-02 1.00e-02 ...
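For anyone who wants to try this without my files, a small stand-in with the same structure can be built from the first four rows shown in the str() output above. This is only a sketch: PSEA_sample is a name made up for illustration, and four genes are far too few for a real enrichment run.

```r
# Hypothetical stand-in for PSEA_rd, built from the first four rows
# printed by str(PSEA_rd); the real data frame has 1171 rows.
PSEA_sample <- data.frame(
  Gene.Symbol = c("AAAS", "AACS", "AARS", "AATF"),
  LogFC       = c(-0.648, 0.82, 1.837, -0.492),
  adj.P.Val   = c(7.15e-05, 4.59e-03, 4.71e-05, 1.38e-02),
  stringsAsFactors = FALSE
)

# Confirm the column types match the real data frame
str(PSEA_sample)
```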
I did try one other thing. I ran
PSEA_results <- run_pathfindR(PSEA_rd)
as per the post "Enrichment Analysis, Clustering and Scoring with pathfindR". The result was the same error message, which hopefully rules out the directory structure I want to store the data in as the cause.
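One thing that second run may not rule out is the working directory itself, assuming the default output folder is created underneath it. As far as I can tell from the png() documentation, a % in the filename is read as the start of an integer page-number format (as in the default "Rplot%03d.png"), so a path containing a literal % might be rejected. The sketch below is a rough check for that. has_stray_percent is a helper made up for illustration, and its regular expression only approximates what png() accepts.

```r
# Hypothetical helper: TRUE if `path` contains a "%" that is not part of
# a single integer format such as "%03d". This approximates, and does not
# reproduce, the check png() applies to its filename argument.
has_stray_percent <- function(path) {
  s <- gsub("%%", "", path, fixed = TRUE)       # "%%" is an escaped literal %
  s <- sub("%[#0 ,+-]*[0-9.]*[diouxX]", "", s)   # allow one integer format
  grepl("%", s, fixed = TRUE)                   # anything left is suspect
}

# Check the current working directory and the output directory used above
has_stray_percent(getwd())
has_stray_percent("C:/PETER PROJECT/3d. Mass Spec Processing/Results of R Analysis/pathfindR/Culture Format/WCL/0% Attach v 0% Susp")
```

If either call returns TRUE, rerunning from a folder whose path has no % in it would show whether the path is the problem.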
Thanks in advance for any suggestions.