Running pathfindR package generates an error message
4.0 years ago
peter.berry5 ▴ 60

I'm using the pathfindR package to perform PSEA on a set of DE proteins from a proteomics experiment. If I understand the package correctly, one of the outputs should be a data frame, which I have called PSEA_results.

When I run the package, that data frame doesn't get created and I get the following error message:

Error in (function (filename = "Rplot%03d.png", width = 480, height = 480, : invalid 'filename'

This is a new one on me. My code is below.

library ("KEGGREST")
library ("KEGGgraph")
library ("org.Hs.eg.db")
library ("pathfindR") 
library ("tidyverse")
library ("tidyr")
library ("dplyr")
library ("magrittr")

#clear all existing objects from the workspace
rm(list=ls())

#data summary needs pvalues added

#Data Summary of Progenesis Data, Enriched GO Terms and Enriched KEGG Pathways
attached_CHO_all_data <- read.csv("C:/PETER PROJECT/3d. Mass Spec Processing/Results of R Analysis/Data Summary/Effect of Culture Format/WCL Analysis/DE Cutoff equal 1.2/0% Attached v 0% Suspension/DE 1-2 0% Attached v Susp WCL Apr19 (CHO Database) - All Data.csv")

A_v_S_Comp_0rd <- read.csv("C:/PETER PROJECT/3d. Mass Spec Processing/Results of R Analysis/DE Normalised Abundance Data/Effect of Culture Format/WCL Analysis/DE Cutoff equal 1.2/0% Attached v 0% Suspension/DE 1-2 0% Attached v Susp Normalised Abundance Data WCL Apr19.csv")
colnames(A_v_S_Comp_0rd)

#find the average of each row across the attached-cell replicates used here (A0b, A0c, A0d)
Aver_A0 <- rowMeans(data.frame(A_v_S_Comp_0rd$A0b, A_v_S_Comp_0rd$A0c, A_v_S_Comp_0rd$A0d))
Aver_A0 <- data.frame(A_v_S_Comp_0rd$description, Aver_A0)

#find the average of each row relating to the 4 replicates of suspension cells
Aver_S0 <- rowMeans(data.frame(A_v_S_Comp_0rd$S0a, A_v_S_Comp_0rd$S0b, A_v_S_Comp_0rd$S0c, A_v_S_Comp_0rd$S0d))
Aver_S0 <- data.frame(A_v_S_Comp_0rd$description, Aver_S0)

#Divide each row of dataframe Aver_S0 by the corresponding row in dataframe Aver_A0
FC <- cbind(Aver_S0[1],round(Aver_S0[-1]/Aver_A0[-1],3))
names(FC)[2] = "FC"

#Get log2 of the fold change (S0/A0)
Log2FC = log2(FC$FC)

#select desired columns
PSEA_1<-  data.frame(A_v_S_Comp_0rd$description, Log2FC, A_v_S_Comp_0rd$Anova..p)

#split column 1 of the dataframe PSEA_1 into 2 separate columns using the phrase "Refseq:" as the separator and call the result PSEA_2
PSEA_2<- tidyr::separate (PSEA_1, 1, into=c("HGNC_Symbol", "Refseq"), sep = "Refseq:")

#further split column 1 of the dataframe PSEA_2 into 2 separate columns using "=" as the separator and call the result PSEA_3
PSEA_3<- tidyr::separate (PSEA_2, 1, into=c("gene", "HGNC_Symbol"), sep = "=") 

#remove unwanted columns
PSEA_4 = subset(PSEA_3, select = -c(gene,Refseq))
PSEA_rd <- distinct(PSEA_4,HGNC_Symbol, .keep_all= TRUE)

#trim trailing white space from the gene symbols
PSEA_rd$HGNC_Symbol <- trimws(PSEA_rd$HGNC_Symbol, which = "right", whitespace = "[ \t\r\n]")

#rename columns and sort on Gene.Symbol
names(PSEA_rd)[1] <- "Gene.Symbol"
names(PSEA_rd)[2] <- "LogFC"
names(PSEA_rd)[3] <- "adj.P.Val"
PSEA_rd <- arrange(PSEA_rd, Gene.Symbol, by_group = FALSE)
head(PSEA_rd)
str(PSEA_rd)

# PSEA_rd is the raw data file for PSEA
data(PSEA_rd)

#set a directory to store the results in. The default directory is pathfindR_Results
getwd()

PSEA_results <- run_pathfindR(PSEA_rd, output_dir = ("C:/PETER PROJECT/3d. Mass Spec Processing/Results of R Analysis/pathfindR/Culture Format/WCL/0% Attach v 0% Susp")) 

    #write.csv (PSEA_results, file = "C:/PETER PROJECT/3d. Mass Spec Processing/Results of R Analysis/pathfindR/Effect of Culture Format/WCL Analysis/0% Attached v 0% Suspension/DE Cutoff equal 1.2/PSEA_Attached_0.csv")

The data frame PSEA_rd has over 1000 rows in it, so I don't propose to reproduce it here. However, so that people can see the types of data it contains, the output from str(PSEA_rd) is below.

'data.frame':   1171 obs. of  3 variables:
 $ Gene.Symbol: chr  "AAAS" "AACS" "AARS" "AATF" ...
 $ LogFC      : num  -0.648 0.82 1.837 -0.492 -1.114 ...
 $ adj.P.Val  : num  7.15e-05 4.59e-03 4.71e-05 1.38e-02 1.00e-02 ...

I did try one other thing. I ran

PSEA_results <- run_pathfindR(PSEA_rd)

as per the post "Enrichment Analysis, Clustering and Scoring with pathfindR". The result was the same error message, which hopefully rules out the directory structure I want to store the results in as the cause.

Thanks in advance for any suggestions.

pathfindR R • 970 views
4.0 years ago
peter.berry5 ▴ 60

This can be closed. I've managed to fix the problem. The columns in the PSEA_rd dataframe needed to be called

Gene.symbol not Gene.Symbol

and logFC not LogFC
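
A minimal sketch of that rename on a toy data frame, in case it helps anyone hitting the same error. The four toy rows are simply the first values shown in the str(PSEA_rd) output in the question. The run_pathfindR call is left commented out and is not executed here, and whether every pathfindR version is sensitive to these column names is an assumption I have not verified.

```r
# Toy input built from the first four rows shown in str(PSEA_rd) in the question.
PSEA_rd <- data.frame(
  Gene.Symbol = c("AAAS", "AACS", "AARS", "AATF"),
  LogFC       = c(-0.648, 0.82, 1.837, -0.492),
  adj.P.Val   = c(7.15e-05, 4.59e-03, 4.71e-05, 1.38e-02),
  stringsAsFactors = FALSE
)

# Rename all three columns in one step; only the capitalisation of the
# first two names changes (Gene.Symbol -> Gene.symbol, LogFC -> logFC).
names(PSEA_rd) <- c("Gene.symbol", "logFC", "adj.P.Val")

# Confirm the new column names before passing the frame on.
str(PSEA_rd)

# With the renamed frame, the original call would be (needs pathfindR installed):
# PSEA_results <- run_pathfindR(PSEA_rd)
```

In the original script, the same effect comes from changing the two names(PSEA_rd)[1] and names(PSEA_rd)[2] assignments to "Gene.symbol" and "logFC".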
