reactomePA error input file
4.5 years ago
camillab. ▴ 160

Hi, I am relatively new to R, so apologies if the code/question is not in the right format! I will try to improve! I am trying to perform enrichment analysis with ReactomePA (R package) on a small list of genes (477) and I have a problem organizing the dataset. As far as I understand, the input file should contain only two columns: Entrez ID (column 1) and fold change (column 2). I converted the Ensembl IDs with the BioMart online tool, and then created a new file with ID and FC. My dataset:

# A tibble: 6 x 2
  Entrezgene_ID log2fc
  <chr>          <dbl>
1 14             -1.02
2 80755          -1.45
3 60496          -1.17
4 6059           -1.48
5 10061          -1.35
6 10006          -1.51

Then I tried to follow this code:

#load packages
library(org.Hs.eg.db)
library(DOSE)
library(ReactomePA)

## feature 1: numeric vector
geneList <- d[,2]

## feature 2: named vector
names(geneList) <- as.character(d[,1])

## feature 3: decreasing order
geneList <- sort(geneList, decreasing = TRUE)
head(geneList)

But when I try to name the vector, I get a list of Entrez IDs separated by commas and no FC (geneList: 477 obs, 1 variable, c("14", "80755", ... and so on)). I was expecting to find them in rows next to the fold change. Am I wrong? And of course, if I try to sort in decreasing order ("feature 3"), I get the error below, because I basically have a list of quoted numbers that are not associated with any values:

Error: Can't subset columns that don't exist.
x Locations 141, 373, 119, 229, 230, etc. don't exist.
i There are only 1 column.

Thank you very much for your help!

Camilla

RNA-Seq R reactomePA error vector • 2.2k views
4.5 years ago
russhh 5.7k

Unlike when you subset a data.frame, when you subset a tibble using tbl[, col] syntax, you always receive a tibble. For a data.frame, extracting a single column in this way would return a vector. What you've done is extract geneList as an N x 1 tibble and then try to set the names on that.

To extract a vector from a tibble, use genes <- tbl[[col]] syntax, and then use names(genes) <- tbl[[other_col]]
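
A minimal sketch of the difference, assuming the tibble package is installed. The toy tibble d below is made up for illustration and only reuses the first three rows printed in the question:

```r
library(tibble)

# Toy data mimicking the question's layout (assumed, not the real 477 genes)
d <- tibble(
  Entrezgene_ID = c("14", "80755", "60496"),
  log2fc        = c(-1.02, -1.45, -1.17)
)

# Single-bracket column subsetting keeps the tibble class (3 x 1 tibble)
sub_tbl <- d[, 2]
is_tibble(sub_tbl)

# The same call on a base data.frame drops to a plain numeric vector
sub_df <- as.data.frame(d)[, 2]
is.numeric(sub_df)

# Double brackets extract the column itself, so names can be attached
genes <- d[[2]]
names(genes) <- d[[1]]
genes <- sort(genes, decreasing = TRUE)
genes
```

Because genes is a plain named numeric vector here, sort() reorders the values and carries the Entrez IDs along with them.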


Plus, if you already have tibble loaded, you can use its deframe function to do this in one step: genes <- tibble::deframe(d) (see https://stackoverflow.com/a/56479548/1845650). That only works for two-column data frames: the first column becomes the vector names and the second column becomes the vector contents.
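
As a rough illustration of the deframe route, here is a sketch on a made-up three-row tibble (the data are an assumption that copies the layout shown in the question):

```r
library(tibble)

# Toy two-column tibble: IDs first, fold changes second (assumed data)
d <- tibble(
  Entrezgene_ID = c("14", "80755", "60496"),
  log2fc        = c(-1.02, -1.45, -1.17)
)

# deframe(): first column becomes the names, second column the values
genes <- deframe(d)

# Sort in decreasing order; the names travel with the values
geneList <- sort(genes, decreasing = TRUE)
head(geneList)
```

The column order matters here: if the fold changes came first, the IDs would end up as the values instead of the names.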


There are several other ways of doing this mentioned in that SO thread, dplyr::pull for example
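
For the dplyr::pull variant, a sketch might look like the following. It assumes dplyr is installed in a version where pull() accepts a name argument (1.0.0 or later, as far as I know), and the tibble d is again made-up toy data:

```r
library(tibble)
library(dplyr)

# Toy data in the same layout as the question (assumed, for illustration)
d <- tibble(
  Entrezgene_ID = c("14", "80755", "60496"),
  log2fc        = c(-1.02, -1.45, -1.17)
)

# pull() extracts log2fc as a vector and names it by Entrezgene_ID
genes <- pull(d, log2fc, name = Entrezgene_ID)

# Decreasing order, names preserved
geneList <- sort(genes, decreasing = TRUE)
head(geneList)
```

Unlike deframe, this selects the columns by name, so it does not depend on the column order or on the tibble having exactly two columns.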


so if I use

genes <- tibble::deframe(d)

I should obtain already a vector with my values and their associated "names"?


Hi! Thank you! so the code should be:

#load packages
library(org.Hs.eg.db)
library(DOSE)
library(ReactomePA)

## feature 1: numeric vector
genes <- d[[,2]]

## feature 2: named vector
names(genes) <- d[[,1]]

## feature 3: decreasing order
geneLIST <- sort(geneList, decreasing = TRUE)
head(geneList)

x <- enrichPathway(gene=geneLIST,pvalueCutoff=0.05, readable=T)

because it gives me this error when I use [[ ]] :

Error: Subscript can't be missing for tibbles in [[.

and if I do :

## feature 1: numeric vector
genes <- d[,2]

## feature 2: named vector
names(genes) <- d[,1]

I get the same result (and error) as before.


No, the code should be genes <- d[[2]]; names(genes) <- d[[1]]


On a tibble or data.frame, the [[ function extracts a column as a vector: `[[`(my_df, column_index). It takes the data.frame and a column index as arguments, so when it's used as an operator it should look like my_df[[column_index]] (not my_df[[, column_index]]).
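
To make the function-versus-operator point concrete, here is a small sketch on made-up data (the tibble d is an assumption that copies the question's layout). The tryCatch call only captures the error from the empty-subscript form instead of stopping the script:

```r
library(tibble)

# Toy data in the question's layout (assumed, for illustration)
d <- tibble(
  Entrezgene_ID = c("14", "80755", "60496"),
  log2fc        = c(-1.02, -1.45, -1.17)
)

# Function form and operator form are the same call
by_function <- `[[`(d, 2)
by_operator <- d[[2]]
by_name     <- d[["log2fc"]]
identical(by_function, by_operator)

# The [[, 2]] form leaves the first subscript empty, which tibble rejects;
# capture the condition so the rest of the script keeps running
err <- tryCatch(d[[, 2]], error = function(e) e)
inherits(err, "error")

# Putting it together: named, decreasingly sorted fold-change vector
geneList <- d[[2]]
names(geneList) <- d[[1]]
geneList <- sort(geneList, decreasing = TRUE)
```

One caveat beyond this thread: as far as I know, enrichPathway expects a character vector of Entrez IDs for its gene argument, so something like names(geneList) would be passed there rather than the numeric vector itself. It is worth checking that against the ReactomePA documentation.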
