TopGO: how to select specific genes from the background for the analysis
4.1 years ago
Oli • 0

Hello community,

I need some help getting TopGO to do what I need. I have intersected the differentially expressed genes from my RNA-seq experiment with a list of genes I'm interested in for other reasons. When I create the TopGO object, I have trouble with the selection function. For the whole list of differentially expressed genes, I select only those with a p-value < 0.05. What if I want to perform the enrichment analysis on the genes from the second list, using the DGE results from my experiment as the background?

My code is this:

background <- geneuniverse$PValue
names(background) <- geneuniverse$ensembl_gene_id
selection <- function(allScore){ return(allScore < 0.05)}
GoData <- new("topGOdata",
              ontology = "BP",
              allGenes = background,
              geneSel = selection,
              annot = annFUN.org,
              mapping = "org.Hs.eg.db", 
              ID = "ensembl",
              nodeSize = 3)

If I read my genes of interest into a vector by doing:

# scan() has no header argument; skip = 1 drops the header line instead
interesting_genes <- scan('interesting_genes.txt', what = character(), skip = 1)

How can I get TRUE/FALSE for matches between the interesting genes and the background? I have already tried match, %in% and grepl, but geneSel can only be a function. Any help?

I don't know if it's of any help, but here's how the data look:

head(background)
ENSG00000183242 ENSG00000248347 ENSG00000172482 ENSG00000120068 ENSG00000137558 
       1.81e-15        1.15e-08        5.59e-09        5.33e-07        4.86e-06 
head(interesting_genes)
[1] "ENSG00000114779" "ENSG00000143322" "ENSG00000123130" "ENSG00000103740" "ENSG00000164398"
R GeneOntology TopGO rna-seq • 1.9k views
4.1 years ago
e.rempel ★ 1.1k

Hi Oli,

you could specify allGenes as a factor (its length equals the number of significant genes in the background). This factor equals 1 if the gene is also in interesting_genes and 0 otherwise.

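# Universe: genes significant in the DGE analysis
background_genes <- names(background)[background <= 0.05]
# 1 = also in interesting_genes, 0 = not; fixed levels so the factor always has both 0 and 1
int.genes <- factor(as.integer(background_genes %in% interesting_genes), levels = c(0, 1))
names(int.genes) <- background_genes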
GoData <- new("topGOdata",
          ontology = "BP",
          allGenes = int.genes,
          annot = annFUN.org,
          mapping = "org.Hs.eg.db", 
          ID = "ensembl",
          nodeSize = 3)

So you are just limiting the gene universe to background_genes, and the genes coded 1 are the ones tested for enrichment.
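
To check the factor before handing it to topGO, here is a minimal sketch of the same construction on toy data. The p-values and the short gene list are made up for illustration; only the gene IDs are taken from the question. It needs base R only, not topGO.

```r
# Toy stand-ins for the objects in the question (p-values are invented)
background <- c(ENSG00000183242 = 1.81e-15,
                ENSG00000114779 = 0.003,
                ENSG00000143322 = 0.20,
                ENSG00000172482 = 5.59e-09)
interesting_genes <- c("ENSG00000114779", "ENSG00000143322", "ENSG00000123130")

# Universe: genes that pass the differential expression cutoff
background_genes <- names(background)[background <= 0.05]

# 1 = gene is also in interesting_genes, 0 = it is not.
# Setting the levels explicitly means the factor always carries both 0 and 1.
int.genes <- factor(as.integer(background_genes %in% interesting_genes),
                    levels = c(0, 1))
names(int.genes) <- background_genes

# Inspect the result: one named 0/1 entry per gene in the universe
print(int.genes)
print(table(int.genes))
```

In this toy case ENSG00000143322 is dropped because it fails the cutoff, and ENSG00000123130 is dropped because it is not in the background at all, so only ENSG00000114779 ends up coded 1. If table(int.genes) shows no genes coded 1 on your real data, the two ID lists probably do not overlap (for example, different ID types or version suffixes).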
