TopGO Error in .local(.Object, ...) : allGenes must be a factor with 2 levels
6.0 years ago
jomagrax ▴ 40

Hi, I'm Jose! This is my first time using topGO and I'm having problems generating the GOdata object in R. Thank you all in advance. This is the code I'm using:

 # 1. Data preparation: List of genes identifiers, gene scores, list of differentially expressed genes,  gene-to-GO annotations are all collected and stored in a single R object.

 > annot_GO <- read_delim("E:/VESCA/gen_GO.txt", "\t", escape_double = FALSE, col_names = FALSE, trim_ws = TRUE)
Parsed with column specification:
cols(
  X1 = col_character(),
  X2 = col_character()
)

> annot_GO
# A tibble: 32,832 x 2
   X1                    X2                              
   <chr>                 <chr>                           
 1 locusName             GO                              
 2 gene00090-v1.0-hybrid NA                              
 3 gene00091-v1.0-hybrid GO:0003677,GO:0046983                                       
# ... with 32,822 more rows



 > # create a list of GO terms

    > geneID2GO <- as.list(as.character(annot_GO$X1)) # generates list; element names are transcript IDs
    > geneID2GO <- as.list(setNames(as.character(annot_GO$X2), as.character(annot_GO$X1))) # adds Gene Ontology data to list
    > geneID2GO <- lapply(geneID2GO, function(x) unlist(strsplit(x, split="[,]"))) # split single GO terms string into a character vector, one element per term
    > str(head(geneID2GO))
    List of 6
     $ locusName            : chr "GO"
     $ gene00090-v1.0-hybrid: chr NA
     $ gene00091-v1.0-hybrid: chr [1:2] "GO:0003677" "GO:0046983"



> # make full list of transcript names, geneNames

> geneNames <- names(geneID2GO)
> head(geneNames)
[1] "locusName"             "gene00090-v1.0-hybrid" "gene00091-v1.0-hybrid" "gene00092-v1.0-hybrid" "gene00093-v1.0-hybrid"
[6] "gene00094-v1.0-hybrid"


> head(MyInterestingGenes1)
    [1] "77981546__"           "CL11544Contig1__"     "CL3CG7R__"            "CL8558Contig1__"      "contig00421__258___5"
    [6] "contig01716__233___6"


> #List of all genes
> geneList_1 <- factor(as.integer(geneNames %in% MyInterestingGenes1)) 
> str(geneList_1) 
 Factor w/ 1 level "0": 1 1 1 1 1 1 1 1 1 1 ...
> head(geneList_1)
            locusName gene00090-v1.0-hybrid gene00091-v1.0-hybrid gene00092-v1.0-hybrid gene00093-v1.0-hybrid 
                    0                     0                     0                     0                     0 
gene00094-v1.0-hybrid 
                    0 
Levels: 0


> #Creation of "GOdata object"
> GOdata_1 <- new("topGOdata", ontology = "MF", allGenes = geneList_1, annot = annFUN.gene2GO, nodeSize=5, gene2GO = geneID2GO)
Error in .local(.Object, ...) : allGenes must be a factor with 2 levels

"MyInterestingGenes1" comes from a DESeq2 analysis after a kallisto mapping.

As far as I understand, the problem is that none of the genes in "MyInterestingGenes1" match the ones in "geneNames", which is why the factor "geneList_1" has only one level ("0") instead of the two that topGO requires.
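A quick way to confirm that diagnosis is to count how many identifiers the two vectors share. The sketch below uses small made-up stand-ins for `geneNames` and `MyInterestingGenes1` (copied from the `head()` output above, not the full data), so the vectors here are assumptions for illustration only:

```r
# Hypothetical stand-ins for the real vectors shown in the post
geneNames <- c("gene00090-v1.0-hybrid", "gene00091-v1.0-hybrid",
               "gene00092-v1.0-hybrid")
MyInterestingGenes1 <- c("77981546__", "CL11544Contig1__",
                         "contig00421__258___5")

# Identifiers present in both sets
shared <- intersect(geneNames, MyInterestingGenes1)
length(shared)

# Tabulate matches; a table containing only FALSE means no overlap at all
table(geneNames %in% MyInterestingGenes1)
```

If the count is zero on the real data, the two ID sets live in different namespaces (here, annotation locus names versus assembly contig names), and they would need to be mapped onto each other before building the factor.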

Perhaps you can help me to figure this out.

RNA-Seq R TopGO • 3.1k views

Note that topGO expects what you called geneNames to be the full gene universe: a large set of genes that includes the ones in MyInterestingGenes1. In your case it looks like the two sets of identifiers are completely different, which is why topGO doesn't work. Maybe you can look at this thread (which was related to another issue) and try to reproduce it to get acquainted with how topGO operates.
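As a rough illustration of that requirement, here is a minimal sketch of how the `allGenes` factor could be built once the interesting genes use the same identifiers as the universe. The gene IDs below are invented for the example, and the early `stopifnot()` guard is just one possible way to catch the mismatch before calling `new("topGOdata", ...)`:

```r
# Hypothetical gene universe and interesting subset in the SAME ID namespace
geneNames   <- c("geneA", "geneB", "geneC", "geneD")
interesting <- c("geneB", "geneD")

# Fail early if no interesting gene is found in the universe,
# which is the situation that yields a one-level factor
stopifnot(any(geneNames %in% interesting))

# 1 = interesting, 0 = background; names carry the gene identifiers
geneList <- factor(as.integer(geneNames %in% interesting))
names(geneList) <- geneNames

nlevels(geneList)  # topGO needs this to be 2
```

The names on the factor matter because topGO uses them to match genes against the `gene2GO` annotation list.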
