Get Chromosome Annotation
3.5 years ago
Bine ▴ 90

Dear all,

So far I have been able to fetch the gene symbol and Entrez ID for my genes of interest with

top0$Gene_Name <- mapIds(org.Hs.eg.db,
                         keys=ens.str,
                         column="SYMBOL",
                         keytype="ENSEMBL",
                         multiVals="first")

top0$Gene_Entrez <- mapIds(org.Hs.eg.db,
                           keys=ens.str,
                           column="ENTREZID",
                           keytype="ENSEMBL",
                           multiVals="first")

But I am struggling to get the chromosome information for these genes. Does anyone have an idea how I could do that?

Thank you very much, Bine

R annotation • 2.6k views
ADD COMMENT
3.5 years ago
Nitin Narwade ★ 1.6k

Why don't you try the BioMart API?

mart <- useMart('ENSEMBL_MART_ENSEMBL', host='www.ensembl.org')
mart <- useDataset("hsapiens_gene_ensembl", mart)
annotLookup <- getBM(mart = mart,
                     attributes = c('chromosome_name', 'start_position', 'end_position', 'strand',
                                    'ensembl_gene_id', 'gene_biotype', 'hgnc_symbol'),
                     filters = 'ensembl_gene_id', values = listOfInputENSIds, uniqueRows = TRUE)
ADD COMMENT

Thank you, but running this gives me:

Error in useMart("ENSEMBL_MART_ENSEMBL", host = "www.ensembl.org") : 
  could not find function "useMart"

Any idea why I am getting this error?

ADD REPLY

It works now! I don't know why I got this error earlier.

Thank you so much :) :)

ADD REPLY

You probably forgot to load the package:

library(biomaRt)
ADD REPLY

One additional question on this: I am now getting the following error with the code above. Do you know what the reason could be?

list$annotLookup <- getBM(mart = mart, attributes = c('chromosome_name', 'hgnc_symbol'), filter = 'ensembl_gene_id', values = ens.str, uniqueRows = TRUE)

Error in `[[<-`(`*tmp*`, name, value = list(chromosome_name = c("Y", "19",  : 
  49 elements in value to replace 50 elements

Thanks so much, Bine

ADD REPLY

I am not sure about this error; it looks like a very generic error to me.

Could you run the command without assigning it to any variable?

ADD REPLY

Thank you very much. Interestingly, the error does not appear then. But I still need to add these values to my "list", and I wonder how else I could do that.

ADD REPLY

If you don't mind, can I see your complete code?

I mean at least from the declaration of the list to the assignment of the annotation data frame.

ADD REPLY

Ah, it seems that my "list" has 50 rows, whereas my "annotLookup" table

annotLookup <- getBM(mart = mart, attributes = c('chromosome_name', 'start_position', 'end_position', 'strand', 'ensembl_gene_id', 'gene_biotype', 'hgnc_symbol'), filter = 'ensembl_gene_id', values = ens.str, uniqueRows = TRUE)

has 49 rows.

But annotLookup should have 50 rows, since I set

ens.str <- substr(top0$Gene_ID, 1, 50)

I don't understand why it has only 49.

Do you have an idea?

Thanks so much!

ADD REPLY

Honestly, I do not have any clue why you are doing this:

ens.str <- substr(top0$Gene_ID, 1, 50)

And for creating a list of annotations I would use something like this:

listOfAnno = list()
listOfAnno$anno1 = annotLookup
listOfAnno$anno2 = annotLookup
.....
listOfAnno$annoN = annotLookup

If you are using a for loop, and assuming you have a list of Ensembl IDs, then:

listOfAnno = list()

for (i in seq_along(listOfEnsIds)) {   # or use names instead of indices, if you have a named list
    annotLookup <- getBM(mart = mart,
                         attributes = c('chromosome_name', 'start_position', 'end_position', 'strand',
                                        'ensembl_gene_id', 'gene_biotype', 'hgnc_symbol'),
                         filters = 'ensembl_gene_id', values = listOfEnsIds[[i]], uniqueRows = TRUE)
    listOfAnno[[i]] = annotLookup
}

Or you can run this operation with one of the apply-family functions (lapply).
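
The lapply variant could look roughly like the sketch below. To keep it runnable without a network connection, fetch_annotation() is a hypothetical stand-in for the getBM() call, and the IDs in listOfEnsIds are made-up placeholders; in real use the function body would be the getBM() call from the loop.

```r
# Hypothetical stand-in for getBM(); returns one row per input ID.
fetch_annotation <- function(ids) {
  data.frame(ensembl_gene_id = ids,
             chromosome_name = rep(NA_character_, length(ids)),
             stringsAsFactors = FALSE)
}

# Made-up placeholder IDs, grouped into a named list.
listOfEnsIds <- list(
  setA = c("ENSG_PLACEHOLDER_1", "ENSG_PLACEHOLDER_2"),
  setB = c("ENSG_PLACEHOLDER_3")
)

# lapply() calls the function once per list element and keeps the
# names of the input list, so no index bookkeeping is needed.
listOfAnno <- lapply(listOfEnsIds, fetch_annotation)
```

Because the result is a named list, each annotation table can then be accessed as listOfAnno$setA, listOfAnno$setB, and so on.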

ADD REPLY

I use ens.str <- substr(top0$Gene_ID, 1, 50) to limit my list to the top 50 genes; the full list is much longer.

I still don't understand why I only get 49, though. I am still not able to combine list and annotLookup because of the difference (49 vs. 50).
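
One possible way to combine two tables despite such a row-count mismatch is to align the lookup result to the original table by ID, so that genes without a hit get NA. A minimal sketch with made-up data: the IDs and the small top0 and annotLookup tables below are placeholders, not real results.

```r
# Made-up stand-ins for the real tables.
top0 <- data.frame(Gene_ID = c("ID_A", "ID_B", "ID_C"),
                   stringsAsFactors = FALSE)
annotLookup <- data.frame(ensembl_gene_id = c("ID_C", "ID_A"),
                          chromosome_name = c("19", "Y"),
                          stringsAsFactors = FALSE)

# match() gives, for each row of top0, the index of the first matching
# row in annotLookup, or NA when the ID was not returned.
idx <- match(top0$Gene_ID, annotLookup$ensembl_gene_id)

# Indexing with NA yields NA, so the new column has one value per row of top0.
top0$Chromosome <- annotLookup$chromosome_name[idx]
```

merge(top0, annotLookup, by.x = "Gene_ID", by.y = "ensembl_gene_id", all.x = TRUE) is an alternative, although by default it sorts the rows and it repeats a row when an ID matches more than one record.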

ADD REPLY

The same happens if I use another variable. I have only 48 genes in annotLookup, even though I am asking for the top 50 genes...

ADD REPLY

I don't think substr will help you select the top 50 genes. substr extracts a substring from each element of a character vector: it trims every string to the specified length and leaves the number of elements unchanged.

As for getting fewer records than input IDs, my guess is that a few of the Ensembl IDs could be obsolete. Did you check which assembly version you are using?
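
To illustrate both points with made-up IDs (placeholders, not real Ensembl IDs): head() selects the first n elements, whereas substr() only shortens each string, and setdiff() shows which requested IDs are absent from a result. The returned vector below is an assumed stand-in for annotLookup$ensembl_gene_id.

```r
ids <- c("ID_0001", "ID_0002", "ID_0003", "ID_0004")

# substr() truncates each string; the vector still has 4 elements.
shortened <- substr(ids, 1, 2)

# head() keeps the first 2 elements, which is what "top n" needs.
top2 <- head(ids, 2)

# Stand-in for annotLookup$ensembl_gene_id: one requested ID is missing.
returned <- c("ID_0001")

# IDs that were requested but not returned.
missing_ids <- setdiff(top2, returned)
```

Looking at missing_ids for the real data should show exactly which of the requested genes were not returned, and those can then be checked individually.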

ADD REPLY

Please use ADD REPLY when responding to existing comments to keep threads in logical order.

ADD REPLY
