Dear all,
I am attempting to retrieve transcript biotypes for ncRNAs using Bioconductor's biomaRt in GRCh37 as follows:
library(biomaRt)
ensembl <- useMart(biomart="ENSEMBL_MART_ENSEMBL", host="grch37.ensembl.org", dataset="hsapiens_gene_ensembl")
# biotypes for mRNAs are obtained fine
refseqids_nm = c("NM_152486","NM_080605", "NM_031921")
getBM(attributes=c("refseq_mrna", "transcript_biotype"), filters="refseq_mrna", values=refseqids_nm, mart=ensembl)
# refseq_mrna transcript_biotype
#1 NM_031921 protein_coding
#2 NM_080605 protein_coding
#3 NM_152486 protein_coding
# However, the same query returns nothing for ncRNAs
refseqids_nr = c("NR_015434", "NR_036637")
getBM(attributes=c("refseq_ncrna", "transcript_biotype"), filters="refseq_ncrna", values=refseqids_nr, mart=ensembl)
#[1] refseq_ncrna transcript_biotype
#<0 rows> (or 0-length row.names)
When I try the same as above but with the current release of Ensembl:
ensembl <- useMart(biomart="ENSEMBL_MART_ENSEMBL", dataset="hsapiens_gene_ensembl")
getBM(attributes=c("refseq_ncrna", "transcript_biotype"), filters="refseq_ncrna", values=refseqids_nr, mart=ensembl)
# refseq_ncrna transcript_biotype
#1 NR_015434 antisense
#2 NR_036637 processed_transcript
Then I get biotypes for ncRNAs just fine.
Perhaps I am missing something here. Does the GRCh37 mart have RefSeq annotations for ncRNAs? If so, any input on how I can obtain transcript biotypes for these IDs using biomaRt as above?
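To narrow this down, here is a minimal diagnostic sketch I could run against the GRCh37 mart. It assumes that searchFilters() and searchAttributes() are available in the installed biomaRt version and that the chromosome_name filter is exposed by the archive; ensembl37 and chr21 are names made up for illustration. Rather than looking up specific IDs, it checks whether the archive stores any refseq_ncrna values at all on one small chromosome.

```r
library(biomaRt)

# Connect to the GRCh37 archive mart, same settings as in the query above
ensembl37 <- useMart(biomart = "ENSEMBL_MART_ENSEMBL",
                     host = "grch37.ensembl.org",
                     dataset = "hsapiens_gene_ensembl")

# List the RefSeq-related filters and attributes this mart exposes
searchFilters(mart = ensembl37, pattern = "refseq")
searchAttributes(mart = ensembl37, pattern = "refseq")

# Fetch all chromosome 21 transcripts together with any RefSeq ncRNA ID
chr21 <- getBM(attributes = c("ensembl_transcript_id",
                              "refseq_ncrna",
                              "transcript_biotype"),
               filters = "chromosome_name",
               values = "21",
               mart = ensembl37)

# Count transcripts that carry a non-empty RefSeq ncRNA ID
# (missing values may come back as NA or as empty strings)
sum(!is.na(chr21$refseq_ncrna) & chr21$refseq_ncrna != "")
```

If that count is zero, the GRCh37 mart presumably has no RefSeq ncRNA cross-references loaded, and the empty result above would be expected rather than a problem with my query.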
Thanks, Sergio