Fetching HGNC symbols using the R package biomaRt
4.0 years ago

I am trying to retrieve HGNC symbols for my genes after a high-throughput RNA-seq experiment, but my code isn't working. Can anyone spot the error or tell me how to do this, please?

I am running biomaRt on R 4.0.0.

My code:

  library(biomaRt)
  dds_covid_df <- sapply( strsplit( rownames(dds_covid), split="\\+" ), "[", 1 )
  ensembl <- useMart("ENSEMBL_MART_ENSEMBL", dataset="hsapiens_gene_ensembl")
  genemap <- getBM( attributes = c("ensembl_gene_id_version", "hgnc_symbol"),
                    filters = "ensembl_gene_id_version",
                    values = dds_covid_df,
                    mart = ensembl )
  idx <- match( dds_covid_df, genemap$ensembl_gene_id )
  dds_covid$hgnc_symbol <- genemap$hgnc_symbol[ idx ]

dds_covid is my data frame.

My results:

               Gene_ID hgnc_symbol
1    ENSG00000242268.3          NA
2    ENSG00000270112.4          NA
3    ENSG00000280143.1          NA
4   ENSG00000146083.12          NA
5    ENSG00000263642.1          NA
6    ENSG00000225275.4          NA
7   ENSG00000158486.13          NA
8    ENSG00000283967.1          NA
9    ENSG00000273639.6          NA
R biomaRt RNA-Seq • 1.0k views
4.0 years ago

Hi,

You just need to strip the suffix (the dot and the number after it) from the end of each Ensembl gene ID; I think it is the ID version. Something like this, where m is your vector of versioned IDs:

sub('\\.[0-9]*$', '', m)
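
To show how that one-liner could fit into the code from the question, here is a minimal sketch, not a tested pipeline. The function names strip_version and add_hgnc are hypothetical helpers made up for this illustration; the biomaRt calls (useMart, getBM) and the attribute/filter names are the ones already used in the question. It assumes the NAs come from two things: the version suffixes may not match the versions in the current Ensembl release, and the original match() looks at genemap$ensembl_gene_id, a column that was never requested (only ensembl_gene_id_version was), so it is NULL and every lookup returns NA.

```r
# Strip the trailing ".<version>" from Ensembl IDs, e.g. "ENSG00000146083.12".
strip_version <- function(ids) sub("\\.[0-9]+$", "", ids)

# Hypothetical helper (not from the thread): look up HGNC symbols for
# versioned Ensembl gene IDs and return them in the same order as the input.
# Needs the biomaRt package and network access when it is called.
add_hgnc <- function(ids_versioned, mart) {
  ids <- strip_version(ids_versioned)
  genemap <- biomaRt::getBM(
    attributes = c("ensembl_gene_id", "hgnc_symbol"),
    filters    = "ensembl_gene_id",
    values     = unique(ids),
    mart       = mart
  )
  # match() against the same column that was requested in `attributes`;
  # IDs with no hit stay NA, and genes without a symbol may come back as "".
  genemap$hgnc_symbol[match(ids, genemap$ensembl_gene_id)]
}

# Offline check of the stripping step, using two IDs from the question.
strip_version(c("ENSG00000242268.3", "ENSG00000146083.12"))
# [1] "ENSG00000242268" "ENSG00000146083"

# Intended usage with the objects from the question (not run here):
#   ensembl <- biomaRt::useMart("ENSEMBL_MART_ENSEMBL",
#                               dataset = "hsapiens_gene_ensembl")
#   dds_covid$hgnc_symbol <- add_hgnc(dds_covid_df, ensembl)
```

The regex here uses + rather than * so that it only removes a dot followed by at least one digit; for IDs like the ones in the question the result should be the same either way.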

Kevin
