2.7 years ago
iibrams07
I have a set of genes, apparently with assigned hgnc_symbol values, that I retrieved with biomaRt. It turns out that many of the corresponding Ensembl IDs are missing, i.e. replaced with NaN. How can there be no Ensembl ID for so many genes? Is there a way to find them?
I used the following command to retrieve the data:
results <- getBM(attributes = c("ensembl_gene_id", "hgnc_symbol", "transcript_biotype"), filters = c("transcript_biotype"), values = list("protein_coding"), mart = ensembl)
Thanks
Different organisations have different rules about what gets annotated. I assume the genes in question either have little evidence that they exist at all, or relate to obscure non-coding transcripts that HGNC may not have reviewed yet. Can you please paste some example IDs?
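To pull out a few example rows to paste, something along these lines might help. It is only a sketch: the small data frame is a toy stand-in for the real `results` from getBM(), the IDs and symbols in it are placeholders rather than real query output, and `is_missing` is a hypothetical helper. It assumes missing values may appear either as NA or as empty strings, so it checks for both.

```r
# Toy stand-in for the data frame returned by getBM(); the IDs and symbols
# below are placeholders, not real query results.
results <- data.frame(
  ensembl_gene_id = c("ENSG_PLACEHOLDER_1", "ENSG_PLACEHOLDER_2", NA, "ENSG_PLACEHOLDER_4"),
  hgnc_symbol     = c("GENE_A", "", "GENE_C", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: treat both NA and empty strings as missing
is_missing <- function(x) is.na(x) | x == ""

missing_id     <- is_missing(results$ensembl_gene_id)
missing_symbol <- is_missing(results$hgnc_symbol)

# How many values are missing in each column
print(colSums(cbind(ensembl_gene_id = missing_id, hgnc_symbol = missing_symbol)))

# Up to ten rows with a missing ID or symbol, suitable for pasting as examples
examples <- head(results[missing_id | missing_symbol, ], 10)
print(examples)
```

Run against the real `results`, this should also show which of the two columns is actually the empty one.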