Are gene symbol and HGNC symbol the same names for a gene?
5.0 years ago
Sib ▴ 60

I have GB-ACC numbers for differentially expressed genes from GEO2R, but I need gene symbols to enter into the Enrichr database for further analysis. I used BioMart to convert RefSeq mRNA IDs to HGNC symbols, but I am not sure whether a RefSeq mRNA ID is the same thing as a GB-ACC. And is an HGNC symbol the same as a gene symbol? (BioMart does not have "Gene symbol" or "GB-ACC" options.)

gene RNA-Seq • 2.5k views
5.0 years ago
dsull ★ 6.9k

Yes, the HGNC symbol is the gene symbol (for humans). It's called HGNC because the symbols are assigned by the HUGO Gene Nomenclature Committee, and these symbols are the standard for human genes.

Thank you. And what about GB-ACC? Is it the same as a RefSeq mRNA ID?

That is a great question. GenBank accession (GB-ACC) is not the same as RefSeq.

The RefSeq mRNA ID might start with something like NM_ (such as NM_004985).

The GenBank accession numbers follow a different format (as described here: https://www.ncbi.nlm.nih.gov/Sequin/acc.html ). For example, AF493917 would be a GenBank accession ID (note that the GB-ACC doesn't contain an underscore).
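
As a rough illustration of the format difference, here is a minimal Python sketch that sorts accession strings into RefSeq-style and GenBank-style IDs. The function name `classify_accession` and both patterns are my own simplification (two letters plus an underscore for RefSeq; one letter plus five digits, or two letters plus six or eight digits, for GenBank nucleotide accessions), so treat it as a first-pass filter and not a full validator.

```python
import re

# Simplified patterns (assumptions, not an exhaustive specification):
# RefSeq: two-letter prefix, underscore, digits, optional ".version"
REFSEQ_RE = re.compile(r"^[A-Z]{2}_\d+(\.\d+)?$")
# GenBank nucleotide: 1 letter + 5 digits, or 2 letters + 6 or 8 digits,
# optionally followed by ".version"
GENBANK_RE = re.compile(r"^([A-Z]\d{5}|[A-Z]{2}\d{6}|[A-Z]{2}\d{8})(\.\d+)?$")


def classify_accession(acc):
    """Return 'refseq', 'genbank', or 'unknown' for an accession string."""
    acc = acc.strip().upper()
    if REFSEQ_RE.match(acc):
        return "refseq"
    if GENBANK_RE.match(acc):
        return "genbank"
    return "unknown"


def split_accessions(accessions):
    """Group accessions by the class assigned by classify_accession."""
    groups = {"refseq": [], "genbank": [], "unknown": []}
    for acc in accessions:
        groups[classify_accession(acc)].append(acc)
    return groups
```

Calling `split_accessions(["NM_004985", "AF493917"])` would put the first ID in the `refseq` group and the second in the `genbank` group.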

A lot of publications confuse the two but GenBank and RefSeq are two separate databases, where GenBank contains sequences submitted by individual labs whereas RefSeq data is curated and maintained by the NCBI.

I prefer RefSeq because GenBank is an archive of raw submitted sequences, so it contains a hodgepodge of redundant data and you have to do a fair amount of filtering to get what you want. In fact, RefSeq is largely built by NCBI manually curating GenBank data. See the RefSeq paper for more information: https://www.ncbi.nlm.nih.gov/pubmed/15608248

Thanks a lot for your answer. As you said, I think GEO2R has confused the two. On the GEO2R results page, the GB-ACC column contains several different formats, such as NM_201591, BX100997, BC043554, and NR_038236. I'd be grateful if you could show me a way to obtain gene symbols for these genes.

Unfortunately, I can't think of an easy way to do it. Personally, I'd use BioMart to convert all the RefSeq IDs first, and then for the remaining IDs that can't be converted (i.e. the GenBank accession numbers), use the following file from NCBI which maps GenBank accession numbers to gene symbols: ftp://ftp.ncbi.nih.gov/gene/DATA/gene2accession.gz
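
For the second step, a minimal Python sketch of the lookup might look like the following. It assumes the file is tab-separated with a `#`-prefixed header line containing columns named `tax_id`, `RNA_nucleotide_accession.version`, and `Symbol`; please check these names against the header of the copy you download. The function name `build_accession_to_symbol` is hypothetical, and `9606` is the NCBI taxonomy ID for human.

```python
def build_accession_to_symbol(lines, tax_id="9606",
                              acc_col="RNA_nucleotide_accession.version",
                              symbol_col="Symbol"):
    """Map unversioned accessions to gene symbols for one species.

    `lines` is any iterable of text lines (e.g. an open file handle).
    Column names are assumptions; they are looked up in the header line.
    """
    lines = iter(lines)
    header = next(lines).rstrip("\n").lstrip("#").split("\t")
    tax_i = header.index("tax_id")
    acc_i = header.index(acc_col)
    sym_i = header.index(symbol_col)
    needed = max(tax_i, acc_i, sym_i)

    mapping = {}
    for line in lines:
        row = line.rstrip("\n").split("\t")
        if len(row) <= needed or row[tax_i] != tax_id:
            continue  # skip short rows and other species
        acc, symbol = row[acc_i], row[sym_i]
        if acc == "-" or symbol == "-":
            continue  # "-" marks a missing value
        # Drop the ".version" suffix so "XX000000.1" matches "XX000000"
        mapping[acc.split(".")[0]] = symbol
    return mapping


# Usage sketch (the file is large, so it is streamed line by line):
#   import gzip
#   with gzip.open("gene2accession.gz", "rt") as fh:
#       acc_to_symbol = build_accession_to_symbol(fh)
#   symbols = [acc_to_symbol.get(a) for a in ["BX100997", "BC043554"]]
```

Stripping the version suffix is a design choice: GEO2R output like the IDs above has no version, while the NCBI file stores `accession.version`.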

Thank you.
