Question

Is there an HGNC symbol or any Ensembl/RefSeq ID for this UniProt entry: Rheumatoid factor C6 light chain?

1


6.0 years ago

ivivek_ngs ★ 5.2k

Here is the link to the UniProt ID A0N5G1_HUMAN: https://www.uniprot.org/uniprot/A0N5G1

The most I could find is this: https://www.ebi.ac.uk/ena/data/view/AAB25742

I could not find much on this. Any leads will be appreciated. I am assuming this gene has been submitted but is still under review at the transcript level, hence there is no associated gene symbol, but I may be wrong. Any insights will be very much appreciated.

P.S.: I posted the link to this Biostar query on Twitter to see if someone responds to the thread.

gene mapping HGNC UniProt Ensembl • 2.7k views

ADD COMMENT • link updated 6.0 years ago by Elisabeth Gasteiger ★ 2.4k • written 6.0 years ago by ivivek_ngs ★ 5.2k

2


Take a look at this UniProt blast search and the second hit. There are some other larger protein hits. The protein link you posted is likely a fragment, correct?

A tblastn search at NCBI results in a single hit to a non-coding RNA (once you filter the results for human).

ADD REPLY • link 6.0 years ago by GenoMax 150k

2


Interesting... I see the source of the UniProt entry seems to be an automated translation of the cDNA you linked from this 1993 pub (see figure 1). The paper suggests the chain is encoded by "HK102", a vK1 chain gene (as it was named in 1993?). There are many high-homology hits to this cDNA sequence in the NCBI EST database. There are no high-homology full-length hits to the cDNA in RefSeq human, but there are some to other organisms, all of which are other immunoglobulin chains. I'm no immunologist, but this sequence contains variable regions (CDRs), so I guess using homology to map it to a well-annotated gene entry may be challenging.

ADD REPLY • link 6.0 years ago by Ahill ★ 2.0k

0


I suspect it is a variation on P0DOX7

ADD REPLY • link 6.0 years ago by me ▴ 760

0


Yes, hypothetically it could be a variation of P0DOX7, but I cannot say so with confidence at this point. There is not much evidence at the transcriptional level, though. Thanks again.

ADD REPLY • link 6.0 years ago by ivivek_ngs ★ 5.2k

0


Yes, I agree with your interpretation. Thanks for taking the time to provide such useful information.

ADD REPLY • link 6.0 years ago by ivivek_ngs ★ 5.2k

0


Yes, it is a fragment. It is an interesting find that the hit is to a non-coding RNA. I reckon this entry does not qualify for my purposes any further, since I need to associate my protein component with a protein-coding gene.

ADD REPLY • link 6.0 years ago by ivivek_ngs ★ 5.2k

2


6.0 years ago

Elisabeth Gasteiger ★ 2.4k

According to UniProt curators, the cDNAs described in that paper (PubMed=7916589) most probably code for immunoglobulins that the authors isolated as "Rheumatoid factors". They have not been manually curated in UniProtKB/Swiss-Prot for the reasons described here: https://www.uniprot.org/help/immunoglobulins. You cannot find any specific gene associated with these entries because each immunoglobulin is basically encoded by 7 different genes that encode different regions of the full-length protein. Moreover, combinatorial V-(D)-J diversity, junctional diversity and somatic hypermutation, the mechanisms that ensure diversity during immunoglobulin synthesis, make it impossible to trace back exactly which germline genes code for a given immunoglobulin.

If you need more details, please do not hesitate to contact the UniProt helpdesk.

ADD COMMENT • link 6.0 years ago by Elisabeth Gasteiger ★ 2.4k

0


Thank you very much for taking the time to clarify. I appreciate the detailed explanations. In light of them, I will not be pursuing this fragment any further for now. If I need more details later, I will reach out to the UniProt helpdesk as advised.

ADD REPLY • link 6.0 years ago by ivivek_ngs ★ 5.2k

score 5 · Accepted Answer · 2019-04-12

Got your query from Twitter :)

This is a TrEMBL entry, meaning it is an unreviewed, low-quality sequence that scores poorly under UniProt's annotation score rules. It gives no results in Ensembl (at least for human), Entrez Gene or the Open Targets Platform, which leads me to believe it has not been used for gene/transcript annotation.
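The same check can be scripted rather than done by hand in each browser. Below is a minimal sketch, assuming the UniProt REST service returns an entry as JSON at `https://rest.uniprot.org/uniprotkb/<accession>.json` with an `entryType` field and a `uniProtKBCrossReferences` list. Those field names, the URL pattern and the helper names `fetch_entry`, `gene_level_xrefs` and `is_unreviewed` are my assumptions for illustration, so verify them against the current UniProt API documentation before relying on this.

```python
import json
import urllib.request

# Assumption: UniProt's REST endpoint for a single entry in JSON form.
UNIPROT_ENTRY_URL = "https://rest.uniprot.org/uniprotkb/{accession}.json"

# Cross-reference databases that would indicate a gene-level mapping.
GENE_LEVEL_DATABASES = ("HGNC", "Ensembl", "RefSeq", "GeneID")


def fetch_entry(accession):
    """Download one UniProtKB entry as a dict (needs network access)."""
    url = UNIPROT_ENTRY_URL.format(accession=accession)
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def gene_level_xrefs(entry, databases=GENE_LEVEL_DATABASES):
    """Group the entry's cross-reference IDs by database, keeping only `databases`."""
    found = {db: [] for db in databases}
    # Assumed field name; an entry without it simply yields empty lists.
    for xref in entry.get("uniProtKBCrossReferences", []):
        db = xref.get("database")
        if db in found:
            found[db].append(xref.get("id"))
    return found


def is_unreviewed(entry):
    """True if the entry type string marks the record as unreviewed (TrEMBL)."""
    return "unreviewed" in entry.get("entryType", "").lower()
```

Calling `gene_level_xrefs(fetch_entry("A0N5G1"))` would then show at a glance whether the entry carries any HGNC, Ensembl, RefSeq or Entrez Gene cross-references. Empty lists for all four would be consistent with what I found manually.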

I also downloaded all the genes associated with rheumatoid arthritis, hoping the gene name would be picked up from text mining or elsewhere, but no luck. I then tried LINK (which lets us explore connections between genes drawn from millions of PubMed abstracts and their semantic relationships), but still no luck. You may want to delve deeper into the relations for "Rheumatoid" alone in LINK, though.

How about contacting the UniProt helpdesk to clarify this?