I could not find much on this. Any leads would be appreciated.
I am assuming this gene is submitted but under review at transcript level, hence we do not find any associated gene symbol, but I may be wrong. Any insights will be very much appreciated.
P.S.: I posted the link to this Biostar query on Twitter to see if someone responds to the thread.
Take a look at this UniProt blast search and the second hit. There are some other larger protein hits. The protein link you posted is likely a fragment, correct?
A tblastn search at NCBI results in a single hit to a non-coding RNA (once you filter the results for human).
Interesting... I see the source of the UniProt entry seems to be an automated translation of the cDNA you linked from this 1993 pub (see figure 1). The paper suggests the chain is encoded by "HK102", a vK1 chain gene (as it was named in 1993?). There are many high-homology hits to this cDNA sequence in the NCBI EST database. There are no high-homology full-length hits to the cDNA in human RefSeq, but there are some in other organisms, all of which are other immunoglobulin chains. I'm no immunologist, but this sequence contains variable regions (CDRs), so I guess using homology to map it to a well-annotated gene entry may be challenging.
Yes, hypothetically it could be a variation of P0DOX7, but I cannot assert that at this point. There is not much evidence at the transcript level though. Thanks again.
Yes, it is a fragment. Interesting find that the only hit is non-coding. I reckon this means it does not qualify for my purposes any further, since my protein component association needs to be with a protein-coding gene.
This is a TrEMBL entry, meaning an unreviewed, automatically annotated sequence, and it scores poorly under UniProt's annotation score rules. It gives no results in Ensembl (at least for human), Entrez Gene or Open Targets Platform, which leads me to believe it has not been used for gene/transcript annotation.
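For anyone who wants to run the same check on another accession, here is a minimal sketch in Python. It assumes the UniProt REST endpoint `https://rest.uniprot.org/uniprotkb/<accession>.json` and the JSON field names `entryType`, `annotationScore`, `genes` and `primaryAccession`; the helper names `fetch_entry` and `summarize_entry` are made up for illustration, so treat this as a starting point and not as an official client.

```python
import json
import urllib.request

# Assumed UniProt REST URL pattern for a single UniProtKB entry in JSON form.
UNIPROT_ENTRY_URL = "https://rest.uniprot.org/uniprotkb/{accession}.json"


def fetch_entry(accession):
    """Download one UniProtKB entry as a dict (requires network access)."""
    url = UNIPROT_ENTRY_URL.format(accession=accession)
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def summarize_entry(entry):
    """Pull out the fields relevant to the question from a parsed entry.

    The key names below are assumptions about the UniProt JSON layout;
    missing keys fall back to empty or None values instead of raising.
    """
    entry_type = entry.get("entryType", "")
    gene_names = [
        gene["geneName"]["value"]
        for gene in entry.get("genes", [])
        if "geneName" in gene
    ]
    return {
        "accession": entry.get("primaryAccession"),
        # Swiss-Prot entries are expected to read "UniProtKB reviewed (Swiss-Prot)".
        "reviewed": entry_type.startswith("UniProtKB reviewed"),
        "annotation_score": entry.get("annotationScore"),
        "gene_names": gene_names,
    }
```

Calling `summarize_entry(fetch_entry("P0DOX7"))` should show whether that entry is reviewed and whether any gene name is attached. An empty `gene_names` list together with `reviewed` set to `False` is the situation described above.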
I also downloaded all the genes associated with rheumatoid arthritis, hoping the gene name would be picked up from text mining or elsewhere, but no luck. I also tried LINK (which lets you explore connections between genes drawn from millions of PubMed abstracts and semantic relationships), but still no luck. You may want to delve deeper into the rheumatoid-only relations in LINK though.
How about contacting UniProt helpdesk to clarify on this?
A very comprehensive answer. Thanks for taking the time to work through the interpretation. I agree with your analysis and did have a feeling that it has not been used for gene/transcript annotation. I am not an expert in protein annotation, though, so I wanted to get insights from experts.
Thanks everyone for the insightful interpretations and answers.
According to UniProt curators, the cDNAs described in that paper (PubMed=7916589) most probably code for immunoglobulins that the authors isolated as "Rheumatoid factors". They have not been manually curated in UniProtKB/Swiss-Prot for the reasons described here: https://www.uniprot.org/help/immunoglobulins. You cannot find any specific gene associated with these entries because each immunoglobulin is basically encoded by 7 different genes that encode different regions of the full-length protein. Moreover, combinatorial V-(D)-J diversity, junctional diversity and somatic hypermutation, the mechanisms that ensure diversity during immunoglobulin synthesis, make it impossible to trace back exactly which germline genes code for a given immunoglobulin.
If you need more details, please do not hesitate to contact the UniProt helpdesk.
Thank you very much for taking the time to clarify. I appreciate the detailed explanations. In light of them, I will not be pursuing this fragment any further for now. If it is needed for any further requirements, I will reach out to the UniProt helpdesk as advised.
I suspect it is a variation on P0DOX7.
Yes, I agree with your interpretation. Thanks for spending the time to provide such useful information.