Why Does the Same Compound Have Different CIDs in PubChem?
12 months ago
Elka ▴ 10

Hello BioStar Community,

I've encountered a curious situation in the PubChem database and am seeking insights from the community. I noticed that the same compound can be listed under different Compound IDs (CIDs) even though the entries share the same IUPAC name, InChIKey, and chemical properties.

Example 1: The compound with the IUPAC name "stannous fluoride" (tin(II) fluoride) appears under two different CIDs:

  • CID 24550 with Canonical SMILES: F[Sn]F
  • CID 10197804 with Canonical SMILES: [F-].[F-].[Sn+2]

Both entries have the same molecular formula (F2Sn) and InChIKey, but differ in their structural representation.

Example 2: Another case involves the compound with the name Sulfaguanol, which appears under three different CIDs:

  • CID 65756 - https://pubchem.ncbi.nlm.nih.gov/compound/65756
  • CID 9571041 - https://pubchem.ncbi.nlm.nih.gov/compound/9571041
  • CID 5464101 - https://pubchem.ncbi.nlm.nih.gov/compound/5464101

In this case, all three entries share the same molecular formula, IUPAC name, InChIKey, and chemical properties.

These examples raise a question: why does PubChem list the same chemical entity under different CIDs, given that their core properties are identical? Is this differentiation based on structural representation (ionic vs covalent for the first example), or is there another reason for the distinction? How does PubChem generally handle such cases where the differences are mainly in representation rather than in chemical composition?
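For anyone who wants to reproduce the comparison, here is a minimal sketch that pulls a few computed properties for a set of CIDs through PubChem's PUG REST property endpoint and groups the CIDs by InChIKey. The helper names (`property_url`, `fetch_properties`, `group_by_inchikey`, `compare`) are my own, and the property list is just an assumption about what is worth comparing; I have not included any expected output.

```python
import json
from collections import defaultdict
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"
# Properties to compare; chosen for illustration only.
PROPERTIES = ("MolecularFormula", "InChIKey", "IUPACName")


def property_url(cids, properties=PROPERTIES):
    """Build a PUG REST URL that returns the given properties as JSON."""
    cid_list = ",".join(str(c) for c in cids)
    prop_list = ",".join(properties)
    return f"{PUG_REST}/compound/cid/{cid_list}/property/{prop_list}/JSON"


def fetch_properties(cids):
    """Download the property records (one dict per CID). Needs network access."""
    with urlopen(property_url(cids)) as resp:
        return json.load(resp)["PropertyTable"]["Properties"]


def group_by_inchikey(records):
    """Map each InChIKey to the list of CIDs that report it."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec.get("InChIKey")].append(rec["CID"])
    return dict(groups)


def compare(cids):
    """Print which CIDs share an InChIKey."""
    for key, members in group_by_inchikey(fetch_properties(cids)).items():
        print(key, members)
```

Calling `compare([24550, 10197804])` or `compare([65756, 9571041, 5464101])` shows whether the CIDs in each example really end up under a single InChIKey.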

Thank you for any insights or explanations you can provide on this matter.

PubChem • 865 views

ChatGPT suggested the following two reasons for different CIDs. See if they apply.

  • PubChem distinguishes between different stereochemical forms of a compound by assigning them different CIDs.
  • Different representations or formulations of the same chemical entity, such as salts or hydrates, may also be assigned different CIDs in PubChem.

Thank you for the answer! However, my background is not in chemistry, so I am unable to fully verify these points on my own.

I am working on a project where the goal is to find PubChem CIDs for as many ChEBI IDs as possible. I have identified UniChem, ChEBI, and PubChem as potential data sources, but in the data provided by UniChem I have noticed that thousands of ChEBI IDs are mapped to more than one CID. I want to understand why this happens so that I can choose the most suitable data source.

Any advice on how to best match ChEBI IDs with the correct PubChem CIDs would be really helpful.
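To make the one-to-many problem concrete, here is a rough sketch of how the mappings could be separated once they are loaded as (ChEBI ID, CID) pairs. The input format is an assumption on my part (for example, rows read from a UniChem export), and the function names are hypothetical.

```python
from collections import defaultdict


def cids_per_chebi(pairs):
    """Collect the distinct CIDs seen for each ChEBI ID.

    `pairs` is any iterable of (chebi_id, cid) tuples; this shape is an
    assumption about how the mapping file has been parsed.
    """
    mapping = defaultdict(set)
    for chebi_id, cid in pairs:
        mapping[chebi_id].add(int(cid))
    return {chebi_id: sorted(cids) for chebi_id, cids in mapping.items()}


def split_by_ambiguity(mapping):
    """Separate ChEBI IDs with exactly one CID from those with several."""
    unique = {k: v[0] for k, v in mapping.items() if len(v) == 1}
    ambiguous = {k: v for k, v in mapping.items() if len(v) > 1}
    return unique, ambiguous
```

The `unique` part can be used directly, while the `ambiguous` part is the list that needs a rule for picking one CID, or a manual check.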


If you have a ChatGPT account (you can get a free one) then ask it to "match CHEBI IDs with the correct PubChem CIDs". That should get you moving until someone answers.
