I would like to know if there is any way to map a list of drug/chemical names to their corresponding CAS RNs when only the names of the drugs/chemicals are known.
PubChem has these in the CID synonym lists. They are somewhat noisy because the name and RN mappings made by different secondary sources (in the SIDs) are not necessarily concordant with SciFinder directly (salt-parent cross-overs are in fact common). But why not use InChIKeys? These are openly provenanced.
You can get some background here http://www.slideshare.net/cdsouthan/southan-bio-it2012drugnames
To be more specific, examine this file: ftp://ftp.ncbi.nlm.nih.gov/pubchem/Compound/Extras/CID-Synonym-unfiltered.gz
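As a rough sketch of how that file could be used for the name-to-RN lookup: the Python below streams tab-separated "CID, synonym" lines, and for every CID that has one of your names among its synonyms it collects the synonyms that look like CAS RNs and pass the CAS check-digit test. The function names (`is_cas_rn`, `map_names_to_cas`, `map_file`) are made up for illustration. It assumes the file has one CID and one synonym per line, separated by a tab, with all lines for a CID adjacent to each other; check this against your download before relying on it. Given the noise mentioned above, treat the output as candidate RNs, not as a curated mapping.

```python
import gzip
import re
from itertools import groupby

# CAS RN shape: 2-7 digits, hyphen, 2 digits, hyphen, 1 check digit.
CAS_PATTERN = re.compile(r"(\d{2,7})-(\d{2})-(\d)")


def is_cas_rn(text):
    """Return True if text has the CAS RN shape and a valid check digit."""
    match = CAS_PATTERN.fullmatch(text)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(i * int(d) for i, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def parse(lines):
    """Yield (cid, synonym) pairs from tab-separated lines (assumed layout)."""
    for line in lines:
        cid, sep, synonym = line.rstrip("\n").partition("\t")
        if sep:
            yield cid, synonym.strip()


def map_names_to_cas(lines, names):
    """Map each matched name (lower-cased) to a set of candidate CAS RNs.

    Assumes lines for the same CID are adjacent, so one CID's synonyms
    can be grouped without loading the whole file into memory.
    """
    wanted = {name.strip().lower() for name in names}
    result = {}
    for cid, group in groupby(parse(lines), key=lambda pair: pair[0]):
        synonyms = [synonym for _, synonym in group]
        hits = {s.lower() for s in synonyms} & wanted
        if hits:
            cas_numbers = {s for s in synonyms if is_cas_rn(s)}
            for name in hits:
                result.setdefault(name, set()).update(cas_numbers)
    return result


def map_file(path, names):
    """Convenience wrapper for a local gzipped copy of the synonym file."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return map_names_to_cas(handle, names)
```

You would call it as `map_file("CID-Synonym-unfiltered.gz", ["aspirin", "water"])` on a local copy. A name that maps to more than one RN (or to none) is a sign that it needs manual checking, for example for the salt-parent cross-overs.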
If your drug is in the Therapeutic Target Database, then you can use their file of cross-references to get the CAS number (along with other identifiers).
Unfortunately, I never found anything that did that. I ended up using UniProt accessions as my 'true' ID, and mapped TTD target IDs and NCBI gene IDs to UniProt accessions (along with all other IDs I needed). Not perfect, but going between UniProt and NCBI gene IDs was relatively simple.
The question is which mapping you trust more, or which has greater coverage. Do you trust UniProt to record the NCBI IDs more accurately and completely, or NCBI to record the UniProt accessions more accurately and completely? The way I did it was to go from all my sources to UniProt. In the case of NCBI gene IDs, I believe I used the gene2accession file (ftp://ftp.ncbi.nih.gov/gene/DATA/) to map from Gene IDs to UniProt accessions.
One good source is Wikipedia, which has been working with Chemical Abstracts, resulting in the http://commonchemistry.org/ project.
Singh, what numbers are you talking about, and what do you want to do with the CAS RNs when you've got them?