PDB glossary for three-letter codes
3.8 years ago
noodle ▴ 590

Can someone point me to a glossary of all the three-letter codes one might encounter in a .cif file for RNA and DNA?

There are some standard ones, as outlined in the CIF dictionary, but I keep running into many more...

PDB cif R bio3d Rpdb • 741 views
3.8 years ago
noodle ▴ 590

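In case someone comes back to this, below is something in R that works. It downloads the chemical component dictionary and pulls the ID, name, type and pdbx_type of every component into a data frame: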

library(stringr)  # provides str_remove(), str_remove_all() and str_trim()

# Download the component dictionary into the working directory
cif.ref.path <- file.path(getwd(), "Components-pub.cif")
download.file("http://ligand-expo.rcsb.org/dictionaries/Components-pub.cif", cif.ref.path)

cif.ref.data <- readLines(cif.ref.path)

# Each component starts with a "data_<ID>" line; anchor the match to the line start
data.grep <- grep("^data_", cif.ref.data)
# A block ends just before the next "data_" line; the last block ends at the end of the file
block.end <- c(data.grep[-1] - 1, length(cif.ref.data))

# Return the value of a single-line "_chem_comp.<field>" item with double quotes removed.
# Values that CIF wraps onto following lines come back empty.
get.field <- function(block, field) {
  line <- block[grep(paste0("^_chem_comp\\.", field, "(\\s|$)"), block)][1]
  value <- str_trim(str_remove(line, "^\\S+\\s*"))
  str_remove_all(value, '"')
}

fields <- c("id", "name", "type", "pdbx_type")

# One row per component, one column per field
all.data <- t(sapply(seq_along(data.grep), function(i) {
  this.cif <- cif.ref.data[data.grep[i]:block.end[i]]
  vapply(fields, function(f) get.field(this.cif, f), character(1))
}))

all.data.df <- data.frame(all.data)
colnames(all.data.df) <- paste0("chem_comp.", fields)
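
Since the question was specifically about RNA and DNA, here is a rough sketch of how the table could be narrowed down to nucleic-acid components. It assumes the chem_comp.type values for those components contain "RNA" or "DNA" (e.g. "RNA LINKING", "DNA LINKING"). The small sample.df below is only an illustrative stand-in for all.data.df, not the real download, and the names is.nucleic, nucleic.df and code.to.name are made up for this example.

```r
# Toy stand-in for all.data.df: a few illustrative rows with the same column names
sample.df <- data.frame(
  chem_comp.id   = c("A", "DA", "ALA", "HOH"),
  chem_comp.name = c("ADENOSINE-5'-MONOPHOSPHATE",
                     "2'-DEOXYADENOSINE-5'-MONOPHOSPHATE",
                     "ALANINE",
                     "WATER"),
  chem_comp.type = c("RNA LINKING", "DNA LINKING", "L-PEPTIDE LINKING", "NON-POLYMER"),
  stringsAsFactors = FALSE
)

# Keep rows whose type mentions RNA or DNA (case-insensitive)
is.nucleic <- grepl("RNA|DNA", sample.df$chem_comp.type, ignore.case = TRUE)
nucleic.df <- sample.df[is.nucleic, ]

# Named lookup vector: component code -> full name
code.to.name <- setNames(nucleic.df$chem_comp.name, nucleic.df$chem_comp.id)
```

With the real table, swapping sample.df for all.data.df should give a glossary you can index by code, e.g. code.to.name[["DA"]].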