Difference between the Pfam domain information from UCSC and the PDB site
10.0 years ago
ChIP ▴ 600

Hi!

I am a bit confused. I extracted a list of Pfam domains from the UCSC tables for hg19, and when I overlapped my genomic coordinates with them to see which Pfam domains I hit, a few of them were present in the overlap. As an example, for EZH2 I have this Pfam domain information:

#hg19.refGene.chrom    hg19.refGene.cdsStart    hg19.refGene.cdsEnd    hg19.refGene.name2    hg19.knownToPfam.name    hg19.knownToPfam.value    hg19.knownToRefSeq.name    hg19.knownToRefSeq.value
chr7 148504737 148544390 EZH2 uc003wfb.2 PF00856 uc003wfb.2 NM_004456
chr7 148504737 148544390 EZH2 uc003wfc.2,uc022aov.1, PF00856, uc022aov.1,uc003wfc.2, NM_152998,
chr7 148504737 148544390 EZH2 uc003wfd.2 PF00856 uc003wfd.2,uc011kui.2,uc011kuj.2, NM_001203247,
chr7 148504737 148544390 EZH2 uc011kuh.2 PF00856 uc011kuh.2 NM_001203248
chr7 148504737 148544390 EZH2 uc011kug.2 PF00856 uc011kug.2 NM_001203249

Now, when I overlap this with my genomic coordinate,

chr7 148514466

it clearly falls within this Pfam SET domain. However, this variant leads to an amino acid change at position 420, and when I query the PDB database, the SET domain for EZH2 is given as residues 521-746: http://www.rcsb.org/pdb/explore/explore.do?structureId=4MI5
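For reference, the two comparisons described above can be sketched in a few lines of Python. This is only a minimal illustration, with some assumptions: the helper names (`Interval`, `contains`, `residue_in_range`) are made up for this sketch, the table interval is treated as 0-based half-open (the usual UCSC table convention), and the query position is assumed to be 1-based. Note that the interval columns in the table are `hg19.refGene.cdsStart` and `hg19.refGene.cdsEnd`, so the genomic check tests the coding span recorded for the transcript, not an interval specific to the PF00856 domain.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Genomic interval as stored in UCSC tables (assumed 0-based, half-open)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def contains(interval: Interval, chrom: str, pos_1based: int) -> bool:
    """Return True if a 1-based position lies inside a 0-based half-open interval."""
    return interval.chrom == chrom and interval.start < pos_1based <= interval.end


def residue_in_range(residue: int, first: int, last: int) -> bool:
    """Return True if a residue number lies in an inclusive residue range."""
    return first <= residue <= last


# cdsStart/cdsEnd for EZH2, copied from the table rows in the post.
ezh2_cds = Interval("chr7", 148504737, 148544390)

# Genomic check: is the variant position inside the cdsStart..cdsEnd span?
variant_in_cds = contains(ezh2_cds, "chr7", 148514466)

# Protein check: is residue 420 inside the SET domain range quoted from PDB?
residue_in_set = residue_in_range(420, 521, 746)

print("variant inside cdsStart..cdsEnd:", variant_in_cds)
print("residue 420 inside 521-746:", residue_in_set)
```

With the values from the post, the first check passes and the second does not, which is the discrepancy in question.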

What could be the reason for this discrepancy, and which data should I trust?

Please help.

Thank you

Protein domains SNPs • 2.4k views