Hi!
I am a bit confused. I extracted a list of Pfam domains from the UCSC tables for hg19, and when I overlapped my genomic coordinates with it to see which Pfam domains they fall in, only a few of them overlapped. As an example, for EZH2 I got this Pfam domain information:
#hg19.refGene.chrom hg19.refGene.cdsStart hg19.refGene.cdsEnd hg19.refGene.name2 hg19.knownToPfam.name hg19.knownToPfam.value hg19.knownToRefSeq.name hg19.knownToRefSeq.value
chr7 148504737 148544390 EZH2 uc003wfb.2 PF00856 uc003wfb.2 NM_004456
chr7 148504737 148544390 EZH2 uc003wfc.2,uc022aov.1, PF00856, uc022aov.1,uc003wfc.2, NM_152998,
chr7 148504737 148544390 EZH2 uc003wfd.2 PF00856 uc003wfd.2,uc011kui.2,uc011kuj.2, NM_001203247,
chr7 148504737 148544390 EZH2 uc011kuh.2 PF00856 uc011kuh.2 NM_001203248
chr7 148504737 148544390 EZH2 uc011kug.2 PF00856 uc011kug.2 NM_001203249
Now, when I overlap this with my genomic coordinate,
chr7 148514466
it clearly falls within this SET domain from Pfam (PF00856). But this variant leads to an amino acid change at position 420, and when I query the PDB database, the SET domain of EZH2 is listed as residues 521-746. http://www.rcsb.org/pdb/explore/explore.do?structureId=4MI5
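To make the overlap test concrete, here is a minimal sketch of the check described above. It is only an illustration, not necessarily the tool actually used. The function name `position_in_interval` is hypothetical, and the coordinate conventions are assumptions: the variant position is treated as 1-based, and the table interval as 0-based, half-open (the usual convention for UCSC tables). Note that the interval tested is taken from the `hg19.refGene.cdsStart` and `hg19.refGene.cdsEnd` columns of the table.

```python
def position_in_interval(pos_1based, start_0based, end_exclusive):
    """Return True if a 1-based position lies inside a 0-based, half-open interval.

    Assumption: the interval follows the UCSC table convention
    (0-based start, exclusive end) and the position is 1-based.
    """
    return start_0based < pos_1based <= end_exclusive


# Values copied from the EZH2 rows of the table above
# (hg19.refGene.cdsStart and hg19.refGene.cdsEnd columns).
cds_start, cds_end = 148504737, 148544390

# The genomic coordinate being tested (chr7).
variant_pos = 148514466

print(position_in_interval(variant_pos, cds_start, cds_end))
```

All five EZH2 rows carry the same start and end values, so the result is the same whichever row is used.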
What could be the reason for this discrepancy, and which data should I trust?
Please help.
Thank you