Hi all,
Recently I have been working with some human gene data, and I want to know whether the genes I am handling are conserved.
Does any database provide conservation scores for human genes?
I need conservation scores for all human genes.
I know the UCSC Table Browser provides sequence conservation scores such as phastCons and phyloP, but I can't find a per-gene conservation score.
Can anyone show me how to get it?
The format I want is:
Gene ID 1 (Ensembl ID or RefSeq ID) | conservation score
Gene ID 2 (Ensembl ID or RefSeq ID) | conservation score
Thanks
Hey Yu,
Did you eventually obtain the phastCons conservation score for each gene? I would be interested in getting the data for a research project.
Looking forward to hearing from you soon,
All the best,
Jeff O.
Hi Jeff,
Sorry for the delay in replying. I downloaded the UCSC conservation scores (primates phastCons46way) and calculated the mean value for each gene. @Alex Reynolds also describes a good way to calculate the gene conservation score in this thread: Obtaining phastCons conservation score for every gene in the Human genome
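For anyone who wants to try the same thing, here is a minimal sketch of the averaging step. It is not my original script. It assumes the scores are in UCSC fixedStep wiggle text (1-based positions, one score per line after each "fixedStep" header), and the function names parse_fixedstep and mean_score as well as the toy data are made up for illustration.

```python
import io

def parse_fixedstep(handle):
    """Yield (chrom, position, score) for each value in a fixedStep wiggle stream.

    Positions are 1-based, as in the wiggle format.
    """
    chrom, pos, step = None, None, 1
    for line in handle:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        if line.startswith("fixedStep"):
            # Header looks like: fixedStep chrom=chr1 start=101 step=1
            fields = dict(f.split("=", 1) for f in line.split()[1:])
            chrom = fields["chrom"]
            pos = int(fields["start"])
            step = int(fields.get("step", 1))
        else:
            yield chrom, pos, float(line)
            pos += step

def mean_score(handle, chrom, start, end):
    """Mean score over chrom:start-end (1-based, inclusive).

    Bases without a score are skipped; returns None if no base is scored.
    """
    total, n = 0.0, 0
    for c, p, s in parse_fixedstep(handle):
        if c == chrom and start <= p <= end:
            total += s
            n += 1
    return total / n if n else None

# Toy data, not real phastCons values.
demo = io.StringIO(
    "fixedStep chrom=chr1 start=101 step=1\n0.2\n0.4\n0.6\n"
    "fixedStep chrom=chr1 start=201 step=1\n1.0\n"
)
print(mean_score(demo, "chr1", 101, 103))
```

This rescans the whole stream for every interval, so for many genes you would want to load each chromosome once or use an indexed format instead.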
Thanks for your answer. But how did you obtain the corresponding gene symbol for each gene (only the start location is indicated)? Further, there are more than 20,000 genes in the primates phastCons46way, meaning that you did not obtain the scores only for the coding genes (which are of interest), right?
Looking forward to hearing from you soon,
Many thanks, Jeff O.
Hi,
For each gene, the transcription start site and transcription end site were taken from the Ensembl annotation file (GTF), so I only calculated the mean value between the transcription start site and the transcription end site.
If memory serves me right, within the scope of my research I calculated the phastCons46way score for no more than 2,000 genes.
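Roughly, the gene-level step could look like the sketch below. Again, this is an illustration and not my original code. It assumes an Ensembl-style GTF with "gene" records and a gene_biotype attribute (older releases may store the biotype differently), and it assumes the per-base scores have already been loaded into a dict keyed by (chromosome, position). The names gene_intervals and gene_means and the toy gene IDs are made up. Note that Ensembl names chromosomes "1" while UCSC uses "chr1", so the names have to be harmonized first.

```python
import io
import re

def gene_intervals(handle, biotype=None):
    """Yield (gene_id, chrom, start, end) for 'gene' records in a GTF stream.

    If biotype is given (e.g. "protein_coding"), keep only genes whose
    gene_biotype attribute matches it.
    """
    for line in handle:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue
        # Attributes look like: gene_id "X"; gene_name "Y"; gene_biotype "Z";
        attrs = dict(re.findall(r'(\S+) "([^"]*)"', cols[8]))
        if biotype and attrs.get("gene_biotype") != biotype:
            continue
        yield attrs["gene_id"], cols[0], int(cols[3]), int(cols[4])

def gene_means(genes, scores):
    """Return {gene_id: mean score over the gene span, or None if unscored}.

    scores is assumed to map (chrom, 1-based position) -> float.
    """
    result = {}
    for gene_id, chrom, start, end in genes:
        vals = [scores[(chrom, p)] for p in range(start, end + 1)
                if (chrom, p) in scores]
        result[gene_id] = sum(vals) / len(vals) if vals else None
    return result

# Toy annotation and toy scores, not real data.
toy_gtf = io.StringIO(
    '#!toy-header\n'
    '1\tensembl\tgene\t101\t104\t.\t+\t.\t'
    'gene_id "GENE_A"; gene_name "toyA"; gene_biotype "protein_coding";\n'
    '1\tensembl\tgene\t201\t202\t.\t-\t.\t'
    'gene_id "GENE_B"; gene_name "toyB"; gene_biotype "lincRNA";\n'
)
toy_scores = {("1", 101): 0.5, ("1", 102): 1.0, ("1", 201): 0.25}

means = gene_means(gene_intervals(toy_gtf, biotype="protein_coding"), toy_scores)
for gene_id, score in means.items():
    print(f"{gene_id} | {score}")
```

Filtering on the biotype is one way to restrict the output to coding genes, and the loop at the end prints the "Gene ID | conservation score" layout asked for in the question.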