About Human Protein Coding Gene Copy Number Across Genome
12.2 years ago
Free Man ▴ 180

I want to obtain copy number information for human protein-coding genes: for each gene, whether it is a single-copy gene or is duplicated across chromosomes. How should I do this? Is there a database that covers all human protein-coding genes? Thank you!

12.2 years ago
VS ▴ 740

Go to Ensembl BioMart, then:

  1. Choose the Human genes dataset.
  2. Set the filter to Gene --> Gene type --> protein_coding.
  3. Set the attributes to Homologs --> Paralogs --> Homology Type ++

The ++ above means that you can also select other paralogy attributes, such as % identity. Download the results, then clean them up to remove redundancy (e.g. paralogs X and Y will be reported twice: geneX-geneY and geneY-geneX) and filter further as needed.
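The clean-up step could look something like the sketch below. It is only an illustration, not part of the original answer: it assumes a tab-separated BioMart export with one column for the gene ID and one for the paralog ID, and the column names used here ("Gene stable ID", "Human paralogue gene stable ID") are assumptions that should be checked against the header of your own download. The gene IDs in the sample are placeholders, not real Ensembl IDs.

```python
import csv
import io

# Column names are assumptions; adjust them to match your export's header.
GENE_COL = "Gene stable ID"
PARALOG_COL = "Human paralogue gene stable ID"


def summarize_paralogs(tsv_text, gene_col=GENE_COL, paralog_col=PARALOG_COL):
    """Collapse reciprocal paralog rows and label each queried gene.

    Returns (pairs, status):
      pairs  - sorted list of unordered (geneX, geneY) tuples, each once
      status - dict mapping every gene in gene_col to
               "duplicated" (at least one paralog) or "single-copy"
    """
    pairs = set()
    status = {}
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        gene = (row.get(gene_col) or "").strip()
        paralog = (row.get(paralog_col) or "").strip()
        if not gene:
            continue  # skip malformed rows with no gene ID
        status.setdefault(gene, "single-copy")
        if paralog:
            status[gene] = "duplicated"
            # Sorting the two IDs makes geneX-geneY and geneY-geneX identical.
            pairs.add(tuple(sorted((gene, paralog))))
    return sorted(pairs), status


# Placeholder data standing in for a downloaded BioMart table.
SAMPLE = (
    "Gene stable ID\tHuman paralogue gene stable ID\n"
    "GENE_A\tGENE_B\n"
    "GENE_B\tGENE_A\n"
    "GENE_C\t\n"
)

pairs, status = summarize_paralogs(SAMPLE)
for gene_x, gene_y in pairs:
    print(gene_x, gene_y)
for gene, label in sorted(status.items()):
    print(gene, label)
```

For a real download, read the file contents into a string (or adapt the function to take a file handle) and pass it in place of SAMPLE. Only genes in the first column are labeled, so with the protein_coding filter applied the labels cover protein-coding genes only.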


This works if you just want to know whether genes have paralogs in the genome, which I assume is what the OP wants to address. Adding information on copy-number variants would be more complicated.

12.2 years ago
Free Man ▴ 180

I just found that http://dgd.genouest.org/ provides duplicated genes by group, and that http://goods.ibms.sinica.edu.tw/DNVs/download.html identifies over 10% of human genes as associated with duplicated gene loci (DGL).
