I'm able to find genes containing a given domain with BioMart. However, I need to know the start and end positions of that specific domain in those genes. How can I do that?
Many thanks!
You can get this information using the BioMart interface of InterPro. Use the following attributes to retrieve domain-specific information for each protein match:
Match Status
Match Start Position
Match Stop Position
Match Score
Try a sample query here and check the screenshot of a sample query here. If you are new to InterPro, more about InterPro is available here.
You have two options here. 1. Use an ID mapping service to map between UniProt and Ensembl identifiers. 2. Use the Ensembl API to retrieve the domain information. See the tutorial http://useast.ensembl.org/info/docs/api/core/core_tutorial.html and its Protein Features section http://useast.ensembl.org/info/docs/api/core/core_tutorial.html#translations
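For the second option, here is a minimal Python sketch. Note that it does not use the Perl core API from the tutorial; it swaps in the Ensembl REST service (the overlap/translation endpoint) as an alternative route to the same protein features. The function names are my own, and I am assuming each returned record carries "type", "id", "start" and "end" fields, so check a real response before relying on it.

```python
import json
import urllib.request

REST_SERVER = "https://rest.ensembl.org"


def fetch_protein_features(translation_id):
    """Fetch protein features for one Ensembl translation ID (ENSP...).

    Needs network access. Returns the decoded JSON, expected to be a list
    of dicts, one per feature.
    """
    url = (f"{REST_SERVER}/overlap/translation/{translation_id}"
           "?feature=protein_feature")
    request = urllib.request.Request(
        url, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def domain_coordinates(features, source=None):
    """Return (source, domain_id, start, end) tuples sorted by position.

    `features` is a list of dicts as returned above. The field names
    "type", "id", "start" and "end" are assumptions about the response.
    `source` optionally restricts output to one domain database,
    compared case-insensitively (e.g. "pfam").
    """
    rows = []
    for feature in features:
        feature_source = feature.get("type") or ""
        if source is not None and feature_source.lower() != source.lower():
            continue
        rows.append((feature_source, feature.get("id"),
                     int(feature["start"]), int(feature["end"])))
    return sorted(rows, key=lambda row: (row[2], row[3]))


# Usage (requires network; substitute your own translation ID):
# features = fetch_protein_features("ENSP...")
# for row in domain_coordinates(features, source="pfam"):
#     print(*row, sep="\t")
```

The coordinates you get this way are relative to the Ensembl protein, so no UniProt mapping is needed.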
Depending on which domain database the hit came from, you could go to the web page of that database and see if they offer the data. Many of them do, e.g. Pfam. However, the domain databases may give coordinates relative to the UniProt entries, not to the Ensembl proteins. Depending on what your plans are, this might be a problem. BioMart does offer UniProt links, though, which are useful for making the connection.
I disagree with one aspect of this answer. Ensembl does provide domain architecture information, including start and end positions for specific domains. For example, see this particular gene: http://useast.ensembl.org/Homo_sapiens/Transcript/ProteinSummary?db=core;g=ENSG00000203710;r=1:207669492-207813992;t=ENST00000400960 If you mouse over a Pfam / SMART / InterPro domain, you can see the start and stop positions of that domain.
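The same start and end values can probably be pulled in bulk from the Ensembl BioMart itself. Below is a rough Python sketch that builds a BioMart XML query and parses the tab-separated result. The dataset name and the filter/attribute names ("pfam", "pfam_start", "pfam_end", "ensembl_peptide_id") are what I believe the Ensembl genes mart uses, but treat them as assumptions and check them against the attribute list of your mart. The helper names are made up for this example.

```python
import urllib.parse
import urllib.request
from xml.sax.saxutils import quoteattr

BIOMART_URL = "https://www.ensembl.org/biomart/martservice"

# Column order of the result; names are assumed Ensembl mart attributes.
ATTRIBUTES = ["ensembl_gene_id", "ensembl_peptide_id",
              "pfam", "pfam_start", "pfam_end"]


def build_domain_query(pfam_id, dataset="hsapiens_gene_ensembl"):
    """Build the BioMart XML for genes carrying one Pfam domain."""
    attribute_xml = "".join(
        f"<Attribute name={quoteattr(name)} />" for name in ATTRIBUTES)
    return (
        '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>'
        '<Query virtualSchemaName="default" formatter="TSV" '
        'header="0" uniqueRows="1">'
        f'<Dataset name={quoteattr(dataset)} interface="default">'
        f'<Filter name="pfam" value={quoteattr(pfam_id)} />'
        f"{attribute_xml}"
        "</Dataset></Query>"
    )


def run_query(xml):
    """Send the query to the mart service (needs network); return TSV text."""
    url = BIOMART_URL + "?" + urllib.parse.urlencode({"query": xml})
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def rows_for_domain(tsv_text, pfam_id):
    """Keep only rows for the requested domain.

    The filter selects genes that have the domain, but the result can
    still list the other Pfam domains of those proteins, so rows are
    filtered again here.
    """
    rows = []
    for line in tsv_text.splitlines():
        fields = line.split("\t")
        if len(fields) != len(ATTRIBUTES) or fields[2] != pfam_id:
            continue
        gene, peptide, domain, start, end = fields
        rows.append((gene, peptide, domain, int(start), int(end)))
    return rows


# Usage (requires network):
# tsv = run_query(build_domain_query("PF00084"))
# for row in rows_for_domain(tsv, "PF00084"):
#     print(*row, sep="\t")
```

A protein with several copies of the domain gives one row per copy, each with its own start and end.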
Which BioMart API?