Hello everyone!
I’ve downloaded a file from the Ensembl BioMart here: https://www.ensembl.org/biomart/martview/4cab87e631cda01b9def14f09cc2021f
The chosen database is “Ensembl Genes 95” and the chosen dataset is “Human Genes (GRCh38.p12)”. The chosen attributes are:
- Gene stable ID
- Transcript stable ID
- Protein stable ID
- hmmpanther ID
- hmmpanther start
- hmmpanther end
This gives me a start and end position for each Panther ID. The same goes for InterPro when I download the latest release (protein2ipr.dat.gz) from https://www.ebi.ac.uk/interpro/download.html.
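For reference, a minimal Python sketch of how the start and end positions could be read from such a BioMart export. It assumes the file is tab-separated and that the header row uses the attribute names listed above; the function name `panther_hits` and the example rows are made up for illustration and are not real Ensembl data.

```python
import csv
import io
from collections import defaultdict


def panther_hits(tsv_handle):
    """Group (protein ID, start, end) tuples by Panther ID.

    Assumes a tab-separated BioMart export whose header row uses the
    attribute names as column names (an assumption, check your file).
    """
    hits = defaultdict(list)
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    for row in reader:
        panther_id = row["hmmpanther ID"]
        if not panther_id:
            # Proteins without a Panther match have empty fields; skip them.
            continue
        hits[panther_id].append(
            (
                row["Protein stable ID"],
                int(row["hmmpanther start"]),
                int(row["hmmpanther end"]),
            )
        )
    return dict(hits)


# Made-up rows for illustration only (not real Ensembl identifiers).
example = (
    "Gene stable ID\tTranscript stable ID\tProtein stable ID\t"
    "hmmpanther ID\thmmpanther start\thmmpanther end\n"
    "ENSG_FAKE1\tENST_FAKE1\tENSP_FAKE1\tPTHR18929\t10\t250\n"
    "ENSG_FAKE2\tENST_FAKE2\tENSP_FAKE2\t\t\t\n"
)
result = panther_hits(io.StringIO(example))
print(result)
```

With a real export, the same function could be called on an open file handle, for example `panther_hits(open("mart_export.txt", newline=""))`, where the file name is again only a placeholder.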
My problem is that I can’t seem to find this information in the files provided by Panther DB (http://pantherdb.org/downloads/index.jsp).
Does anyone know where I can find the start and end positions for a Panther ID?
Thank you for any suggestions!!!
Are you looking to find the start/stop of where the panther domains map on the protein or which part of the panther domain has been mapped to the protein?
I'm looking for the start/stop of where the Panther domains map on the protein.
Did you not mention you have that?
What am I overlooking?
I would like to retrieve this information from the Panther database itself, rather than from another source, in order to be sure that the information is valid. I'm sorry if that wasn't clear!
PantherDB does not work like that. It assigns genes to families, from which HMMs are then built that can be used to screen 'new' sequences. So a sequence either belongs to a given Panther ID or it does not (that is, the assignment is always made on a whole-gene basis).
For a given Panther ID, though, you can look up which genes are assigned to it, e.g. PTHR18929.
Okay, thank you for your help!