I have been looking for a BioJava class that can look up protein information, such as its ID and organism name, from a protein sequence (e.g. HELHYNILLCGNLCLPLQDFRAQIIKYVFMHSRKDINWMN). The class UniprotProxySequenceReader does the opposite (you need to enter the UniProt ID). Is there any way to search by the protein sequence?
Are you asking whether sequence similarity search tools exist?
No, I am asking how to find the organism name.
Thanks
After a sequence similarity search, can't you get the organism name straight from the hit header, or by mapping the hit ID to an organism name?
So, how can I get the organism name(s) for the protein sequence string?
Basically, you either parse it straight from your results (i.e. if the species name is in the reference database headers), or you map it from the identifiers in those headers.
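For the first option, here is a minimal sketch in plain Java. It assumes your hits come from a UniProt-formatted FASTA database, where the header carries the species in an `OS=` field followed by fields such as `OX=`, `GN=`, `PE=` or `SV=`. The class name `HeaderOrganism` and the method `organismFrom` are made up for illustration and are not part of BioJava. Other reference databases use different header layouts, so the pattern would need adjusting.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a BioJava class): pulls the organism name
// out of a UniProt-style FASTA header line.
class HeaderOrganism {

    // Capture everything after "OS=" up to the next known field tag
    // (OX=, GN=, PE=, SV=) or the end of the line.
    private static final Pattern OS_FIELD =
            Pattern.compile("\\bOS=(.+?)(?=\\s+(?:OX|GN|PE|SV)=|$)");

    static Optional<String> organismFrom(String header) {
        Matcher m = OS_FIELD.matcher(header);
        // Empty result when the header has no OS= field at all.
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }

    public static void main(String[] args) {
        // Example header in the UniProt FASTA layout.
        String header = ">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha "
                + "OS=Homo sapiens OX=9606 GN=HBA1 PE=1 SV=2";
        // Prints "Homo sapiens"
        System.out.println(organismFrom(header).orElse("unknown"));
    }
}
```

If the header has no species field, the method returns an empty Optional. In that case you fall back to the second option: take the accession from the header (P69905 in the example) and look the organism up by ID, which is the direction UniprotProxySequenceReader already works in.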