Hi,
I need to run a reciprocal best hit (RBH) analysis with blastp for a long list of proteins, for which I only have the UniProtKB/TrEMBL identifiers. For each of these proteins I have to find the orthologs in a given list of species.
Previously, when I had only one protein, I used the online tool at NCBI, where I can enter the identifier and also restrict the search to certain species. With many proteins, however, this is no longer feasible.
I downloaded BLAST, but I cannot download any databases (memory problem, and I need the refseq_protein database).
I'm familiar with R and would like to do this in R, but by sending online queries to NCBI blastp.
I searched a lot and was not able to find any hints.
I installed orthologr, biomaRt and some other packages, but they only use local databases.
I would appreciate it a lot if somebody could show me how to send online queries from R. I have R code, and somewhere in that code I need to call blastp to find orthologs of my proteins among the given list of species.
Thanks and looking forward.
Maah
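For anyone reading along, the reciprocal best hit step itself is independent of how the BLAST searches are run. Below is a minimal sketch in Python (standard library only), assuming you already have two result sets in BLAST tabular format (-outfmt 6 with the default 12 columns): one for your proteins against the target species, and one for the resulting hits searched back against the source proteome. The function names are hypothetical, and the sketch assumes that identifiers are written the same way in both result sets, which you may need to normalise first.

```python
import csv
import io


def best_hits(tabular_text):
    """Return {query_id: (subject_id, bitscore)} with the top-scoring hit per query.

    Assumes BLAST tabular output (-outfmt 6, default columns), where
    column 1 is the query, column 2 the subject and column 12 the bit score.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def reciprocal_best_hits(forward_text, reverse_text):
    """Return sorted (query, subject) pairs that are each other's best hit."""
    forward = best_hits(forward_text)
    reverse = best_hits(reverse_text)
    pairs = []
    for query, (subject, _score) in forward.items():
        # Keep the pair only if the subject's best hit points back to the query.
        if subject in reverse and reverse[subject][0] == query:
            pairs.append((query, subject))
    return sorted(pairs)
```

Ties in bit score are resolved by keeping the first hit seen, which is a simplification; a real analysis may also want e-value or coverage cutoffs.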
Thanks a lot for your response.
Is it also possible to give a protein sequence identifier instead of a FASTA file? In the web tool this is possible, for example if you enter:
Is there any way to specify the species as well?
I think it could work through the -entrez_query parameter, but I haven't tried.
I figured this out, no worries about this.
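In case it helps others, here is a rough sketch of how the species restriction could be assembled for a remote search, so that no local database is needed. It assumes the BLAST+ blastp executable is installed and uses its -remote, -db, -query, -entrez_query, -outfmt and -out options. The helper names, file names and species list are made up for illustration. The code only builds the command; running it (for example with subprocess.run) sends the search to the NCBI servers, so keep the number of requests reasonable.

```python
import shlex


def build_entrez_query(species):
    """Combine species names into an Entrez organism filter joined with OR."""
    return " OR ".join(f"{name}[Organism]" for name in species)


def build_remote_blastp_command(query_fasta, species, out_path, db="refseq_protein"):
    """Return the argument list for a remote blastp search limited to `species`."""
    return [
        "blastp",
        "-remote",                      # run the search on the NCBI servers
        "-db", db,
        "-query", query_fasta,          # FASTA file with the query protein(s)
        "-entrez_query", build_entrez_query(species),
        "-outfmt", "6",                 # tabular output, easy to parse afterwards
        "-out", out_path,
    ]


# Hypothetical example values, not taken from the question.
example_command = build_remote_blastp_command(
    "my_proteins.fasta", ["Homo sapiens", "Mus musculus"], "hits.tsv"
)
example_command_line = shlex.join(example_command)  # shell-quoted form for logging
```

Passing the list to subprocess.run(example_command, check=True) would execute the search without going through a shell, which avoids quoting problems with the spaces and brackets in the Entrez query.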
Would you mind sending me the link for Python?
I thought you might know a good manual on how to run BLAST in Biopython and could share it with me, just to save time.
Otherwise I know how to search it :)
google:biopython