conducting individual blastp for multiple protein sequences
3.0 years ago
rb77 • 0

Hello,

I have to BLAST multiple protein sequences from a given species (in a multi-FASTA file) against the human protein database. The goal is to find the closest homolog for each protein sequence.

Is there a way to automate this, i.e. to run an individual blastp query for each protein sequence against the whole human protein database and then grab the top hits for each query? Thank you, I would appreciate any advice on this.

blastp multiple blast protein • 1.3k views

I would BLAST the whole multi-FASTA file against the database in a single run and filter the output afterwards (e.g. with grep). Otherwise you create a substantial amount of overhead, such as loading the database into memory for every query.
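As a rough sketch of the single-run approach, assuming a local BLAST+ installation and a protein database already built with makeblastdb: the file names, the database prefix `human_proteins`, the E-value cutoff, and the thread count are all placeholders, not values from this thread. Tabular output (`-outfmt 6`) puts the query ID in the first column, so hits can still be separated per query afterwards.

```python
import subprocess


def build_blastp_command(query_fasta, db, out_tsv, evalue=1e-5, threads=4):
    """Return the argument list for one blastp run over a whole multi-FASTA file."""
    return [
        "blastp",
        "-query", query_fasta,         # all query proteins in a single file
        "-db", db,                     # database prefix created with makeblastdb
        "-outfmt", "6",                # tabular output, one line per hit
        "-evalue", str(evalue),        # assumed cutoff, adjust as needed
        "-num_threads", str(threads),
        "-out", out_tsv,
    ]


def run_blastp(query_fasta, db, out_tsv):
    """Run blastp once for all queries instead of once per sequence."""
    subprocess.run(build_blastp_command(query_fasta, db, out_tsv), check=True)
```

Calling `run_blastp("queries.faa", "human_proteins", "hits.tsv")` would then write one tab-separated line per hit to `hits.tsv`.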


When I try to BLAST the whole multi-FASTA file against the database, it says:

"Your total query length is greater than allowed on the BLAST webserver. You can either reduce the size to 100,000 or less and try again or run stand-alone <@STANDALONE_DOC@> or our <@STANDALONE_DOC_CLOUD@>."

Also, I need the top hit for each protein sequence in the FASTA file, so I'm not sure whether blasting the whole multi-FASTA file will work.


It sounds like you are running this remotely on the NCBI web server. You could split your multi-FASTA file into pieces and submit them separately. If you have thousands of sequences, though, the public BLAST resource is not meant to support that kind of workload.
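One way the splitting could be done is sketched below. It assumes the 100,000 limit quoted in the error message counts sequence residues, which is a guess on my part; the function name `split_fasta` is made up for illustration.

```python
def split_fasta(text, max_residues=100_000):
    """Split multi-FASTA text into chunks whose summed sequence length
    stays at or below max_residues. Records are never cut in half, so a
    single record longer than the limit ends up alone in its own chunk."""
    records = []
    header, seq_lines = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(seq_lines)))
            header, seq_lines = line, []
        elif header is not None:
            seq_lines.append(line)
    if header is not None:
        records.append((header, "".join(seq_lines)))

    chunks, current, current_len = [], [], 0
    for header, seq in records:
        # Start a new chunk if adding this record would exceed the limit.
        if current and current_len + len(seq) > max_residues:
            chunks.append(current)
            current, current_len = [], 0
        current.append((header, seq))
        current_len += len(seq)
    if current:
        chunks.append(current)

    # Each chunk is returned as FASTA text with one sequence line per record.
    return ["".join(f"{h}\n{s}\n" for h, s in chunk) for chunk in chunks]
```

Each returned string can be written to its own file and submitted as a separate search.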

While not advisable, you could keep only 1 hit per query (NCBI ideally recommends 5, since the first hit is not guaranteed to be the best).
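If the search is run with tabular output, picking the top hits per query can be done after the fact instead of limiting the search itself. The sketch below assumes the default 12-column tabular layout of BLAST+ (`-outfmt 6`), where column 1 is the query ID, column 2 the subject ID, column 11 the E-value, and column 12 the bit score. The function name `top_hits` and the choice to rank by bit score are my own assumptions.

```python
from collections import defaultdict


def top_hits(tabular_text, n=5):
    """Return up to n best hits per query from default BLAST tabular output.

    Hits are ranked by bit score (highest first), with E-value as tie-breaker.
    The result maps query ID -> list of (subject ID, E-value, bit score).
    """
    hits = defaultdict(list)
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        hits[query].append((bitscore, evalue, subject))

    best = {}
    for query, rows in hits.items():
        rows.sort(key=lambda r: (-r[0], r[1]))
        best[query] = [(subject, evalue, bitscore)
                       for bitscore, evalue, subject in rows[:n]]
    return best
```

With `n=5` you keep the recommended five candidates per query and can inspect them before settling on one; `n=1` gives the single top-scoring hit.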


Furthermore, a simple BLAST search is insufficient to establish homology. There are dedicated tools for this, some of which are based on BLAST.

I would recommend a literature search.
