Hi everybody,
We all know that blastx is very time-consuming for annotating a transcriptome derived from de novo assembly. Instead of blastx, we could use a transcript decoder (e.g. TransDecoder) to translate the assembled transcripts and then run blastp, which takes less time and gives more straightforward results. But how well does this work for organisms that have almost no sequence information in public databases such as NCBI or UniProt?
Please let me know what you think, and share any experience you have comparing the two methods for annotating a transcriptome efficiently. Thanks for your comments.
Thanks, friend. That's right; however, I think blastp is faster than blastx because blastx has to search all six reading frames. Your experience with blastall is interesting, because I have read in many places that BLAST+ is faster! Please let me know how many queries count as small in your view. For example, is 40,000-50,000 OK?
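To make the six-frame point concrete, here is a minimal sketch (plain Python, standard genetic code, no BLAST involved) of what translating a transcript in all six frames means. The function names are made up for illustration; this is not how blastx is implemented internally, only a picture of why a nucleotide query turns into six protein queries.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(seq):
    """Return the six conceptual translations keyed by frame (+1..+3, -1..-3)."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


for frame, protein in six_frames("ATGGCCTAA").items():
    print(frame, protein)
```

A decoder step that picks one likely coding frame per transcript leaves a single protein query instead of six, which is where the expected blastp speed-up comes from.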
I read this when I was working on a project in 2013. I am unable to find the source, unfortunately, but IIRC, sequences shorter than 10kb were processed faster by blastall's blastp. I'd suggest you try both for a batch of 5000 sequences and decide which is faster.
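If it helps, a rough way to time the two on the same batch is sketched below. The file name, database name, and E-value cutoff are placeholders I made up; it assumes both `blastp` (BLAST+) and `blastall` (legacy) are on your PATH and that the protein database has already been formatted for each suite.

```python
import shutil
import subprocess
import time


def blastplus_cmd(query, db, out):
    """BLAST+ blastp with tabular output (E-value cutoff is an assumption)."""
    return ["blastp", "-query", query, "-db", db, "-out", out,
            "-outfmt", "6", "-evalue", "1e-5"]


def blastall_cmd(query, db, out):
    """Legacy blastall running blastp with tabular output (-m 8)."""
    return ["blastall", "-p", "blastp", "-i", query, "-d", db, "-o", out,
            "-m", "8", "-e", "1e-5"]


def time_command(cmd):
    """Run a command and return the wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def compare(query, db):
    """Time both suites on the same query batch; skip any that is not installed."""
    timings = {}
    candidates = [
        ("blast+", blastplus_cmd(query, db, "blastplus.tsv")),
        ("blastall", blastall_cmd(query, db, "blastall.tsv")),
    ]
    for name, cmd in candidates:
        if shutil.which(cmd[0]) is None:
            print(f"{cmd[0]} not found on PATH, skipping")
            continue
        timings[name] = time_command(cmd)
    return timings


# Example (hypothetical file and database names):
# print(compare("batch_0001.fasta", "my_protein_db"))
```

Run it on the same machine with the same thread settings for both, otherwise the comparison does not say much.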
And yes, split your queries into batches of 1000 (really fast, use for testing) or 5000 (takes quite a bit longer). These will definitely run to completion in under 48 hours.
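For the splitting itself, something along these lines should do. It is a minimal sketch using only the standard library; the output file prefix and the input file name in the example are made up.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def batches(records, size):
    """Group any iterable into lists of at most `size` items."""
    batch = []
    for rec in records:
        batch.append(rec)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def write_batches(fasta_path, size=1000, prefix="batch"):
    """Write one FASTA file per batch and return the list of paths written."""
    paths = []
    with open(fasta_path) as fh:
        for i, batch in enumerate(batches(read_fasta(fh), size), start=1):
            path = f"{prefix}_{i:04d}.fasta"
            with open(path, "w") as out:
                for header, seq in batch:
                    out.write(f">{header}\n{seq}\n")
            paths.append(path)
    return paths


# Example (hypothetical input file):
# write_batches("transcripts.pep.fasta", size=5000)
```

Each batch file can then be submitted as its own BLAST job.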
I'm not sure what you mean by "blastall is definitely a lot faster than blastx". Blastall requires that you select a program.
I meant that the ncbi-blast suite was faster than the ncbi-blast+ suite for the length of my transcripts.
Specifically for BLASTX? Were these longer or shorter transcripts? Just curious.