Dear users,
Apologies for asking a broad question that may seem obvious to some, but I want to annotate a new transcriptome against several databases (incl. NCBI, SwissProt, UniProt, custom databases) from my standalone BlastX set-up on a cluster.
While I am comfortable blasting against one database using BlastX, how can I go about annotating an assembly against multiple databases and collating all the information and the best hits? Can this be run in one process, or do I need specialised scripts to combine and sort all the collated information?
The answer may be obvious to many, but I am still learning the command line and scripting.
Thanks for the help!
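To make the "collating" part of the question concrete, here is a minimal sketch of one way it could be scripted, assuming each database was searched separately and the results were saved in BLAST+ tabular format (-outfmt 6, default columns). The database labels and the example rows are made up for illustration. Hits are ranked by bit score rather than e-value, because e-values depend on the size of the database searched and so are not directly comparable between separate searches.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tables):
    """Return {query_id: hit} keeping the highest-bit-score hit per query.

    `tables` maps a database label (hypothetical, e.g. "swissprot") to an
    open text handle with -outfmt 6 results from a search of that database.
    """
    best = {}
    for db_label, handle in tables.items():
        for row in csv.reader(handle, delimiter="\t"):
            if not row or row[0].startswith("#"):
                continue  # skip blank lines and comment lines
            hit = dict(zip(COLUMNS, row))
            hit["database"] = db_label
            hit["bitscore"] = float(hit["bitscore"])
            current = best.get(hit["qseqid"])
            if current is None or hit["bitscore"] > current["bitscore"]:
                best[hit["qseqid"]] = hit
    return best


# Made-up results standing in for two separate BlastX output files.
swissprot_hits = io.StringIO(
    "tx1\tsp_A\t90.0\t100\t10\t0\t1\t300\t1\t100\t1e-50\t200\n"
    "tx2\tsp_B\t40.0\t100\t60\t0\t1\t300\t1\t100\t1e-10\t80\n"
)
custom_hits = io.StringIO(
    "tx1\tcu_X\t80.0\t100\t20\t0\t1\t300\t1\t100\t1e-40\t150\n"
    "tx2\tcu_Y\t60.0\t100\t40\t0\t1\t300\t1\t100\t1e-30\t120\n"
    "tx3\tcu_Z\t35.0\t100\t65\t0\t1\t300\t1\t100\t1e-6\t60\n"
)

merged = best_hits({"swissprot": swissprot_hits, "custom": custom_hits})
for query, hit in sorted(merged.items()):
    print(query, hit["database"], hit["sseqid"], hit["bitscore"])
```

With real data, the StringIO objects would be replaced by open(...) handles on the actual result files.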
Thanks Michael
So I can combine all the databases using the blastdb_aliastool program of BLAST+ on the Linux command line, then start the BlastX search from there. Assuming I do it this way, the best hit will come from the combined dataset, correct? Just trying to understand the process in my head.
What is the difference between aliastool and the -db function? I only have 4 databases max, so I guess aliastool would be best for me?
Apologies for such simple questions.
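To make the two options asked about here concrete, a rough sketch of the commands involved follows, written as Python argument lists so they can be inspected before anything is run. All file and database names (transcripts.fasta, swissprot, my_custom_db, combined_db) are hypothetical, and the databases are assumed to be already formatted protein BLAST databases. As far as I understand it, blastdb_aliastool only writes a small alias file listing the member databases, while passing a quoted, space-separated list to -db searches the same set without creating that file; in both cases the hits come from the combined set.

```python
import shlex


def alias_command(databases, out_name, title):
    """Build a blastdb_aliastool call that groups existing protein databases."""
    return ["blastdb_aliastool", "-dblist", " ".join(databases),
            "-dbtype", "prot", "-out", out_name, "-title", title]


def blastx_command(query, database, out_file, threads=4):
    """Build a blastx call writing tabular (-outfmt 6) results."""
    return ["blastx", "-query", query, "-db", database,
            "-outfmt", "6", "-evalue", "1e-5",
            "-num_threads", str(threads), "-out", out_file]


# Hypothetical names of already formatted protein databases.
dbs = ["swissprot", "my_custom_db"]

# Option 1: create an alias once, then search it like a single database.
alias_cmd = alias_command(dbs, "combined_db", "combined protein databases")
search_via_alias = blastx_command("transcripts.fasta", "combined_db",
                                  "alias_hits.tsv")

# Option 2: no alias file, give -db the space-separated list directly.
search_via_db_list = blastx_command("transcripts.fasta", " ".join(dbs),
                                    "list_hits.tsv")

# Print the shell-quoted commands; nothing is executed here.
# To actually run one: subprocess.run(cmd, check=True)
for cmd in (alias_cmd, search_via_alias, search_via_db_list):
    print(shlex.join(cmd))
```

For a handful of databases either route should do; the alias is mainly a convenience if the same combination is searched repeatedly.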
Hello Michael Dondrup, how big is the nr database? I'm trying to blast against the SwissProt database only, but from your answer above it seems I would also have to download the nr database.