4 months ago
Md Moinuddin
Hello Community,
I have a FASTA file containing protein sequences from different viruses. I want to run LAST against the NCBI nr database to assign taxonomy to these viruses, using a 95% average amino acid identity (AAI) threshold.
I am working in bash on a Linux HPC cluster.
Can anyone help me with the steps I should follow? Thanks in advance.
Before running the following command, make sure that ncbi-blast+ has been installed on your system. For each protein sequence, you can run something like this:
blastp -query input_proteins.fasta -db nr -remote -out results.out
Thanks for your response! If I run this command, will it process one protein sequence at a time? My FASTA file contains hundreds of protein sequences from different viruses.
I want to assign taxonomy based on the blastp results.
I would run each protein FASTA in a loop or in parallel.
Remote BLAST is not meant to be used for this kind of application. You may start getting errors and, at worst, may have your IP banned for misusing the service.
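For reference, blastp accepts a multi-FASTA query file and reports hits for each sequence in it, so a loop is not strictly required. If you do want to loop or parallelize, one option is to split the file into chunks first. The sketch below is only an illustration: the function names split_fasta and search_chunks, the chunk naming scheme, and the database name you pass in are all hypothetical, and it assumes a local BLAST database already exists.

```shell
# Split a multi-FASTA file into chunks of at most N records each.
# Usage: split_fasta input.fasta records_per_chunk output_prefix
# (hypothetical helper, not part of BLAST)
split_fasta() {
    local input="$1" per_chunk="$2" prefix="$3"
    awk -v n="$per_chunk" -v prefix="$prefix" '
        /^>/ {
            # Start a new chunk file every n header lines.
            if (count % n == 0) {
                if (out != "") close(out)
                out = sprintf("%s_%04d.fasta", prefix, int(count / n) + 1)
            }
            count++
        }
        # Write header and sequence lines to the current chunk.
        out != "" { print > out }
    ' "$input"
}

# Run blastp on every chunk against a local protein database.
# Usage: search_chunks output_prefix local_db_name
# (requires ncbi-blast+; the database name is an assumption)
search_chunks() {
    local prefix="$1" db="$2"
    local chunk
    for chunk in "${prefix}"_*.fasta; do
        # -outfmt 6 writes one tab-separated line per hit.
        blastp -query "$chunk" -db "$db" -outfmt 6 -out "${chunk%.fasta}.tsv"
    done
}
```

For example, split_fasta input_proteins.fasta 100 chunk followed by search_chunks chunk my_local_db would write chunk_0001.tsv, chunk_0002.tsv, and so on. On an HPC cluster, each chunk could instead be submitted as a separate job.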
Consider getting all known RefSeq viral genomes and doing the search locally: https://ftp.ncbi.nlm.nih.gov/refseq/release/viral/
This should not be a large database.
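A local search along these lines might look like the sketch below. Several things here are assumptions rather than anything stated in this thread: the file name viral.1.protein.faa.gz (check the directory listing for the actual protein files), the database name viral_db, the e-value cutoff, the thread count, and the helper names build_and_search and best_hits. Note also that filtering on per-hit percent identity is only a stand-in for AAI, which is normally averaged over the proteins shared between two genomes.

```shell
# Download RefSeq viral proteins, build a local database, and search it.
# Usage: build_and_search input_proteins.fasta hits.tsv
# (requires wget and ncbi-blast+; file and database names are assumptions)
build_and_search() {
    local query="$1" out="$2"
    wget https://ftp.ncbi.nlm.nih.gov/refseq/release/viral/viral.1.protein.faa.gz
    gunzip viral.1.protein.faa.gz
    makeblastdb -in viral.1.protein.faa -dbtype prot -out viral_db
    blastp -query "$query" -db viral_db -outfmt 6 -evalue 1e-5 \
        -num_threads 4 -out "$out"
}

# Keep, for each query, the highest bit-score hit at or above a
# percent-identity cutoff (default 95).
# In tabular format 6: column 1 = query ID, column 3 = percent identity,
# column 12 = bit score.
# Usage: best_hits hits.tsv [min_identity]
best_hits() {
    local hits="$1" min_identity="${2:-95}"
    local tab
    tab="$(printf '\t')"
    awk -F'\t' -v min="$min_identity" '$3 + 0 >= min + 0' "$hits" \
        | sort -t "$tab" -k1,1 -k12,12nr \
        | awk -F'\t' '!seen[$1]++'
}
```

After build_and_search input_proteins.fasta hits.tsv, running best_hits hits.tsv 95 prints one line per query that has a hit at 95% identity or higher. Mapping the subject accessions in column 2 to taxa is a separate step that is not shown here.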
Are you referring to https://gitlab.com/mcfrith/last as the LAST program or something else?
Yes, I am referring to that.