Hello All,
I am trying to obtain the species names for the accession numbers in a BLAST output file. I tried to use the efetch command from NCBI EDirect.
The command is: efetch -db taxonomy -id CP00001
but it fails with the error "500 Can't connect to proxy". I have already set "https_proxy" as an environment variable in my .bashrc file, but I still get the same error.
How should I fix this?
Is there a way to get the species names directly when running BLAST? I tried to run it with taxdb, but it gave me N/A for every entry in the species column.
Thanks for your help.
Which version of the NCBI EDirect utilities are you using? If you are using preformatted BLAST databases, or used taxdb when building custom BLAST databases, then you may be able to use sscinames in the tabular BLAST output format. See Table C1 for more info: https://www.ncbi.nlm.nih.gov/books/NBK279684/
Thanks for your reply. I am using the preformatted nt database from NCBI, which had already been downloaded. I installed EDirect using the command-line instructions given here: https://www.ncbi.nlm.nih.gov/books/NBK179288/
If you are using the preformatted NCBI BLAST nt databases, then you can specify the
-outfmt 6 'qaccver saccver pident length mismatch gapopen qstart qend sstart send evalue bitscore sscinames'
parameter to extract scientific names in command-line BLAST.
I used the same command previously, but it only gives "N/A" in the sscinames column.
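To see how many hits are affected, the sscinames column of the tabular output can be counted. This is a minimal sketch, assuming the output was written with the 13-column format string quoted above (so sscinames is the 13th tab-separated field); the function name count_missing_names and the file name in the usage note are hypothetical.

```shell
# Hypothetical helper: count rows whose 13th tab-separated field is "N/A".
# Assumes the 13-column -outfmt 6 layout quoted above, with sscinames last.
count_missing_names() {
    awk -F '\t' '
        { total++ }                      # every line is one hit
        $13 == "N/A" { missing++ }       # sscinames could not be resolved
        END { printf "%d of %d hits have N/A in sscinames\n", missing, total }
    ' "$1"
}
```

It could be called as count_missing_names results.tsv. If every hit is reported as N/A, that suggests BLAST did not find the taxonomy files at all, rather than a problem with individual subject sequences.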
Is the taxonomy database present in the same directory as your nt files?
Yes. It is in the same directory.
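One way to double-check this is to look for the two files that the NCBI taxdb archive unpacks to, taxdb.btd and taxdb.bti. The sketch below assumes those file names and takes the database directory as its only argument; check_taxdb is a hypothetical helper, not part of BLAST+.

```shell
# Hypothetical helper: report whether the BLAST taxonomy files exist and are
# non-empty in the given directory (defaults to the current directory).
check_taxdb() {
    dir="${1:-.}"
    missing=0
    for f in taxdb.btd taxdb.bti; do
        if [ -s "$dir/$f" ]; then
            echo "found: $dir/$f"
        else
            echo "missing or empty: $dir/$f"
            missing=1
        fi
    done
    # Exit status 0 only when both files are present.
    return "$missing"
}
```

For example, check_taxdb /path/to/blastdb, where the path is whatever directory holds the nt files. BLAST looks for these files in the current working directory and in the directories listed in the BLASTDB environment variable, so it is also worth confirming that BLASTDB points at that directory when the search is run from elsewhere.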
Can you please post the complete command?
I don't have any proxy set up and I can connect with efetch without problems. Do you really need to set "https_proxy"?
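To find out what the current shell actually passes to efetch, the proxy-related environment variables can be listed. This is only a sketch: show_proxy_vars is a hypothetical helper, and the list of variable names is an assumption about which ones common HTTP clients consult.

```shell
# Hypothetical helper: print the proxy-related variables visible in this shell.
show_proxy_vars() {
    for name in http_proxy https_proxy HTTP_PROXY HTTPS_PROXY no_proxy NO_PROXY; do
        # Indirect lookup of the variable named in $name (empty if unset).
        eval "value=\${$name-}"
        if [ -n "$value" ]; then
            echo "$name=$value"
        else
            echo "$name is not set"
        fi
    done
}
```

Note that a variable assigned in ~/.bashrc only reaches efetch if it is exported, and only in shells started after the edit (or after running source ~/.bashrc). If no proxy is required on the network, unsetting these variables may be enough to get past the "Can't connect to proxy" error.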
I tried the same command with -id CP012354, but it gives me the wrong species name. It gives "Enterobacteria phage ST viruses", whereas it should be "Bacteria Actinobacteria Propionibacteriales".
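A possible explanation, offered as an assumption rather than a confirmed diagnosis: the taxonomy database is indexed by numeric taxonomy IDs, while CP012354 is a nucleotide accession, so passing it to -db taxonomy may not look up the record that was intended. The sketch below prints (it does not run) a lookup command chosen by identifier type; suggest_lookup is a hypothetical helper, and the nuccore field names Caption, TaxId and Organism are assumed from the document summary format.

```shell
# Hypothetical helper: print a suggested EDirect command for an identifier.
# Nothing is executed, so this works without network access.
suggest_lookup() {
    case "$1" in
        '')
            echo "usage: suggest_lookup <taxid-or-accession>" >&2
            return 1
            ;;
        *[!0-9]*)
            # Contains a non-digit: treat it as a sequence accession and
            # ask the nucleotide database which organism it belongs to.
            echo "esummary -db nuccore -id $1 | xtract -pattern DocumentSummary -element Caption,TaxId,Organism"
            ;;
        *)
            # Purely numeric: treat it as a taxonomy ID.
            echo "efetch -db taxonomy -id $1 -format xml"
            ;;
    esac
}
```

Running suggest_lookup CP012354 prints the nuccore form of the query, which can then be pasted into a shell where EDirect works.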