Question

Getting scomname ssciname from BLAST+ BLASTP

15 months ago

fafad046 • 0

I am trying to run BLASTP using the command below. I am using BLAST+ 2.14.0 on a remote server (Intel(R) Xeon(R) Gold 6248R CPU @ 3.00GHz, 24 cores / 48 threads, RHEL 8 x86_64).

/home/choh1/ncbi-blast-2.14.0+/bin/blastp \
-query ${MANE_protein_id}_.fasta \
-db /home/choh1/ncbi-blast-2.14.0+/refseq_protein_db/refseq_protein \
-out ${MANE_protein_id}_orthologs.txt \
-seqidlist /home/choh1/ncbi-blast-2.14.0+/refseq_protein_db/primates_acc_alias_blastdb.txt \
-outfmt "6 std staxids scomname ssciname"

I'm running the same command in 2 different directories on 2 different protein_ids. The expected output is for both output files to have the format below, where the last 2 columns hold the common name (scomname) and the scientific name (ssciname).

NP_001002296.1  XP_047573126.1  100.000 137     0       0       1       137     12      148     1.76e-97        284     9657    Eurasian river otter    Lutra lutra
NP_001002296.1  XP_010956270.1  99.270  137     1       0       1       137     1       137     3.76e-97        282     9837;9838;419612        Bactrian camel  Camelus bactrianus

However, I only get the common name and scientific name from the BLASTP run in one of the directories. In the other, both columns contain only N/A, as shown below:

NP_001034707.1  NP_001034707.1  100.000 354     0       0       1       354     1       354     0.0     698     9595;9606       N/A     N/A
NP_001034707.1  XP_006718705.1  99.718  354     1       0       1       354     1       354     0.0     696     9606    N/A     N/A
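
To see how many hits in an output file are affected, a small check along these lines may help. It is only a sketch: `count_missing_names` is a hypothetical helper name, and it assumes the file is the tab-separated output of -outfmt "6 std staxids scomname ssciname", so that scomname is column 14 and ssciname is column 15.

```shell
# Hypothetical helper: count hits whose common or scientific name is N/A.
# Assumes tab-separated output from -outfmt "6 std staxids scomname ssciname":
# the 12 std columns, then staxids (13), scomname (14) and ssciname (15).
count_missing_names() {
    awk -F'\t' '$14 == "N/A" || $15 == "N/A" { n++ } END { print n + 0 }' "$1"
}
```

Running `count_missing_names "${MANE_protein_id}_orthologs.txt"` in each of the two directories would show whether the names are missing for every hit or only for some of them.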

Does anyone know how to make sure that the last 2 columns always contain the species names? Thank you in advance!

blastp blast • 781 views

ADD COMMENT • link updated 15 months ago by GenoMax 147k • written 15 months ago by fafad046 • 0

That is odd. NP_001034707 appears to be a human protein, so you should be able to get the info you are looking for. I assume you have the BLAST taxonomy database (taxdb) downloaded and available in the $BLASTDB directory?
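
A quick way to check this could look like the sketch below. BLAST+ looks for the taxonomy files taxdb.btd and taxdb.bti in the current working directory and in the location given by BLASTDB. `check_taxdb` is a hypothetical helper name, not part of BLAST+, and the example call assumes BLASTDB holds a single directory rather than a list of paths.

```shell
# Hypothetical helper: report whether both taxonomy files exist in a directory.
# Returns 0 if both are present, 1 if either one is missing.
check_taxdb() {
    dir="$1"
    missing=0
    for f in taxdb.btd taxdb.bti; do
        if [ -f "$dir/$f" ]; then
            echo "found: $dir/$f"
        else
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return "$missing"
}

# Check the directory BLASTDB points to, falling back to the current directory.
# Assumption: BLASTDB contains one directory, not several.
check_taxdb "${BLASTDB:-.}" || echo "taxdb incomplete; scomname/ssciname may print N/A"
```

If the files live in another directory, exporting BLASTDB to that directory in the shell that runs blastp, then rerunning the search from both working directories, would be one way to test whether this explains the difference.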

ADD REPLY • link 15 months ago by GenoMax 147k

Hi, yes. Just to be sure, I have the files

taxdb.btd
taxdb.bti

in the /home/choh1/ncbi-blast-2.14.0+/refseq_protein_db/refseq_protein directory. Is that correct?