Remote BLASTx: strain-specific search
12 weeks ago
Saransh • 0

Hi community, Greetings!

I am using remote BLASTx on my Linux machine for some bacterial species (a few nucleotide sequences in FASTA format). This is the command I'm using:

/home/ncbi-blast-2.16.0+/bin/blastx -remote -query filtered_regions.fasta -db nr -entrez_query "E.coli [organism]" -out k12_blastx_res.txt

And it is providing me let's say 30 regions with No hits. But when I specify a strain (e.g. -entrez_query "E.coli K-12 [organism]"), the number of regions with No hits drops drastically (to around 12-15).

Can anyone explain why this is happening? Logically, the general taxon (E. coli) should give fewer No hits, because it covers the other strains as well, and specifying the strain (E. coli K-12) should give more No hits, because the other strains are excluded. But I'm seeing the exact opposite.
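To check this per region rather than by total count, the query IDs without hits can be pulled out of each report. The following is a minimal sketch, assuming the default pairwise text output, in which each query section starts with a `Query= ` line and an empty result is marked by `***** No hits found *****`. The function name `no_hit_queries` and the sample report are hypothetical.

```python
def no_hit_queries(report_text):
    """Return the IDs of queries whose section contains 'No hits found'."""
    missing = []
    current = None
    for line in report_text.splitlines():
        if line.startswith("Query= "):
            # The first whitespace-delimited token after "Query= " is the ID.
            parts = line[len("Query= "):].split()
            current = parts[0] if parts else None
        elif "No hits found" in line and current is not None:
            missing.append(current)
            current = None  # avoid counting the same query twice
    return missing


# Hypothetical two-query report, trimmed to the lines the parser looks at.
SAMPLE_REPORT = """\
Query= region_1

Length=300

***** No hits found *****

Query= region_2 some description

Length=450

Sequences producing significant alignments:
"""

print(no_hit_queries(SAMPLE_REPORT))  # prints ['region_1']
```

For a real run, the text would come from the report file, for example `open("k12_blastx_res.txt").read()`.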

Your help will be appreciated.

Thank you.

Strain_specific_search BLASTx BLAST • 529 views

How about using the online service, where you can specify the organism? https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastx&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome


Hi, thanks for your response. I was using the online service at first, but the project needs a BLAST that can be automated (containerized) within the pipeline. That's why I switched to Linux-based remote BLAST.
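For the automation case, a minimal sketch of calling the same command from a pipeline step might look like the following. It uses only the flags from the command in the question (`-remote`, `-query`, `-db`, `-entrez_query`, `-out`). The function names and the assumption that `blastx` is on the container's `PATH` are hypothetical.

```python
import subprocess


def build_blastx_cmd(query_fasta, out_path, entrez_query, blastx="blastx", db="nr"):
    """Build the argument list for a remote blastx run."""
    return [
        blastx, "-remote",
        "-query", query_fasta,
        "-db", db,
        "-entrez_query", entrez_query,
        "-out", out_path,
    ]


def run_blastx(cmd):
    """Run the command and raise CalledProcessError on a non-zero exit."""
    subprocess.run(cmd, check=True)


cmd = build_blastx_cmd(
    "filtered_regions.fasta",
    "k12_blastx_res.txt",
    "E.coli K-12 [organism]",
)
print(cmd)
```

Passing the arguments as a list means the Entrez query, which contains spaces and brackets, reaches `blastx` as a single argument without any shell quoting.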


And it is providing me let's say 30 regions with No hits.

What does this mean? 30 sequences from your query show no hits?


Yes. Suppose I have 100 regions in my query file: 70 regions get hits (hypothetical/non-redundant proteins) and 30 regions get No hits (i.e. neutral regions, meaning they don't code for any kind of protein).

My reasoning: if I provide the whole organism as the Entrez query, it should cover all the strains of that organism and give fewer neutral regions (with No hits), and if I provide a strain in the Entrez query, it should ignore the other strains and give more neutral regions.

But I'm getting the opposite.

E. coli is just an example; it's happening with other organisms as well.
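The expectation described here can be stated as a set relation: every region with No hits in the organism-wide run should also have No hits in the strain-only run. A small sketch of that check, with a hypothetical function name and made-up region IDs:

```python
def compare_no_hits(species_no_hits, strain_no_hits):
    """Split no-hit region IDs by which run(s) reported them."""
    species = set(species_no_hits)
    strain = set(strain_no_hits)
    return {
        # No hits organism-wide, yet a hit when restricted to the strain:
        # these are the regions that contradict the expectation.
        "only_species": sorted(species - strain),
        # No hits for the strain but a hit organism-wide: the expected case.
        "only_strain": sorted(strain - species),
        "both": sorted(species & strain),
    }


# Made-up IDs for illustration only.
result = compare_no_hits(["r1", "r2", "r3", "r4"], ["r3", "r4"])
print(result["only_species"])  # prints ['r1', 'r2']
```

Any IDs in `only_species` are the regions worth inspecting by hand, since the strain-restricted search found something that the broader search did not.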


Entrez query, It should cover all the strains of that organism

Try using the taxID for the organism, since that may be a better filter. You should also look at the default values of other parameters (word length, etc.) to make sure they are optimal for what you are trying to do. If you don't change a parameter, its default value is always in use. People tend to forget that at times.
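A sketch of both suggestions, under a few assumptions: the `txid<N>[Organism:exp]` form is the Entrez syntax for a taxonomy ID including its subtree, 562 and 83333 are the NCBI taxonomy IDs for Escherichia coli and E. coli K-12, and `-evalue` and `-word_size` are the blastx options set explicitly. The helper names and the numeric option values are placeholders, not recommendations.

```python
def taxid_entrez_query(taxid):
    """Build an Entrez organism filter from an NCBI taxonomy ID."""
    return f"txid{int(taxid)}[Organism:exp]"


def explicit_options(evalue, word_size):
    """Spell the parameters out so the run does not rely on silent defaults."""
    return ["-evalue", str(evalue), "-word_size", str(word_size)]


species_query = taxid_entrez_query(562)    # Escherichia coli
strain_query = taxid_entrez_query(83333)   # Escherichia coli K-12

print(species_query)  # prints txid562[Organism:exp]
print(strain_query)   # prints txid83333[Organism:exp]

# Placeholder values; pick thresholds that suit the data.
extra_args = explicit_options(evalue=1e-5, word_size=3)
print(extra_args)
```

Using the same explicit options for the organism-wide and the strain-specific run makes the two No hits counts directly comparable.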


I tried the taxid earlier but ran into the same issue. I'll look into setting the parameters. Thanks for your suggestion.
