Question

Blast command line gives no hits

4.0 years ago

Vca80553 • 0

Hi everyone,

I am using the command-line BLASTn program to align different nucleotide sequences against a custom database (containing around 450 FASTA sequences). Most of the query sequences gave good hits, but some gave no hits at all. I manually checked those without hits, and some of them should have matched: when I run blastn on them manually, I do get hits.

To simplify and look for possible reasons, I took one of the FASTA sequences (sequence1.fasta) that gave no hit against the custom database on the command line and ran it through the web BLAST (blastn with default parameters) against the same database, using the option to align two or more nucleotide sequences.

This is the result:

Description  Max Score  Total Score  Query Cover  E value  Per. Ident  Accession
HPV226       222        222          100%         3e-61    84.50%      Query_49167

Range 1: 5419 to 5618
Alignment statistics for match #1
Score          Expect  Identities    Gaps       Strand
222 bits(245)  3e-61   169/200(85%)  0/200(0%)  Plus/Plus
Query  1     ACACTGAAAATCCTGCTAATTATCAAAAAgggggggCTAAGGACACTCGTCAGAATGTGT  60
             |||| ||||||||||||   ||||||||||||||||| |||||||||||||| ||||| |
Sbjct  5419  ACACAGAAAATCCTGCTGCATATCAAAAAGGGGGGGCAAAGGACACTCGTCAAAATGTAT  5478

Query  61    CCCTGGATCCCAAACAAACTCAGTTGTTTGTTGTAGGCTGTACCCCTTGTAAGGGTGAGC  120
             |  ||||||| |||||||| ||||| |||||||| || ||||||||||| |||||||| |
Sbjct  5479  CTTTGGATCCTAAACAAACCCAGTTATTTGTTGTGGGGTGTACCCCTTGCAAGGGTGAAC  5538

Query  121   ATTGGGATGTTGCTACTGCTTGTTCCAGGCTTAACAAGGGTGATTGCCCTCCTATACAGC  180
             |||||||||| ||  |||| ||||| | ||||  ||| || || ||||||||||| ||||
Sbjct  5539  ATTGGGATGTGGCCCCTGCCTGTTCTAAGCTTGGCAAAGGGGACTGCCCTCCTATTCAGC  5598

Query  181   TTGTGCCTTCTGTAATTGAG  200
             ||||| | ||||| ||||||
Sbjct  5599  TTGTGTCCTCTGTTATTGAG  5618

So I see that my query is very similar to the HPV226 sequence from my custom database.

I then created a database containing only that sequence (HPV226) and blasted my query against it on the command line, and I got no hit.

blastn -db HPV226.fasta -query gi1185315504gbKY063012.1HumanpapillomavirusisolateCT14majorcapsidproteinL1genepartialcds.fasta  -out result.out

As I saw the lowercase letters when running blastn via the web, I thought it could be due to masking, so I tried -dust no and -soft_masking false, but I still don't get the hit. Any idea what I am missing here? I have read through the forum and did not find an answer :( Thanks a lot!

alignment blast • 2.7k views

updated 4.0 years ago by lieven.sterck 15k • written 4.0 years ago by Vca80553 • 0

Your query sequence above produces this single perfect hit at NCBI.

Human papillomavirus isolate CT14 major capsid protein (L1) gene, partial cds
Sequence ID: KY063012.1  Length: 200  Number of Matches: 1

Score   Expect  Identities  Gaps    Strand
370 bits(200)   2e-98   200/200(100%)   0/200(0%)   Plus/Plus

Query  1    ACACTGAAAATCCTGCTAATTATCAAAAAgggggggCTAAGGACACTCGTCAGAATGTGT  60
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  1    ACACTGAAAATCCTGCTAATTATCAAAAAGGGGGGGCTAAGGACACTCGTCAGAATGTGT  60

Query  61   CCCTGGATCCCAAACAAACTCAGTTGTTTGTTGTAGGCTGTACCCCTTGTAAGGGTGAGC  120
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  61   CCCTGGATCCCAAACAAACTCAGTTGTTTGTTGTAGGCTGTACCCCTTGTAAGGGTGAGC  120

Query  121  ATTGGGATGTTGCTACTGCTTGTTCCAGGCTTAACAAGGGTGATTGCCCTCCTATACAGC  180
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  121  ATTGGGATGTTGCTACTGCTTGTTCCAGGCTTAACAAGGGTGATTGCCCTCCTATACAGC  180

Query  181  TTGTGCCTTCTGTAATTGAG  200
            ||||||||||||||||||||
Sbjct  181  TTGTGCCTTCTGTAATTGAG  200

4.0 years ago by GenoMax 147k

Yes, that is my query sequence (Human papillomavirus isolate CT14 major capsid protein (L1) gene, partial cds), and I am blasting it against a database that contains only complete HPV genome sequences, no partial cds. It is most closely related to HPV226. Why don't I get the hit when running BLAST on the command line, then?

4.0 years ago by Vca80553 • 0

Can you give it a try by simply blasting those two sequences against each other, using the bl2seq approach?

blastn -query <fasta file1> -subject <fasta file 2>

Likely not related (let alone the cause of all this), but that is a severely long FASTA file name you have there. Out of curiosity, can you also try with a much shorter filename?

4.0 years ago by lieven.sterck 15k

I changed the FASTA file names to shorter ones. Still no hits :(

Code: blastn -query HPVisolateCT14.fasta -subject HPV226.fasta

Result (no hits):

Reference: Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb Miller (2000), "A greedy algorithm for aligning DNA sequences", J Comput Biol 2000; 7(1-2):203-14.

    Database: User specified sequence set (Input: HPV226.fasta).
           1 sequences; 7,313 total letters

   Query= gi1185315504HumanpapillomavirusisolateCT14
    Length=200


***** No hits found *****



Lambda      K        H
    1.33    0.621     1.12

Gapped
Lambda      K        H
    1.28    0.460    0.850

Effective search space used: 1365100


  Database: User specified sequence set (Input: HPV226.fasta).
    Posted date:  Unknown
  Number of letters in database: 7,313
  Number of sequences in database:  1



Matrix: blastn matrix 1 -2
Gap Penalties: Existence: 0, Extension: 2.5

4.0 years ago by Vca80553 • 0

Hmm, OK, I see (I tried the same and indeed got the same output :/ ).

A bit of trial and error later: try the command again, but this time add -task blastn to it. This will invoke the classic blastn approach rather than megablast, which is nowadays the default when calling blastn.

4.0 years ago by lieven.sterck 15k

It worked!! Thank you very much! Really thankful for this! Best

4.0 years ago by Vca80553 • 0

score 2 · Answer 1 · 2020-11-13

This behaviour is caused by the difference between 'normal' blastn and megablast.

When running BLAST locally, the default task for blastn is megablast, whereas the web search in the original post was run as classic blastn. The reason you don't see the result with megablast is that the alignment (see the original post) has too many mismatched bases, or at least no stretches of perfect matches long enough, for megablast to pick it up (the megablast settings are much stricter than the normal blastn ones).
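
To make the "no stretches of perfect matches long enough" point concrete, here is a minimal sketch that measures the longest run of identical bases in the web alignment from the original post. The four match lines are copied from that alignment. The word sizes (28 for megablast, 11 for blastn) are not from this thread; they are the defaults as I understand them from the BLAST+ documentation, and the seeding rule is simplified to "needs an exact match at least one word long".

```shell
# Match lines copied from the web alignment in the original post
# ('|' = identical base, ' ' = mismatch), joined into one string.
midline=$(printf '%s' \
  '|||| ||||||||||||   ||||||||||||||||| |||||||||||||| ||||| |' \
  '|  ||||||| |||||||| ||||| |||||||| || ||||||||||| |||||||| |' \
  '|||||||||| ||  |||| ||||| | ||||  ||| || || ||||||||||| ||||' \
  '||||| | ||||| ||||||')

# Number of aligned columns and number of identical bases.
aligned=${#midline}
identical=$(printf '%s' "$midline" | tr -cd '|' | wc -c | tr -d ' ')

# Split on mismatches so every run of '|' sits on its own line,
# then keep the length of the longest one.
longest=$(printf '%s\n' "$midline" | tr ' ' '\n' |
  awk 'length($0) > max { max = length($0) } END { print max + 0 }')

echo "aligned columns:         $aligned"
echo "identical bases:         $identical"
echo "longest exact-match run: $longest"

# Assumed default word sizes (not stated in this thread): megablast 28, blastn 11.
for pair in megablast:28 blastn:11; do
  task=${pair%%:*}
  word=${pair##*:}
  if [ "$longest" -ge "$word" ]; then
    echo "$task (word size $word): an exact seed of that length exists"
  else
    echo "$task (word size $word): no exact seed of that length"
  fi
done
```

For this alignment the longest exact run is 17 bases, which is shorter than the assumed megablast word size but longer than the assumed blastn one. That is consistent with megablast reporting no hits while classic blastn finds the 169/200 alignment.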

Adding the -task blastn option to your command line will give you the desired/expected output.
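
If you want to see the effect of the task setting side by side, something along these lines should do. It is only a sketch: the file names are the shortened ones used earlier in this thread, and I chose -outfmt 6 (tabular output, one row per alignment) simply so the rows can be counted. The search is skipped when blastn or the input files are not available.

```shell
# File names as used earlier in the thread; adjust them to your own paths.
query="HPVisolateCT14.fasta"
subject="HPV226.fasta"

summary=""
for task in megablast dc-megablast blastn; do
  if command -v blastn >/dev/null 2>&1 && [ -f "$query" ] && [ -f "$subject" ]; then
    # Tabular output prints one row per alignment, so the line count is
    # the number of alignments reported for this task.
    hits=$(blastn -task "$task" -query "$query" -subject "$subject" -outfmt 6 |
      wc -l | tr -d ' ')
  else
    # blastn is not installed or an input file is missing.
    hits="skipped"
  fi
  summary="${summary}${task}=${hits} "
done

echo "$summary"
```

The same -task blastn option can be added to the database search from the original post (the one using -db instead of -subject).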