2.8 years ago
Nicolas
Hello!
I am trying to find a promoter inside a genome assembly. I have the promoter sequence:
GGTTGTNNNNNNNNNACAACC
Whenever I try to find this sequence with the following command:
blastn -query=Prv211.fasta -subject=mtbgenome.fna -task=blastn-short -out=output.txt -evalue=10
I get no matches:
Database: User specified sequence set (Input: mtbgenome.fna).
1 sequences; 4,411,532 total letters
Query=
Length=21
***** No hits found *****
Lambda K H
1.37 0.711 1.31
Gapped
Lambda K H
1.37 0.711 1.31
Effective search space used: 35292152
Database: User specified sequence set (Input: mtbgenome.fna).
Posted date: Unknown
Number of letters in database: 4,411,532
Number of sequences in database: 1
Matrix: blastn matrix 1 -3
Gap Penalties: Existence: 5, Extension: 2
On the other hand, when I use the actual sequence of the promoter (with the real bases instead of the Ns), it finds the promoter without problems.
Does anyone know why this is happening?
Thanks!
Edit: I increased the e-value to 10, still no results.
My feeling is that for BLAST to work properly you need a sufficiently long stretch of matching sequence. In your case, perhaps try playing with the seed length (the -word_size option) and other parameters like that.
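To make the seed-length point concrete, here is a minimal Python sketch. It assumes that blastn needs an exact seed match of at least word_size bases and that the default word size for -task blastn-short is 7. The helper name longest_defined_run is only for illustration. The query has two 6-base arms separated by nine Ns, so its longest unambiguous stretch is shorter than that seed.

```python
def longest_defined_run(seq: str, wildcard: str = "N") -> int:
    """Length of the longest stretch of bases that contains no wildcard."""
    return max((len(run) for run in seq.upper().split(wildcard)), default=0)


query = "GGTTGTNNNNNNNNNACAACC"

# Assumed default seed length for `-task blastn-short`.
BLASTN_SHORT_WORD_SIZE = 7

longest = longest_defined_run(query)
can_seed = longest >= BLASTN_SHORT_WORD_SIZE

print(f"longest run without N: {longest}")
print(f"long enough for a word size of {BLASTN_SHORT_WORD_SIZE}: {can_seed}")
```

Lowering -word_size might let BLAST seed on one of the 6-base arms, but the Ns would probably still be scored as mismatches, so a full-length hit is unlikely either way.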
In general, if you are looking for a pattern, you could try a tool that matches patterns rather than one that performs alignments, for example seqkit locate:
https://bioinf.shenwei.me/seqkit/usage/#locate
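If you would rather not install anything, the same pattern-matching idea can be sketched in plain Python with a regular expression. This is only an illustration, not how seqkit works internally. The function names (find_motif, read_fasta) are made up for this example, and it assumes the genome fits in memory and uses IUPAC codes.

```python
import re

# IUPAC nucleotide codes translated to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif(motif, sequence):
    """Return sorted (start, end, strand) hits, 1-based inclusive, on the forward coordinates."""
    sequence = sequence.upper()
    hits = set()
    for strand, pattern in (("+", motif.upper()), ("-", reverse_complement(motif))):
        # The lookahead lets overlapping occurrences be reported too.
        regex = re.compile("(?=" + "".join(IUPAC[base] for base in pattern) + ")")
        for match in regex.finditer(sequence):
            hits.add((match.start() + 1, match.start() + len(pattern), strand))
    return sorted(hits)


def read_fasta(path):
    """Yield (name, sequence) pairs from a FASTA file."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                name = line[1:].strip()
                chunks = []
            elif line:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


# Toy example: the motif embedded in a short made-up sequence.
toy = "TTT" + "GGTTGT" + "ACGTACGTA" + "ACAACC" + "GGG"
toy_hits = find_motif("GGTTGTNNNNNNNNNACAACC", toy)
print(toy_hits)

# For the real genome, something like:
# for name, seq in read_fasta("mtbgenome.fna"):
#     for start, end, strand in find_motif("GGTTGTNNNNNNNNNACAACC", seq):
#         print(name, start, end, strand)
```

Note that this particular motif is its own reverse complement, so every site is reported once per strand.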