Hello,
If I want to apply the BlastP -db_soft_mask flag using BLAST+ v 2.10.1, am I correct in thinking that I will need to build a masked database first (i.e. I can't apply the flag to a standard DB made using makeblastdb)? Also, this link says:
Database masking has two modes. The first is known as "soft-masking", and BLAST uses the database mask only during the (initial) word-finding phase of BLAST. The second is known as "hard-masking", and BLAST uses the database mask during all phases of the search.
To clarify, does this mean that when a hit is found in the soft-masked database, the hit sequence is 'demasked' and analysed for HSPs in full?
Thanks!
You do not need to do anything different when building a BLAST database - masking is done on the fly.
Thanks for your reply! The flag needs a string argument from what I can tell:
I think the string is the masking algorithm ID that accompanies a masked database (if you make one), from here:
The link is explicitly using a masked database that was made here.
What would the string be if you didn't make a masked DB?
BLAST used to have dust and seg algorithms for masking DNA and protein sequences, respectively. For those two there was no need to change anything during the database creation process. Now it seems that -dust and -db_soft_mask are separate options. My apologies, I should have looked into this more carefully before commenting.
No worries - I think dust might not be included in the BLASTP flags now, although seg still is (for queries only). Dust is still included as a program in the BLAST+ package though, which suggests its intended use is in creating a masked database before the run!
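For anyone landing here later, this is a rough sketch of the masked-database workflow as I understand it from the BLAST+ user manual: compute SEG masks, attach them when building the database, look up the algorithm ID, then pass that ID to -db_soft_mask. The file names, the `masked_db_commands` helper and the ID 21 are my own placeholders, not something confirmed in this thread, so check the ID that `blastdbcmd -info` reports for your database. The script only prints the command lines and does not run anything.

```python
import shlex


def masked_db_commands(fasta, db_name, query, algo_id):
    """Return the four BLAST+ command lines as argument lists (nothing is executed)."""
    mask_file = f"{db_name}_seg.asnb"  # hypothetical output name for the mask data
    return [
        # 1. Compute SEG mask locations for the protein FASTA.
        ["segmasker", "-in", fasta, "-infmt", "fasta", "-parse_seqids",
         "-outfmt", "maskinfo_asn1_bin", "-out", mask_file],
        # 2. Build the database and attach the mask data to it.
        ["makeblastdb", "-in", fasta, "-dbtype", "prot", "-parse_seqids",
         "-mask_data", mask_file, "-out", db_name],
        # 3. Print database info, which should list the filtering algorithm IDs.
        ["blastdbcmd", "-db", db_name, "-info"],
        # 4. Search, passing the algorithm ID as the -db_soft_mask string.
        ["blastp", "-query", query, "-db", db_name,
         "-db_soft_mask", str(algo_id), "-out", "results.txt"],
    ]


if __name__ == "__main__":
    # 21 is an assumed ID for seg; use whatever step 3 reports for your database.
    for cmd in masked_db_commands("proteins.fasta", "proteins_masked", "query.fasta", 21):
        print(shlex.join(cmd))
```

Run the printed commands in order. If step 3 lists no filtering algorithms, the mask data was not attached to the database and there is no ID to give to -db_soft_mask.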