masking low complexity regions in a protein database for BLAST


6.6 years ago

lagartija ▴ 160

Hello everyone,

I have a question that might interest people who build their own BLAST protein databases. I see there is an option to mask low-complexity regions with segmasker before (or after) making a database. How useful is that for proteins? I can see why it would reduce false positives in nucleotide databases, but what about proteins? Are low-complexity regions really an issue there?

Thank you very much for your input. Cheers,

blast • 1.8k views



They are not really an "issue": low-complexity filtering is actually turned off by default in blastp. It can still prove useful in some cases, for example with very repetitive sequences.

In general I would not advise masking or filtering a database when formatting it, if only for the simple reason that if you later want to run an unfiltered BLAST, you will need to remake the database. In any case, you can always turn filtering on when running the BLAST search itself.
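As a rough illustration of turning filtering on at search time, here is a minimal Python sketch that builds two blastp command lines against the same unmasked database, one with SEG filtering and one without. The `-seg` option is the BLAST+ blastp switch for SEG filtering of the query; the helper name `build_blastp_command`, the file names, and the database name are made up for the example, and the tabular output format is just one possible choice.

```python
def build_blastp_command(query, db, out, seg=True):
    """Return the argument list for a blastp run, with SEG filtering on or off."""
    return [
        "blastp",
        "-query", query,
        "-db", db,
        "-out", out,
        "-outfmt", "6",  # tabular output, chosen only for this example
        # -seg switches SEG low-complexity filtering of the query on or off
        # at search time; the database itself is left untouched.
        "-seg", "yes" if seg else "no",
    ]


# Hypothetical file and database names, for illustration only.
filtered = build_blastp_command("query.faa", "my_proteins", "hits_seg.tsv", seg=True)
unfiltered = build_blastp_command("query.faa", "my_proteins", "hits_noseg.tsv", seg=False)

for cmd in (filtered, unfiltered):
    # Print the command line; pass the list to subprocess.run(cmd, check=True)
    # to actually run it where BLAST+ is installed.
    print(" ".join(cmd))
```

Both commands use the same database, so comparing the two result files shows whether filtering changes the hits for your data without any reformatting.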

6.6 years ago by lieven.sterck 15k
