5.9 years ago
lagartija
Hello everyone,
I have a question that might interest people building their own BLAST protein database. I see there is an option to mask low-complexity regions with segmasker before (or after) making a database. How useful is that for proteins? I can see why it would really reduce false positives in nucleotide databases, but what about proteins? Are low-complexity regions really an issue?
Thank you very much for your judgment, Cheers,
They are not really an "issue", and low-complexity filtering is actually turned off by default when running blastp. In some cases it might prove useful though (for example, with very repetitive sequences).
In general I would not advise masking/filtering the DBs when formatting them, if only for the simple reason that when you later want to do an unfiltered blast you'll need to remake your DB. In any case, you can always turn filtering on when doing the actual blast itself.
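
To illustrate turning filtering on at search time, here is a minimal sketch that only builds the blastp command line. It assumes BLAST+ is installed; the file and database names (`query.faa`, `my_proteins`, `hits.tsv`) and the helper `build_blastp_command` are made up for the example. `-seg` is the blastp option that controls SEG filtering of the query.

```python
import shlex


def build_blastp_command(query, db, out, low_complexity_filter=False):
    """Return a blastp argument list (hypothetical helper, placeholder names)."""
    cmd = ["blastp", "-query", query, "-db", db, "-out", out, "-outfmt", "6"]
    # -seg toggles SEG low-complexity filtering of the query sequence;
    # blastp leaves it off unless asked, so we pass it explicitly either way.
    cmd += ["-seg", "yes" if low_complexity_filter else "no"]
    return cmd


# Same unmasked database, two searches: one unfiltered, one filtered.
unfiltered = build_blastp_command("query.faa", "my_proteins", "hits_raw.tsv")
filtered = build_blastp_command("query.faa", "my_proteins", "hits_seg.tsv", True)

print(shlex.join(unfiltered))
print(shlex.join(filtered))
```

Either list can be handed to `subprocess.run(cmd, check=True)` to run the search. The point is that the database stays unmasked and the choice is made per search.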