Repeat masked vs Repeat unmasked genome assembly
2.6 years ago

Hi everyone. Can you please help me with this: which assembly is advisable, repeat-masked or repeat-unmasked, when you want to run BLAST searches against a genome assembly (searching for genes in a particular genome)? MycoCosm provides both repeat-masked and repeat-unmasked assemblies, so I'm not sure which one to use. Your help would be greatly appreciated.

2.6 years ago

I would start with the masked version (provided the masking was done well, i.e. it is accurate and did not produce too many false positives). That will 'free up' space in your BLAST hit list for the meaningful hits (the protein-coding ones, for instance). If you use the unmasked version, a lot of hits against transposable elements (TEs) will be reported.

If you notice that you are missing obvious hits with the masked version, you can switch to the unmasked one.
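
One quick way to judge whether the masking looks reasonable is to check how much of each scaffold is actually masked. Here is a minimal Python sketch. It assumes soft-masked bases are lowercase and hard-masked bases are `N`; the function name `masked_fraction` is just something I made up for illustration. Note that `N` can also mark assembly gaps, so treat the number as a rough upper bound.

```python
def masked_fraction(fasta_text):
    """Return {sequence_id: fraction of masked bases} for FASTA-formatted text.

    Assumption: lowercase letters are soft-masked repeats and 'N' is
    hard-masked sequence (it may also be an assembly gap).
    """
    counts = {}  # sequence_id -> [masked_bases, total_bases]
    seq_id = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence ID
            seq_id = line[1:].split()[0]
            counts[seq_id] = [0, 0]
        elif seq_id is not None:
            masked = sum(1 for base in line if base.islower() or base == "N")
            counts[seq_id][0] += masked
            counts[seq_id][1] += len(line)
    return {
        sid: (masked / total if total else 0.0)
        for sid, (masked, total) in counts.items()
    }
```

To use it on a file, read the assembly with `open(path).read()` and pass the text in. If a large share of a scaffold where you expect genes is masked, that is a hint to also try the unmasked assembly.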


Alright, thank you so much for this information.


To add to this, you can also look for pseudogenes in the annotation file (i.e. the GFF file) and mask them to get more specific hits.
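
A rough sketch of how that could be done in Python, assuming the annotation labels these features with the type `pseudogene` in column 3 of the GFF (some annotations flag pseudogenes differently, e.g. in the attributes column, so check your file first). The helper names `pseudogene_intervals` and `hard_mask` are hypothetical, not from any library.

```python
def pseudogene_intervals(gff_text):
    """Collect (start, end) intervals of pseudogene features per sequence.

    Assumption: pseudogenes have the feature type 'pseudogene' in column 3.
    """
    intervals = {}
    for line in gff_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and GFF comments/directives
        fields = line.split("\t")
        if len(fields) < 9 or fields[2] != "pseudogene":
            continue
        seqid, start, end = fields[0], int(fields[3]), int(fields[4])
        intervals.setdefault(seqid, []).append((start, end))
    return intervals


def hard_mask(sequence, intervals):
    """Replace the bases inside each interval with 'N'.

    GFF coordinates are 1-based and inclusive, so they are shifted
    to 0-based Python indices here.
    """
    bases = list(sequence)
    for start, end in intervals:
        for i in range(max(start, 1) - 1, min(end, len(bases))):
            bases[i] = "N"
    return "".join(bases)
```

You would call `pseudogene_intervals` on the GFF text, then `hard_mask` on each scaffold with its own interval list, and write the result back out as FASTA before building the BLAST database.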
