Alternatives to ProtExcluder in repeat annotation/MAKER?
21 months ago
evoecogen ▴ 30

Hi,

ProtExcluder.pl is a classic pipeline used to exclude homologs of annotated proteins from a library of interspersed repeat elements prior to repeat annotation. The goal is to avoid overmasking CDS in the target genome (i.e., misinterpreting genes as parts of interspersed repeats).

Are there any other tools for the same purpose?

Unfortunately, I have been struggling to deploy this pipeline for days (https://github.com/NBISweden/ProtExcluder/issues), and it eliminates two-thirds of my repeat library.

repeats esl-fetch annotation ProtExcluder MAKER • 1.1k views

What species are you working on?


It is a non-model rodent (https://www.ncbi.nlm.nih.gov/assembly/GCA_026167925.1).

Everything else that you described in your post works just fine (using your conda env), but it then fails at the ProtExcluder stage. When I tried going through it script by script, something weird happened with esl-fetch (which I have installed system-wide).
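
A quick way to check whether an intermediate step produced real sequences is to scan its output for lines that are neither FASTA headers nor sequence. The sketch below is only an illustration: the function name `suspicious_fasta_lines`, the allowed-character set, and the sample lines (including the error text) are made up for the example and are not taken from esl-fetch or ProtExcluder.

```python
import re

# Assumption: sequence lines contain only IUPAC nucleotide codes, gaps, or '*'.
VALID_SEQ = re.compile(r"^[ACGTUNRYKMSWBDHVacgtunrykmswbdhv*-]+$")


def suspicious_fasta_lines(lines):
    """Return (line_number, text) for lines that are neither headers nor sequence."""
    problems = []
    seen_header = False
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line:
            continue  # blank lines are tolerated
        if line.startswith(">"):
            seen_header = True
        elif not seen_header or not VALID_SEQ.match(line):
            # Text before the first header, or text that is not sequence-like.
            problems.append((number, line))
    return problems


# Hypothetical file content; the error message is invented for illustration.
example = [
    ">seq1\n",
    "ACGTNNAC\n",
    "ERROR: made-up message for illustration\n",
    ">seq2\n",
    "ACGT\n",
]
problems = suspicious_fasta_lines(example)
for number, text in problems:
    print(f"line {number}: {text}")
```

With a real file you would pass an open file handle instead of the `example` list; any reported line is a hint that a tool wrote messages where sequences were expected.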


From what you say, it does not fail; it just removes more than you would expect. Maybe this is due to some particularity of your study species.


Doubt it. One, it is a rodent, so there is nothing too peculiar about it. Two, as I said, there is a very specific problem with esl-fetch: its output file contains a bunch of error messages rather than the kind of output one would expect. Anyway, to answer my own question, I found a viable alternative: https://blaxter-lab-documentation.readthedocs.io/en/latest/filter-repeatmodeler-library.html
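
For anyone landing here later, the general idea of filtering a repeat library against proteins can be sketched in a few lines of Python. This is a minimal illustration under my own assumptions, not the code from the linked documentation and not what ProtExcluder does internally: it assumes you already have BLAST tabular output (`-outfmt 6`) from searching the library against a protein set, and it simply drops every library entry with a hit at or below an e-value cutoff. The function names, the `1e-10` cutoff, and the sample records are all hypothetical.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) tuples from a FASTA handle."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def ids_with_protein_hits(blast_tab, max_evalue=1e-10):
    """Collect query IDs from BLAST outfmt 6 lines with e-value <= max_evalue."""
    hits = set()
    for line in blast_tab:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # outfmt 6 default columns: qseqid is field 1, evalue is field 11.
        query_id, evalue = fields[0], float(fields[10])
        if evalue <= max_evalue:
            hits.add(query_id)
    return hits


def filter_library(fasta_handle, hit_ids):
    """Keep only library entries whose ID has no qualifying protein hit."""
    return [(h, s) for h, s in read_fasta(fasta_handle) if h.split()[0] not in hit_ids]


# Made-up repeat library and BLAST hits, for illustration only.
library = io.StringIO(
    ">rnd-1_family-1#LINE/L1\nACGTACGTAC\n"
    ">rnd-1_family-2#Unknown\nGGGCCCAAAT\n"
    ">rnd-1_family-3#Unknown\nTTTTAAAACC\n"
)
blast_hits = [
    "rnd-1_family-2#Unknown\tprotA\t85.0\t120\t18\t0\t1\t360\t1\t120\t1e-50\t200\n",
    "rnd-1_family-3#Unknown\tprotB\t30.0\t40\t28\t0\t1\t120\t1\t40\t0.5\t30\n",
]

kept = filter_library(library, ids_with_protein_hits(blast_hits))
for header, seq in kept:
    print(f">{header}\n{seq}")
```

Here only the second entry is dropped, because its hit passes the cutoff while the third entry's hit (e-value 0.5) does not. Note that the protein set should not itself contain transposon-derived proteins, otherwise genuine repeats get thrown out, which is one possible way to end up losing a large share of the library.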
