Question

Pirate transposable elements pipeline

5.6 years ago

Chironex ▴ 50

Hello, I'm analysing the unassembled genome of a cephalopod species. I'm currently using RepeatMasker, but I'd like to try other pipelines. I read about PiRATE here: https://www.seanoe.org/data/00406/51795/. Does anyone know it, or has anyone used it? Would you recommend it? Thank you.

transposable elements • 1.7k views

updated 5.5 years ago by berthelier.j ▴ 20 • written 5.6 years ago by Chironex ▴ 50

score 3 · Answer 1 · 2019-05-04

Hi frida,

In practice, judging from Fig. 1 of the PiRATE paper, you'll need an assembly in addition to your unassembled reads. Another critical point is that the whole pipeline is wrapped in a Galaxy instance bundled in the VM file, which can be hard to deploy (for example, our HPC cluster does not support VMs). The last time I discussed this with the author, a very nice guy by the way, he was thinking of switching to Docker, so you may want to contact him if this is an issue for you. Once the VM is up and running, be aware that there might be issues with the keyboard settings (the default layout was French, if I remember correctly).

Depending on your needs, for example if you want to characterize TEs from the unassembled reads of your species or build a consensus library, the options are:

dnaPipeTE (also implemented in PiRATE)
RepeatExplorer (also implemented in PiRATE, can be run on their public Galaxy server)
RepARK (also implemented in PiRATE)
REPdenovo (can generate longer repeats)

I would give the first two a try. For more tools that work on raw reads, see this comprehensive review: Goerner-Potvin, P., and Bourque, G. (2018). Computational tools to unmask transposable elements. Nat Rev Genet 19, 688-704.

Hope it helps. :-)

score 2 · Answer 2 · 2019-06-03

Hi frida, Hi SMK,

As SMK explained, PiRATE has to be used through a Linux virtual machine (named PiRATE-VM). In the VM, all the tools are installed (RepeatScout, REPET, ...) and automated in a local Galaxy instance (named PiRATE-Galaxy). PiRATE-Galaxy is a platform that provides various tools to detect, classify and annotate TEs.
PiRATE is flexible and you can run each tool on its own. So if you only have an assembled genome, you can still use PiRATE (you will just skip the tools that require NGS data).

You can also use NGS data with some of the tools (RepeatExplorer, dnaPipeTE and RepARK), as SMK explained. If you have any questions, you can leave us a message on the PiRATE GitHub: https://github.com/JBerthelier/PiRATE/issues
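To make the reads-versus-assembly split concrete, here is a minimal sketch of which of the tools named in this thread apply to which kind of data. It is not part of PiRATE: the function `usable_tools` and the dictionary `TOOLS_BY_INPUT` are hypothetical names, and placing RepeatScout and REPET on the assembly side is an assumption based on how those tools are normally run.

```python
from typing import List

# Hypothetical helper, not part of PiRATE: groups the tools named in this
# thread by the kind of input they expect.
TOOLS_BY_INPUT = {
    # Tools the thread describes as working on NGS reads.
    "reads": ["RepeatExplorer", "dnaPipeTE", "RepARK"],
    # Assumption: RepeatScout and REPET are run on assembled sequences.
    "assembly": ["RepeatScout", "REPET"],
}


def usable_tools(has_reads: bool, has_assembly: bool) -> List[str]:
    """Return the tools from this thread that match the available data."""
    tools: List[str] = []
    if has_reads:
        tools += TOOLS_BY_INPUT["reads"]
    if has_assembly:
        tools += TOOLS_BY_INPUT["assembly"]
    return tools


if __name__ == "__main__":
    # Example: only unassembled reads are available.
    for name in usable_tools(has_reads=True, has_assembly=False):
        print(name)
```

With reads only, the sketch lists the three read-based tools; with an assembly only, it lists the assembly-based ones.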

As SMK said, a new version of PiRATE that runs through Docker is planned by my previous lab (where I developed PiRATE).

Best,

Jérémy (author of PiRATE)