How to annotate LTR retrotransposons at the lineage level using LTRdigest GFF output based on HMM domains
6.4 years ago
gabri ▴ 60

Hi All,

I have run LTRharvest and LTRdigest on my assembly. Specifically, I used the -hmms option of LTRdigest with the complete GyDB collection of HMM profiles to annotate the identified elements. Now I have a GFF output. The data below are for a single LTR retrotransposon.

As the annotated domains show, this is probably a Copia-related element. However, I'd like to annotate every element at the lineage level (sire, hydra, retrofit, oryco, or whatever), and this mixed annotation, with profiles from several lineages matching each domain, is not very helpful for that. Has anyone else faced this problem? Do you have an idea of how to assign a single lineage to each element? Should I do a preliminary filtering based on the e-value (6th column) and the domain length, or on something else?

Thank you very much for any advice.

seq13 LTRharvest repeat_region 95640 101118 . - . ID=repeat_region2

seq13 LTRharvest target_site_duplication 95640 95644 . - . Parent=repeat_region2

seq13 LTRharvest inverted_repeat 95645 95646 . - . Parent=repeat_region2

seq13 LTRharvest LTR_retrotransposon 95645 101113 . - . ID=LTR_retrotransposon2

seq13 LTRharvest long_terminal_repeat 95645 96356 . - . Parent=LTR_retrotransposon2

seq13 LTRdigest protein_match 96438 96816 2.80E-19 - . name=RNaseH_pseudovirus

seq13 LTRdigest protein_match 96438 96834 6.90E-35 - . name=RNaseH_hydra

seq13 LTRdigest protein_match 96438 96837 2.20E-42 - . name=RNaseH_copia

seq13 LTRdigest protein_match 96438 96837 0 - . name=RNaseH_oryco

seq13 LTRdigest protein_match 96438 96837 2.20E-41 - . name=RNaseH_retrofit

seq13 LTRdigest protein_match 96438 96837 3.41E-43 - . name=RNaseH_pCretro

seq13 LTRdigest protein_match 96438 96837 0 - . name=RNaseH_sire

seq13 LTRdigest protein_match 96438 96840 0 - . name=RNaseH_tork

seq13 LTRdigest protein_match 96489 96759 1.50E-06 - . name=RNaseH_codi_II

seq13 LTRdigest protein_match 97143 97848 0 - . name=RT_copia

seq13 LTRdigest protein_match 97143 97878 0 - . name=RT_pCretro

seq13 LTRdigest protein_match 97143 97878 0 - . name=RT_hydra

seq13 LTRdigest protein_match 97143 97878 0 - . name=RT_sire

seq13 LTRdigest protein_match 97143 97878 0 - . name=RT_tork

seq13 LTRdigest protein_match 97167 97878 4.80E-41 - . name=RT_pseudovirus

seq13 LTRdigest protein_match 97218 97878 0 - . name=RT_oryco

seq13 LTRdigest protein_match 97227 97878 0 - . name=RT_retrofit

seq13 LTRdigest protein_match 98442 99084 0 - . name=INT_tork

seq13 LTRdigest protein_match 98460 98874 6.60E-30 - . name=INT_pCretro

seq13 LTRdigest protein_match 98472 99084 0 - . name=INT_copia

seq13 LTRdigest protein_match 98475 99084 0 - . name=INT_retrofit

seq13 LTRdigest protein_match 98484 99084 0 - . name=INT_oryco

seq13 LTRdigest protein_match 98511 98730 4.30E-12 - . name=INT_hydra

seq13 LTRdigest protein_match 98511 98997 4.00E-25 - . name=INT_sire

seq13 LTRdigest protein_match 98532 98880 1.30E-11 - . name=INT_pseudovirus

seq13 LTRdigest protein_match 99231 99474 0 - . name=AP_tork

seq13 LTRdigest protein_match 99240 99474 2.70E-06 - . name=AP_oryco

seq13 LTRdigest protein_match 99246 99474 1.30E-07 - . name=AP_retrofit

seq13 LTRdigest protein_match 99609 99957 2.90E-08 - . name=GAG_copia

seq13 LTRdigest protein_match 99609 100329 0 - . name=GAG_tork

seq13 LTRharvest long_terminal_repeat 100402 101113 . - . Parent=LTR_retrotransposon2

seq13 LTRharvest inverted_repeat 96355 96356 . - . Parent=repeat_region2

seq13 LTRharvest inverted_repeat 100402 100403 . - . Parent=repeat_region2

seq13 LTRharvest inverted_repeat 101112 101113 . - . Parent=repeat_region2

seq13 LTRharvest target_site_duplication 101114 101118 . - . Parent=repeat_region2
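
The filtering idea raised in the question (e-value first, then domain length) could be sketched roughly as follows. This is only a minimal illustration, not an established method: the function names, the `max_evalue` cutoff of 1e-5, and the "one vote per domain" rule are all assumptions made for the example. It also assumes the rows passed in belong to a single element and look like the ones posted above (whitespace-separated, with a `name=DOMAIN_lineage` attribute). Note that many hits here report an e-value of 0, so the length tie-break does much of the work.

```python
from collections import Counter


def parse_protein_matches(gff_lines):
    """Yield (start, end, evalue, domain, lineage) for LTRdigest protein_match rows."""
    for line in gff_lines:
        fields = line.split()
        if len(fields) < 9 or fields[1] != "LTRdigest" or fields[2] != "protein_match":
            continue
        start, end, evalue = int(fields[3]), int(fields[4]), float(fields[5])
        # Attributes are 'key=value' pairs separated by ';'
        attrs = dict(kv.split("=", 1) for kv in fields[8].split(";") if "=" in kv)
        name = attrs.get("name", "")
        if "_" not in name:
            continue
        # 'RNaseH_codi_II' -> domain 'RNaseH', lineage 'codi_II'
        domain, lineage = name.split("_", 1)
        yield start, end, evalue, domain, lineage


def vote_lineage(gff_lines, max_evalue=1e-5):
    """Count, per lineage, the domains for which it gives the best hit.

    'Best' means lowest e-value, with ties broken by the longest match.
    Lineages that remain tied for a domain each receive one vote.
    max_evalue is a hypothetical cutoff chosen for illustration.
    """
    best = {}  # domain -> ((evalue, -length), {lineages})
    for start, end, evalue, domain, lineage in parse_protein_matches(gff_lines):
        if evalue > max_evalue:
            continue
        key = (evalue, -(end - start + 1))
        if domain not in best or key < best[domain][0]:
            best[domain] = (key, {lineage})
        elif key == best[domain][0]:
            best[domain][1].add(lineage)
    votes = Counter()
    for _key, lineages in best.values():
        votes.update(lineages)
    return votes


# A few rows copied from the element above
example = [
    "seq13 LTRdigest protein_match 96438 96837 0 - . name=RNaseH_sire",
    "seq13 LTRdigest protein_match 96438 96840 0 - . name=RNaseH_tork",
    "seq13 LTRdigest protein_match 98442 99084 0 - . name=INT_tork",
    "seq13 LTRdigest protein_match 98472 99084 0 - . name=INT_copia",
    "seq13 LTRdigest protein_match 99609 99957 2.90E-08 - . name=GAG_copia",
    "seq13 LTRdigest protein_match 99609 100329 0 - . name=GAG_tork",
]
print(vote_lineage(example).most_common())
```

A lineage that collects the most domain votes would be the candidate assignment; elements where two lineages end up with similar counts are probably better left at the superfamily level or checked by hand.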

LTR-retrotransposons lineage LTR-digest • 1.7k views
5.4 years ago
olo0002 • 0

Hi, were you able to sort this out? If so, could you tell me how to go about it?

Thanks
