LTR insertion time
6.6 years ago
alim.hcu ▴ 20

Dear users,

Does anyone have a suggestion for a script or pipeline to calculate LTR insertion time? I think codeml or PAML will not work here. I have already wasted a lot of time on this. Please help me.

Thank you

genome • 4.1k views
ADD COMMENT

You will waste more time unless you explain your problem in detail. What kind of data do you have? What have you tried so far?

Have a look at this tutorial: Ageing LTR insertions. Does it help?

ADD REPLY

There is some additional info here: make fasta sequence which is multiple of three ;)

Interesting link, by the way, @h.mon

ADD REPLY

I have already tried it and I am getting this error:

"LTR_pairwise_differences.py:44: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  seqsdiff['pair'] = seqsdiff['ltrone'] + seqsdiff['ltrtwo']"
ADD REPLY

That's a warning, not an error. It's generated by the pandas Python module.

ADD REPLY

My pipeline was: 1) pairwise alignment of the LTR sequences using MUSCLE; 2) now I have to determine the Ks value for each pairwise alignment file.

ADD REPLY

Is the error you reported above related to the Python script in the last part of the pipeline? More specifically, this one?

I would also be tempted to go for Ks estimation (lots of software can do this), but it is probably not the best way forward, as Ks is meant for comparisons of protein-coding sequences. LTRs do not encode a protein, so there is no selection pressure on a protein product, and the basic assumption behind Ks is not valid. I would thus go for the 'difference' calculations described on the GitHub page.
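
To make the 'difference' idea concrete, here is a minimal Python sketch. It is not the script from the GitHub page; the function name pairwise_difference is made up for illustration, and it assumes the two LTRs of one element have already been aligned to equal length (for example with MUSCLE). Columns containing a gap or an ambiguous base are skipped.

```python
def pairwise_difference(ltr5, ltr3):
    """Proportion of differing sites between two aligned LTR sequences.

    Assumption: both strings come from the same pairwise alignment, so they
    have equal length. Columns with gaps or non-ACGT symbols are ignored.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0    # columns where both sequences have an unambiguous base
    mismatches = 0  # compared columns where the two bases differ
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a != b:
            mismatches += 1

    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    return mismatches / compared


# Toy example: one mismatch in ten aligned sites
print(pairwise_difference("ACGTACGTAC", "ACGTACGTAT"))
```

This raw proportion underestimates divergence for older insertions because it does not correct for multiple substitutions at the same site, which is why a substitution model is usually applied on top of it.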

ADD REPLY

The first thing to understand is that SettingWithCopyWarning is a warning, not an error. The underlying problem is that it is generally difficult to predict whether an indexing operation returns a view or a copy. In most cases the warning is raised because you have chained two indexing operations together; SettingWithCopyWarning was created to flag such "chained assignment" operations. This is easier to spot when you have used [] (square brackets) twice, but the same applies if you used other access methods such as .loc[], .iloc[] and so on.

Moreover, you can change the behaviour of SettingWithCopyWarning using pd.options.mode.chained_assignment, which accepts three options: None, "raise" or "warn".

ADD REPLY

One recently published paper estimated the insertion time as follows: "DNA divergence between the sequence was estimated with the baseml program from PAML ver. 4.8 (Yang 2007) using the Kimura-2-parameter base substitution model". Since I am new to bioinformatics, and especially to evolutionary analysis, I do not understand how they used the baseml output to calculate LTR insertion time.

Thanks for your time, guys.

ADD REPLY
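
For readers with the same question: the thread does not say exactly how that paper converted the baseml output, but the commonly used approach is to take the Kimura 2-parameter distance K between the two LTRs of one element and compute T = K / (2r), where r is the substitution rate per site per year. The factor 2 is there because both LTRs accumulate substitutions independently after insertion. Below is a minimal Python sketch of that calculation, assuming an already aligned LTR pair. The function names and the rate of 1.3e-8 are placeholders for illustration only; use a rate appropriate for your species.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences.

    K = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the
    proportions of transitions and transversions among compared sites.
    Columns with gaps or non-ACGT symbols are ignored.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if sites == 0:
        raise ValueError("no comparable sites in the alignment")

    p = transitions / sites
    q = transversions / sites
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))


def insertion_time_years(distance, rate_per_site_per_year):
    """T = K / (2r); the rate is an input you must choose for your species."""
    return distance / (2 * rate_per_site_per_year)


# Toy example: 100 aligned sites with 5 transitions (A<->G)
k = k2p_distance("A" * 100, "G" * 5 + "A" * 95)
# 1.3e-8 is only a placeholder rate, not a recommendation
print(k, insertion_time_years(k, 1.3e-8))
```

If you run baseml instead, the pairwise distance it reports under the K80 model plays the role of K in the same formula.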
4.8 years ago

The LTRpred pipeline now includes a rough dating of insertion time when de novo annotating functional and potentially mobile LTR retrotransposons.

In case you don't wish to de novo annotate an entire genome, but want to work with existing annotations, the function LTRpred::ltr_age_estimation() may be useful for this purpose.

In general, an example LTRpred run and the corresponding output look as follows:

# de novo functional annotation of LTR retrotransposons in the Human Y chromosome
LTRpred::LTRpred(genome.file = system.file("Hsapiens_ChrY.fa", package = "LTRpred"))

The output table then includes the following information:

Observations: 21
Variables: 92
$ species                 <chr> "Hsapiens_ChrY", "Hsapien…
$ ID                      <chr> "Hsapiens_ChrY_LTR_retrot…
$ dfam_target_name        <chr> NA, NA, NA, NA, NA, NA, N…
$ ltr_similarity          <dbl> 80.73, 89.85, 79.71, 83.2…
$ ltr_age_mya             <dbl> 0.7936246, 0.2831139, 0.7…
$ similarity              <chr> "(80,82]", "(88,90]", "(7…
$ protein_domain          <chr> "RVT_1", "RVT_1", NA, NA,…
$ orfs                    <int> 1, 1, 0, 0, 0, 0, 0, 1, 0…
$ chromosome              <chr> "NC000024.10Homosa", "NC0…
$ start                   <int> 3143582, 3275798, 3313536…
$ end                     <int> 3162877, 3299928, 3318551…
$ strand                  <chr> "-", "-", "+", "+", "-", …
$ width                   <int> 19296, 24131, 5016, 12952…
$ annotation              <chr> "LTR_retrotransposon", "L…
$ pred_tool               <chr> "LTRpred", "LTRpred", "LT…
$ frame                   <chr> ".", ".", ".", ".", ".", …
$ score                   <chr> ".", ".", ".", ".", ".", …
$ lLTR_start              <int> 3143582, 3275798, 3313536…
$ lLTR_end                <int> 3143687, 3276408, 3313665…
$ lLTR_length             <int> 106, 611, 130, 126, 218, …
$ rLTR_start              <int> 3162769, 3299338, 3318414…
$ rLTR_end                <int> 3162877, 3299928, 3318551…
$ rLTR_length             <int> 109, 591, 138, 137, 219, …
$ lTSD_start              <int> 3143578, 3275794, 3313532…
$ lTSD_end                <int> 3143581, 3275797, 3313535…
$ lTSD_motif              <chr> "acag", "ttgt", "ttag", "…
$ rTSD_start              <int> 3162878, 3299929, 3318552…
$ rTSD_end                <int> 3162881, 3299932, 3318555…
$ rTSD_motif              <chr> "acag", "ttgt", "ttag", "…
$ PPT_start               <int> NA, NA, NA, NA, NA, 34660…
$ PPT_end                 <int> NA, NA, NA, NA, NA, 34660…
$ PPT_motif               <chr> NA, NA, NA, NA, NA, "agag…
$ PPT_strand              <chr> NA, NA, NA, NA, NA, "+", …
$ PPT_offset              <int> NA, NA, NA, NA, NA, 23, N…
$ PBS_start               <int> NA, NA, 3313667, 3372512,…
$ PBS_end                 <int> NA, NA, 3313677, 3372522,…
$ PBS_strand              <chr> NA, NA, "+", "+", "-", "+…
$ tRNA                    <chr> NA, NA, "Homo_sapiens_tRN…
$ tRNA_motif              <chr> NA, NA, "aattagctgga", "c…
$ PBS_offset              <int> NA, NA, 1, 3, 0, 5, 2, 5,…
$ tRNA_offset             <int> NA, NA, 1, 0, 2, 5, 1, 5,…
$ `PBS/tRNA_edist`        <int> NA, NA, 1, 1, 1, 1, 1, 1,…
$ orf.id                  <chr> "NC000024.10Homosa_314358…
$ repeat_region_length    <int> 19304, 24139, 5024, 12960…
$ PPT_length              <int> NA, NA, NA, NA, NA, 27, N…
$ PBS_length              <int> NA, NA, 11, 11, 11, 11, 1…
$ dfam_acc                <chr> NA, NA, NA, NA, NA, NA, N…
$ dfam_bits               <dbl> NA, NA, NA, NA, NA, NA, N…
$ dfam_e_value            <dbl> NA, NA, NA, NA, NA, NA, N…
$ dfam_bias               <dbl> NA, NA, NA, NA, NA, NA, N…
$ `dfam_hmm-st`           <dbl> NA, NA, NA, NA, NA, NA, N…
$ `dfam_hmm-en`           <dbl> NA, NA, NA, NA, NA, NA, N…
$ dfam_strand             <chr> NA, NA, NA, NA, NA, NA, N…
$ `dfam_ali-st`           <dbl> NA, NA, NA, NA, NA, NA, N…
$ `dfam_ali-en`           <dbl> NA, NA, NA, NA, NA, NA, N…
$ `dfam_env-st`           <dbl> NA, NA, NA, NA, NA, NA, N…
$ `dfam_env-en`           <dbl> NA, NA, NA, NA, NA, NA, N…
$ dfam_modlen             <dbl> NA, NA, NA, NA, NA, NA, N…
$ dfam_target_description <chr> NA, NA, NA, NA, NA, NA, N…
$ Clust_Cluster           <chr> NA, NA, NA, NA, NA, NA, N…
$ Clust_Target            <chr> NA, NA, NA, NA, NA, NA, N…
$ Clust_Perc_Ident        <dbl> NA, NA, NA, NA, NA, NA, N…
$ Clust_cn                <int> NA, NA, NA, NA, NA, NA, N…

As you can see, the column ltr_age_mya stores the rough estimate of the insertion time in millions of years.

I hope this helps.

ADD COMMENT
6.6 years ago

Did you mean age estimation of an extant LTR retrotransposon (i.e. the time passed since insertion)? If you are interested in a method based on 5'-3' LTR homology, I think you can find a function that calculates it in this as yet unpublished R tool from my collaborator: https://github.com/HajkD/LTRpred

ADD COMMENT

Interesting, thanks for sharing.

This sentence does not sound encouraging, though:

age estimation of predicted LTR retrotransposons in Mya (not implemented yet, but soon to come..)

ADD REPLY
