Transposable Elements positions in genome (GTF file)
9.7 years ago

Hi All

I need a GTF file containing the positions of known transposable elements in the sheep genome, but unfortunately I have not been able to find one yet. My question is:

Is there a database of transposable elements in mammalian genomes, or how else can I find these data (for example for the sheep, cattle or human genome)?

Best

Transposable-elements genome annotation • 8.0k views
6.1 years ago
mollitz ▴ 90

The Hammell lab provides GTF annotations:

GTF files for gene annotation can be obtained from UCSC RefSeq, Ensembl, iGenomes or other annotation databases. GTF files for TE annotations are customly generated from UCSC RepeatMasker or other annotation database. They contain two custom attributes, class_id and family_id, corresponding to the class (e.g. LINE) and family (e.g. L1) of the corresponding transposable element. A unique ID (e.g. L1Md_Gf_dup1) is also assigned for each TE annotation in the transcript_id attribute. Pre-generated TE GTF files are available for a number of organisms, and can be downloaded here. If the organism or genome build of your interest is not available, please contact us and provide a curated annotation of the transposable elements (e.g. genomic location and TE name/type). We will do our best to help you generate the suitable TE GTF file. (http://hammelllab.labsites.cshl.edu/software/#TEtranscripts)

You can download them here: http://labshare.cshl.edu/shares/mhammelllab/www-data/TEtranscripts/TE_GTF/
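If it helps, here is a minimal sketch of reading one of these TE GTF files in Python and pulling out the class_id, family_id and transcript_id attributes described above. The example record is made up for illustration (the coordinates, source column and gene_id value are assumptions, not copied from a real file), and `parse_te_gtf_line` is just a hypothetical helper name.

```python
import re

# Matches `key "value"` pairs in the ninth (attribute) column of a GTF line.
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')


def parse_te_gtf_line(line):
    """Return a dict for one TE GTF record, or None for comment/blank lines."""
    if not line.strip() or line.startswith("#"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    chrom, _source, _feature, start, end, _score, strand, _frame, attrs = fields
    attributes = dict(ATTR_RE.findall(attrs))
    return {
        "chrom": chrom,
        "start": int(start),  # GTF coordinates are 1-based, inclusive
        "end": int(end),
        "strand": strand,
        "transcript_id": attributes.get("transcript_id"),  # unique TE copy ID
        "family_id": attributes.get("family_id"),          # e.g. L1
        "class_id": attributes.get("class_id"),            # e.g. LINE
    }


# Illustrative record only; the values are invented for this sketch.
example = (
    "chr1\trmsk\texon\t3000001\t3000097\t.\t-\t.\t"
    'gene_id "L1Md_Gf"; transcript_id "L1Md_Gf_dup1"; '
    'family_id "L1"; class_id "LINE";'
)

record = parse_te_gtf_line(example)
print(record["class_id"], record["family_id"], record["transcript_id"])
```

For a real file you would loop over the lines of the downloaded GTF (opening it with `gzip.open(path, "rt")` if it is compressed) and keep the records whose class_id you care about.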


Hey,

5-6 years have passed. Is there one for the mouse genome from Ensembl, specifically GRCm39? The gene annotation (GTF) for it is Mus_musculus.GRCm39.112.gtf, but is there a corresponding TE annotation (GTF)?

Thanks, and these answers here are very helpful.

9.7 years ago

Transposable (aka "mobile") elements are categorized as repeats in most genomes, so you'll need to start by downloading the RepeatMasker track from UCSC. How you proceed from there depends entirely on (A) what you want to do and (B) what's known about the biology. In humans and mice, for example, it's known that some ERVs are still mobile, while others seem not to be. The simplest way to get a list of these is to filter the RepeatMasker track by homology, length, etc. You can find an example of that for the mouse here. That's based on some human settings that someone else came up with (I can probably dig up the reference if needed). Not all of the candidate regions will be mobile, but it's a useful list to start with.

Of course, if it's unknown which, if any, families of repeats are still mobile in sheep, then the best you can do is filter some likely candidate families based on other organisms.
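To make the filtering idea concrete, here is a rough sketch in Python. It assumes the 17-column layout of the UCSC rmsk table dump (bin, swScore, milliDiv, ..., repName, repClass, repFamily, ...), and it uses milliDiv (divergence from the consensus, in parts per thousand) as a stand-in for "homology". The thresholds and the demo rows are placeholders I made up, not the settings from the mouse example mentioned above.

```python
import csv
import io

# Column order assumed to match the UCSC rmsk table dump.
RMSK_COLUMNS = [
    "bin", "swScore", "milliDiv", "milliDel", "milliIns",
    "genoName", "genoStart", "genoEnd", "genoLeft", "strand",
    "repName", "repClass", "repFamily", "repStart", "repEnd",
    "repLeft", "id",
]


def filter_rmsk(handle, rep_class="LTR", max_milli_div=50, min_length=5000):
    """Yield repeats of one class that are long and close to their consensus.

    The default thresholds are arbitrary placeholders; tune them per species.
    """
    reader = csv.DictReader(handle, fieldnames=RMSK_COLUMNS, delimiter="\t")
    for row in reader:
        if row["bin"].startswith("#"):  # skip a "#bin ..." header line
            continue
        start, end = int(row["genoStart"]), int(row["genoEnd"])
        if (
            row["repClass"] == rep_class
            and int(row["milliDiv"]) <= max_milli_div
            and end - start >= min_length
        ):
            yield (row["genoName"], start, end, row["strand"],
                   row["repName"], row["repFamily"])


# Tiny made-up input: one long, low-divergence LTR element and two rejects.
demo_rows = [
    ["585", "20000", "12", "0", "0", "chr1", "1000", "7500", "-1000", "+",
     "IAPEz-int", "LTR", "ERVK", "1", "6500", "0", "1"],
    ["585", "300", "250", "10", "5", "chr1", "9000", "9300", "-500", "-",
     "B1_Mus1", "SINE", "Alu", "1", "300", "0", "2"],
    ["585", "900", "30", "0", "0", "chr1", "12000", "12400", "-100", "+",
     "RLTR10", "LTR", "ERVK", "1", "400", "0", "3"],
]
demo = io.StringIO("\n".join("\t".join(r) for r in demo_rows) + "\n")

candidates = list(filter_rmsk(demo))
print(candidates)
```

On a real download you would pass an open file handle instead of the in-memory demo. For a species like sheep, where little is known, the repeat class and cutoffs would have to be borrowed from better-studied organisms, as noted above.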


Thanks for your answer. So you mean there is no file (in GTF format) that gives the positions of Alu, SINE, LINE, LTR, etc. elements in the genome (even for human)?

I have some SNPs and want to know which of them are in these regions.


You can download that from UCSC. But most Alu/SINE/etc. aren't mobile.
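For the SNP question, one possible approach is sketched below: index the repeat intervals per chromosome and look up each SNP position. It assumes the repeat coordinates are 0-based, half-open (as in UCSC tables and BED files) and that the SNP positions are 1-based (as in VCF). The function names and toy data are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(repeats):
    """Index (chrom, start, end, name) repeats for point lookups.

    Coordinates are assumed 0-based, half-open.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, name in repeats:
        by_chrom[chrom].append((start, end, name))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        starts = [s for s, _, _ in intervals]
        # Running maximum of interval ends, so a backward scan can stop early
        # even when annotations overlap or are nested.
        max_end, running = [], 0
        for _, end, _ in intervals:
            running = max(running, end)
            max_end.append(running)
        index[chrom] = (starts, intervals, max_end)
    return index


def repeats_at(index, chrom, pos_1based):
    """Return the names of all repeats covering a 1-based SNP position."""
    if chrom not in index:
        return []
    starts, intervals, max_end = index[chrom]
    pos = pos_1based - 1  # convert to 0-based
    i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
    hits = []
    while i >= 0 and max_end[i] > pos:
        start, end, name = intervals[i]
        if start <= pos < end:
            hits.append(name)
        i -= 1
    return hits


# Toy data, invented for illustration.
repeats = [
    ("chr1", 100, 200, "AluY"),
    ("chr1", 150, 400, "L1"),
    ("chr2", 0, 50, "MIR"),
]
index = build_index(repeats)
for chrom, pos in [("chr1", 101), ("chr1", 160), ("chr1", 201), ("chr3", 10)]:
    print(chrom, pos, repeats_at(index, chrom, pos))
```

With many SNPs and a whole-genome repeat track, a dedicated interval tool such as bedtools intersect is probably more convenient; the sketch is only meant to show the logic.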


I see. Thanks for taking the time.
