How to subset LTRs from a RepeatMasker file obtained from the UCSC Genome Browser
3.1 years ago

Hi. I have obtained a tabular file with the coordinates of all interspersed repeats from the UCSC Genome Browser, but I'm specifically looking for LTR retrotransposons. How can I subset them?

retrotransposons UCSC • 1.8k views
3.1 years ago
Michael 55k

I am not sure which format you have, but you should try to get a GFF file if possible. Try the following command first and see what comes out: grep -i "ltr" your_file | head (where your_file is a placeholder for the file you downloaded). This assumes you are on Linux/Mac; otherwise, just open the file in a text editor, if possible.


What is the advantage of a GFF file, and in which context? Most of the useful tools work with BED files, most importantly bedtools. IGV, UCSC, etc. can all load BED files.


That is also possible. I am not sure what the OP has anyway; I think a GFF file might contain more annotation information for filtering the data. I am guessing the OP has this file: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.out.gz

That is the original output from RepeatMasker, and it can be parsed even though it is a whitespace-padded format. However, I wanted to make sure that it really is that file, and I would like to know the purpose of the investigation.
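If the file really is RepeatMasker .out output, a minimal Python sketch for pulling out the LTR hits could look like the following. It assumes the usual .out layout (header lines, then whitespace-separated rows with the query sequence in column 5, begin/end in columns 6 and 7, strand in column 9, repeat name in column 10 and class/family in column 11). The function name and the sample rows are made up for illustration and are not real annotations.

```python
def ltr_records_from_rm_out(lines):
    """Yield BED-like tuples for rows whose repeat class is LTR."""
    for line in lines:
        fields = line.split()
        # Header and blank lines are skipped: data rows have at least
        # 15 fields and start with an integer Smith-Waterman score.
        if len(fields) < 15 or not fields[0].isdigit():
            continue
        chrom, start, end = fields[4], int(fields[5]), int(fields[6])
        # RepeatMasker writes "C" for the complement strand.
        strand = "-" if fields[8] == "C" else "+"
        rep_name, class_family = fields[9], fields[10]
        # class/family looks like "LTR/ERV1"; keep only class "LTR".
        # Uncertain calls such as "LTR?" are deliberately not matched.
        if class_family.split("/")[0] == "LTR":
            # .out coordinates are 1-based; BED starts are 0-based.
            yield (chrom, start - 1, end, rep_name, fields[0], strand)


# Illustrative rows only (values are invented for the example).
SAMPLE_OUT = """\
   SW   perc perc perc  query     position in query     matching   repeat        position in repeat
score   div. del. ins.  sequence  begin  end   (left)   repeat     class/family  begin end (left)  ID

  463   1.3  0.6  1.7  chr1  10001  10468 (248945954) +  (TAACCC)n  Simple_repeat  1  471  (0)  1
 1200  20.1  3.0  1.2  chr1  30001  30400 (248926022) C  MLT1A  LTR/ERVL-MaLR  (10)  370  1  2
"""

ltr_hits = list(ltr_records_from_rm_out(SAMPLE_OUT.splitlines()))
for hit in ltr_hits:
    print("\t".join(str(value) for value in hit))
```

For the gzipped download, the same function can be fed a handle opened with gzip.open(path, "rt") from the standard library, and the printed lines can be redirected to a BED file for use with bedtools or IGV.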


Just to update: thanks. I needed to follow this path: UCSC Genome Browser > Table Browser > group: Repeats > track: RepeatMasker > table: rmsk > filter: repname > LTR column
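For a table that has already been downloaded in full, the same filtering can be sketched offline in Python. This is only an illustration under assumptions: it assumes a tab-separated dump of the rmsk table whose header line starts with "#" and which contains a repClass column holding the repeat class. The function name and the sample rows are hypothetical.

```python
import csv
import io


def filter_ltr_rows(handle, class_column="repClass"):
    """Return rows (as dicts) whose class column equals 'LTR'."""
    reader = csv.reader(handle, delimiter="\t")
    # Table Browser dumps prefix the header line with '#'.
    header = [name.lstrip("#") for name in next(reader)]
    class_index = header.index(class_column)
    return [
        dict(zip(header, row))
        for row in reader
        if row and row[class_index] == "LTR"
    ]


# Invented rows with a reduced set of columns, for illustration only.
SAMPLE_TSV = (
    "#genoName\tgenoStart\tgenoEnd\tstrand\trepName\trepClass\trepFamily\n"
    "chr1\t10000\t10468\t+\t(TAACCC)n\tSimple_repeat\tSimple_repeat\n"
    "chr1\t30000\t30400\t-\tMLT1A\tLTR\tERVL-MaLR\n"
)

ltr_rows = filter_ltr_rows(io.StringIO(SAMPLE_TSV))
for row in ltr_rows:
    print(row["genoName"], row["genoStart"], row["genoEnd"], row["repName"])
```

If the download uses a different column name for the class, it can be passed via class_column.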


OK, so does this solve your request?


Yes. Thanks.
