What is this annotation file format?
9.4 years ago
cfarmeri ▴ 210

Hello, ALL!

I got the following annotation file from the UCSC website.

(rmsk.txt in http://hgdownload.soe.ucsc.edu/goldenPath/mm10/database/)

However, I can't identify this file format. I usually use GTF files as annotation files.

607     12955   105     9       10      chr1    3000000 3002128 -192469843     -       L1_Mus3 LINE    L1      -3055   3592    1466    1
607     1216    268     31      105     chr1    3003152 3003994 -192467977     -       L1Md_F  LINE    L1      -5902   617     1       2

If you know what this file format is and how to convert it to GTF, please tell me!

Thank you ALL!

RNA-Seq • 7.1k views

Did you read the README?

To see descriptions of the tables underlying Genome Browser annotation
tracks, select the table in the Table Browser:
  http://genome.ucsc.edu/cgi-bin/hgTables?db=mm10
and click the "describe table schema" button.  There is also a "view
table schema" link on the configuration page for each track.

FYI, according to the README,

Files included in this directory (updated nightly):
  - *.sql files:  the MySQL commands used to create the tables
  - *.txt.gz files: the database tables in a tab-delimited format
    compressed with gzip.
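
Since the README says each table comes with a .sql file holding its CREATE TABLE statement, the column names for a .txt.gz dump can be recovered from that file. Here is a rough sketch, assuming mysqldump-style SQL in which each column definition starts with a backtick-quoted name. The function name `column_names` and the abbreviated sample statement are made up for illustration and are not the real rmsk.sql.

```python
import re


def column_names(sql_text):
    """Return the column names of the first CREATE TABLE statement found.

    Assumes mysqldump-style formatting: one column definition per line,
    each starting with a backtick-quoted name.
    """
    names = []
    in_table = False
    for line in sql_text.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("CREATE TABLE"):
            in_table = True
            continue
        if in_table:
            if stripped.startswith(")"):
                break  # end of the column list
            match = re.match(r"`([^`]+)`", stripped)
            if match:  # KEY/INDEX lines do not start with a backtick
                names.append(match.group(1))
    return names


if __name__ == "__main__":
    # Abbreviated, illustrative statement (not the actual rmsk.sql).
    demo_sql = """CREATE TABLE `demo` (
      `bin` smallint(5) unsigned NOT NULL,
      `genoName` varchar(255) NOT NULL,
      KEY `bin` (`bin`)
    ) ENGINE=MyISAM;"""
    print(column_names(demo_sql))
```

The resulting list can then be used as the header when loading the tab-delimited .txt.gz file.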
9.4 years ago

That's RepeatMasker output, as stored in UCSC's rmsk table. If you want a more useful format, use the Table Browser to get a BED or GTF file (though note that both of these lose some information, since the Table Browser export doesn't include every column).
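
If you would rather convert the file yourself and keep the extra columns, something like the following might work. It is only a sketch under some assumptions: the 17 column names are taken from the UCSC rmsk table schema (check them with "describe table schema" for your assembly), the start coordinate is treated as 0-based as in other UCSC database tables, and the function name `rmsk_to_gtf`, the "exon" feature type, and the attribute layout are arbitrary choices of mine, not an official convention.

```python
# Column names assumed from the UCSC rmsk table schema.
RMSK_COLUMNS = [
    "bin", "swScore", "milliDiv", "milliDel", "milliIns",
    "genoName", "genoStart", "genoEnd", "genoLeft", "strand",
    "repName", "repClass", "repFamily", "repStart", "repEnd",
    "repLeft", "id",
]


def rmsk_to_gtf(lines, source="rmsk", feature="exon"):
    """Yield one GTF line per well-formed rmsk record."""
    for line in lines:
        # split() accepts both the tab-delimited file and space-pasted text
        fields = line.split()
        if len(fields) != len(RMSK_COLUMNS):
            continue  # skip blank or malformed lines
        rec = dict(zip(RMSK_COLUMNS, fields))
        start = int(rec["genoStart"]) + 1  # 0-based start -> 1-based GTF
        end = int(rec["genoEnd"])
        # Hypothetical attribute layout; adjust to what your tools expect.
        attributes = (
            f'gene_id "{rec["repName"]}"; '
            f'transcript_id "{rec["repName"]}_{rec["genoName"]}_{start}"; '
            f'class_id "{rec["repClass"]}"; '
            f'family_id "{rec["repFamily"]}";'
        )
        yield "\t".join([
            rec["genoName"], source, feature, str(start), str(end),
            rec["swScore"], rec["strand"], ".", attributes,
        ])


if __name__ == "__main__":
    sample = [
        "607 12955 105 9 10 chr1 3000000 3002128 -192469843 - "
        "L1_Mus3 LINE L1 -3055 3592 1466 1",
    ]
    for gtf_line in rmsk_to_gtf(sample):
        print(gtf_line)
```

For the real download, the same function can be fed an open file handle, for example `gzip.open("rmsk.txt.gz", "rt")`.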
