http://hgdownload.cse.ucsc.edu/goldenPath/hg38/database/simpleRepeat.txt.gz
Does anyone know how these regions were defined?
I am guessing they come from the Tandem Repeats Finder program, based on this,
but I am not sure.
OP here. This is what I got as a response from UCSC:
Thank you for your question about the source of the simpleRepeats file for the human assembly hg38.
As you noted, the "simpleRepeats" file is produced using the Tandem Repeats Finder (TRF) program. More information about how data for tracks is produced can be found on the schema description that you noted or through the track configuration/description page: https://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg38&g=simpleRepeat.
It is described in the URL you provided (?): https://genome.ucsc.edu/cgi-bin/hgTables?db=hg19&hgta_group=rep&hgta_track=simpleRepeat&hgta_table=simpleRepeat&hgta_doSchema=describe+table+schema
This track displays simple tandem repeats (possibly imperfect repeats) located by Tandem Repeats Finder (TRF) http://tandem.bu.edu/trf/trf.submit.options.html which is specialized for this purpose. These repeats can occur within coding regions of genes and may be quite polymorphic. Repeat expansions are sometimes associated with specific diseases.
Unless I missed it, that page does not unambiguously say that the table it describes is the contents of http://hgdownload.cse.ucsc.edu/goldenPath/hg38/database/simpleRepeat.txt.gz
I strongly suspect it is, but I don't want to tell someone that is the case if it is not.
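One way to check for yourself is to download the file and compare its columns with the schema page. Below is a minimal sketch in Python, assuming the dump is tab-separated with no header row and has the 17 columns listed in COLUMNS (that list is my reading of the schema, so verify it against the schema page). The function names are my own, not anything from UCSC. As far as I understand, TRF-derived rows carry "trf" in the name column, but treat that as an assumption too.

```python
import gzip
from typing import Dict, Iterable, Iterator

# Column order as I understand the simpleRepeat schema (assumption: verify
# against the table schema page before relying on it).
COLUMNS = [
    "bin", "chrom", "chromStart", "chromEnd", "name", "period", "copyNum",
    "consensusSize", "perMatch", "perIndel", "score", "A", "C", "G", "T",
    "entropy", "sequence",
]


def parse_simple_repeat(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per non-empty tab-separated row, keyed by COLUMNS."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if len(fields) != len(COLUMNS):
            # A mismatch means the assumed schema does not fit this file.
            raise ValueError(
                f"expected {len(COLUMNS)} fields, got {len(fields)}"
            )
        yield dict(zip(COLUMNS, fields))


def read_simple_repeat(path: str) -> Iterator[Dict[str, str]]:
    """Read a locally downloaded simpleRepeat.txt.gz file."""
    with gzip.open(path, "rt") as handle:
        yield from parse_simple_repeat(handle)
```

Usage would be something like `next(read_simple_repeat("simpleRepeat.txt.gz"))` on a local copy, then comparing the fields (period, copy number, percent match, consensus sequence) with what TRF reports. A ValueError here would suggest the file does not match the schema I assumed.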
For very specific questions like this, it would be best to email UCSC support (genome at soe.ucsc.edu) directly.
Solid advice. I'll try to come back with an answer regardless.