Hello,
Short answer: http://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/genes/mm10.refGene.gtf.gz
Long answer:
Due to the way the Table Browser forms queries, the Table Browser GTF output repeats the gene_id and transcript_id fields as such:
chr1 mm9_refFlat stop_codon 3206103 3206105 0.000000 - . gene_id "Xkr4"; transcript_id "Xkr4";
This is why we label that output "GTF (limited)". We have a wiki page describing how to do this properly (http://genomewiki.ucsc.edu/index.php/Genes_in_gtf_or_gff_format), which comes down to using a separate utility for the conversion. Another possible source of confusion is that you did not see the same refFlat table available in the Table Browser. This is because in mm10/hg19/hg38, NCBI started releasing coordinates along with their annotation sequences. This means that to get the equivalent of your selection for mm10, you would use the following:
Assembly: mm10
Group: Genes and Gene Predictions
Track: NCBI RefSeq
Table: UCSC RefSeq (refGene)
Output format: GTF (limited)
Like refFlat, these are our own alignments of the NCBI sequences. However, due to the limited output you will not have the gene name (included in refFlat) unless you follow the wiki conversion.
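If you want to check whether a file you already have suffers from the repeated-ID problem, here is a minimal Python sketch. It assumes the usual tab-separated GTF layout with the attributes in the ninth column; the function names are my own for illustration and are not part of any UCSC tool.

```python
import re

# Matches GTF attribute pairs such as: gene_id "Xkr4";
_ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def parse_gtf_attributes(attr_field):
    """Return the key/value pairs of a GTF attribute column as a dict."""
    return dict(_ATTR_RE.findall(attr_field))


def has_repeated_ids(gtf_line):
    """True when gene_id and transcript_id are present and identical."""
    fields = gtf_line.rstrip("\n").split("\t")
    attrs = parse_gtf_attributes(fields[8])
    gene_id = attrs.get("gene_id")
    return gene_id is not None and gene_id == attrs.get("transcript_id")


# The Table Browser line quoted above, rebuilt with tab separators.
limited = "\t".join([
    "chr1", "mm9_refFlat", "stop_codon", "3206103", "3206105",
    "0.000000", "-", ".", 'gene_id "Xkr4"; transcript_id "Xkr4";',
])
print(has_repeated_ids(limited))
```

A line from a properly converted file, where transcript_id holds the accession and gene_id holds the gene name, would return False here.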
We have also begun offering proper GTF files in our downloads directory. Here is the directory for mm10: http://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/genes/
The equivalent file you will want is http://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/genes/mm10.refGene.gtf.gz
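Once you have downloaded that file, a short Python sketch like the one below can confirm that gene names and transcript IDs are now distinct by grouping transcripts under each gene. It assumes a local gzipped, tab-separated GTF; the function name and the example path are placeholders of my own, and the simple split on ";" would need more care if an attribute value itself contained a semicolon.

```python
import gzip
from collections import defaultdict


def transcripts_per_gene(gtf_gz_path):
    """Map each gene_id to the set of transcript_ids seen in a gzipped GTF."""
    genes = defaultdict(set)
    with gzip.open(gtf_gz_path, "rt") as handle:
        for line in handle:
            # Skip comment lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            attr_field = line.rstrip("\n").split("\t")[8]
            attrs = {}
            for item in attr_field.split(";"):
                item = item.strip()
                if not item:
                    continue
                key, _, value = item.partition(" ")
                attrs[key] = value.strip('"')
            if "gene_id" in attrs and "transcript_id" in attrs:
                genes[attrs["gene_id"]].add(attrs["transcript_id"])
    return genes


# Example usage (the path is a placeholder for wherever you saved the file):
# genes = transcripts_per_gene("mm10.refGene.gtf.gz")
# print(len(genes), "genes")
```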
If you have further questions, you can reach us at genome@soe.ucsc.edu. It may take us a little longer to answer questions on Biostars.
Hi Luis, What about the human? Can you share the gtf link for hg19 and hg38?
Yes, we are still in the process of making them available for all of our assemblies.
hg38 GTFs: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/genes/
hg19 GTFs: http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/genes/
And what's the difference between refGene and ncbiRefSeq gtf?
The difference is the dataset they were sourced from. You can read about these different tracks in the description page (http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg38&g=refSeqComposite).
ncbiRefSeq - RefSeq All – all curated and predicted annotations provided by RefSeq.
refGene - UCSC RefSeq – annotations generated from UCSC's realignment of RNAs with NM and NR accessions to the human genome. This track was previously known as the "RefSeq Genes" track.
Essentially, ncbiRefSeq contains all transcripts, including predicted ones. For refGene we pull out only the NM_* and NR_* sequences (mRNA and non-coding RNA) and align them to the genome ourselves using BLAT. See this table for the NCBI prefixes (https://www.ncbi.nlm.nih.gov/books/NBK21091/table/ch18.T.refseq_accession_numbers_and_mole/?report=objectonly). Removing the computationally predicted transcripts cuts the table nearly in half: hg38 refGene has 82,864 items and ncbiRefSeq has 166,923 items. You may also find this similar question helpful: A: RefGene: how to find the starts and ends of genes?
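To make the prefix distinction concrete, here is a small Python sketch that sorts RefSeq accessions into curated and predicted RNA. The NM_/NR_ prefixes come from the explanation above; XM_ and XR_ are the model (predicted) mRNA and non-coding RNA prefixes from the NCBI table linked there. The function name, the category labels, and the accessions in the example are made up for illustration, and this is not how the UCSC tracks are actually built.

```python
CURATED_RNA_PREFIXES = ("NM_", "NR_")     # curated mRNA / non-coding RNA
PREDICTED_RNA_PREFIXES = ("XM_", "XR_")   # predicted (model) mRNA / non-coding RNA


def classify_accession(accession):
    """Label a RefSeq accession by its prefix."""
    if accession.startswith(CURATED_RNA_PREFIXES):
        return "curated"
    if accession.startswith(PREDICTED_RNA_PREFIXES):
        return "predicted"
    return "other"


# Hypothetical accessions, for illustration only.
examples = ["NM_000001.1", "NR_000002.1", "XM_000003.1", "XR_000004.1"]
kept = [acc for acc in examples if classify_accession(acc) == "curated"]
print(kept)
```

Filtering an ncbiRefSeq GTF this way gives only a rough idea of which transcripts refGene would cover, since the refGene coordinates come from UCSC's own BLAT alignments rather than from NCBI.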