RefSeqGene Directly To GTF?
10.9 years ago
Gabe Rudy ▴ 320

UCSC says they take nightly dumps of the mRNA sequences from RefSeqGene and BLAT them against their assemblies to generate the RefSeqGenes track.

NCBI does have a mappings file on their FTP site: ftp://ftp.ncbi.nih.gov/refseq/H_sapiens/RefSeqGene/GCF_000001405.25_refseqgene_alignments.gff3

But this only provides a start/stop position along the assembly, not the CDS start/stop and exon boundaries.

Is there any way to take NCBI's FTP data and generate a proper GTF file, or is there another location on NCBI that has this data in one form or another?

gtf ncbi genes • 5.6k views
10.5 years ago
Gabe Rudy ▴ 320

I finally got an answer from Deanna Church on this in response to a blog post we wrote about variant annotation.

Here are the GRCh37 GFF mappings of RefSeqGenes.

GFF3 can be a bit tricky to parse and get to the same fields as GTF, but all the data is there.
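To illustrate what getting from GFF3 to GTF-style fields involves, here is a minimal Python sketch. It assumes the file follows the generic GFF3 convention in which exon and CDS rows point to their transcript through a `Parent` attribute, and the transcript points to its gene the same way. The function names `parse_attributes` and `gff3_to_gtf` are made up for this example, and the attribute layout of the actual NCBI file may differ, so treat it as a starting point rather than a finished converter.

```python
from urllib.parse import unquote


def parse_attributes(column9):
    """Parse a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            # GFF3 URL-escapes reserved characters in attribute values.
            attrs[key.strip()] = unquote(value.strip())
    return attrs


def gff3_to_gtf(lines, keep_types=("exon", "CDS")):
    """Yield GTF lines for the exon/CDS features found in GFF3 lines.

    Assumption: transcript and gene IDs are recovered by following
    Parent links (exon -> transcript -> gene).
    """
    records = []
    parent_of = {}  # feature ID -> ID of its (first) parent

    # First pass: read everything, so children may appear before parents.
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a feature row
        attrs = parse_attributes(cols[8])
        if "ID" in attrs and "Parent" in attrs:
            parent_of[attrs["ID"]] = attrs["Parent"].split(",")[0]
        records.append((cols, attrs))

    # Second pass: emit one GTF row per (feature, parent transcript) pair.
    for cols, attrs in records:
        if cols[2] not in keep_types or "Parent" not in attrs:
            continue
        for transcript_id in attrs["Parent"].split(","):
            # Fall back to the transcript ID if no gene-level parent exists.
            gene_id = parent_of.get(transcript_id, transcript_id)
            gtf_attrs = 'gene_id "%s"; transcript_id "%s";' % (
                gene_id, transcript_id)
            yield "\t".join(cols[:8] + [gtf_attrs])
```

Called as `for row in gff3_to_gtf(open("mappings.gff3")): print(row)` (the file name is a placeholder), this prints one GTF row per exon or CDS. The first eight columns are copied unchanged because GFF3 and GTF share them; only the ninth column is rewritten.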

Note that microRNAs are also in there.
