RefSeqGene Directly to GTF?
UCSC says they take nightly dumps of the mRNA sequences from RefSeqGene and BLAT them against their assemblies to generate the RefSeqGenes track.
NCBI does have a mappings file on their FTP site: ftp://ftp.ncbi.nih.gov/refseq/H_sapiens/RefSeqGene/GCF_000001405.25_refseqgene_alignments.gff3
But this only provides a start/stop position along the assembly, not the CDS start/stop or the exon boundaries.
Is there any way to generate a proper GTF file from NCBI's FTP data, or is there another location on NCBI that has this data in one form or another?
gtf
ncbi
genes
written 10.9 years ago by Gabe Rudy • updated 3.2 years ago by Ram
I finally got an answer from Deanna Church on this in response to a blog post we wrote about variant annotation.
Here are the GRCh37 GFF mappings of RefSeqGenes.
GFF3 can be a bit tricky to parse and get to the same fields as GTF, but all the data is there.
Note that microRNAs are in there too.
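For anyone who wants GTF-style fields out of a GFF3 file, here is a minimal Python sketch of one way to do it. It is not NCBI's or UCSC's method. It assumes the file has exon and CDS rows linked to transcripts, and transcripts linked to genes, through the standard GFF3 ID/Parent attributes; I have not checked that the NCBI file is laid out this way. The function names and the sample records (chr1, demo, gene1, rna1) are made up for illustration.

```python
from urllib.parse import unquote


def parse_attributes(column9):
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            # GFF3 percent-encodes reserved characters inside values.
            attrs[key.strip()] = unquote(value.strip())
    return attrs


def gff3_to_gtf(lines):
    """Return GTF lines for the exon and CDS features found in GFF3 lines."""
    records = []
    parent_of = {}  # feature ID -> first parent ID (e.g. transcript -> gene)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a feature row
        attrs = parse_attributes(cols[8])
        if "ID" in attrs and "Parent" in attrs:
            parent_of[attrs["ID"]] = attrs["Parent"].split(",")[0]
        records.append((cols, attrs))

    gtf = []
    for cols, attrs in records:
        if cols[2] not in ("exon", "CDS") or "Parent" not in attrs:
            continue
        # A feature may list several parents; emit one GTF row per transcript.
        for transcript_id in attrs["Parent"].split(","):
            # Fall back to the transcript ID when no gene-level parent is known.
            gene_id = parent_of.get(transcript_id, transcript_id)
            group = 'gene_id "%s"; transcript_id "%s";' % (gene_id, transcript_id)
            gtf.append("\t".join(cols[:8] + [group]))
    return gtf


# Hypothetical records, only to show the expected input shape.
example = [
    "##gff-version 3",
    "chr1\tdemo\tgene\t100\t900\t.\t+\t.\tID=gene1;Name=DEMO1",
    "chr1\tdemo\tmRNA\t100\t900\t.\t+\t.\tID=rna1;Parent=gene1",
    "chr1\tdemo\texon\t100\t300\t.\t+\t.\tID=exon1;Parent=rna1",
    "chr1\tdemo\tCDS\t150\t300\t.\t+\t0\tID=cds1;Parent=rna1",
]

for row in gff3_to_gtf(example):
    print(row)
```

One caveat: GFF3 CDS features normally include the stop codon while GTF CDS features exclude it, and this sketch copies the coordinates unchanged, so a strict GTF consumer may need the last CDS segment trimmed.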
written 10.5 years ago by Gabe Rudy • updated 3.2 years ago by Ram