Problem with locating GTF file
7.0 years ago
elisheva ▴ 120

Hi everyone!
I have this expression file (UCSC genes, hg19):

tracking_id FPKM    FPKM    0   FPKM_status FPKM_status
DDX11L1 0.0220335   0.014392    0.0182127   OK  OK
WASH7P  0.325992    0.242878    0.284435    OK  OK
MIR6859-1   0   0   0   OK  OK
FAM138A 0.00576753  0.00565091  0.00570922  OK  OK
OR4F5   0   0   0   OK  OK
LOC729737   0.11037 0.134682    0.122526    OK  OK
LOC100132287    0   0   0   OK  OK
LOC100133331    0   0   0   OK  OK
OR4F29  0   0   0   OK  OK
MIR6723 0   0   0   OK  OK
OR4F29  0   0   0   OK  OK

I tried to get the sequences of all these genes, but couldn't find them.
I downloaded a GTF file of UCSC genes from the Table Browser.
But the gene names are different, and there are no "gene" features in the file at all.
Only: CDS, exon, start_codon, and stop_codon.
The GTF file I got looks like:

chr1    hg19_knownGene  exon    11874   12227   0.000000    +   .   gene_id "uc001aaa.3"; transcript_id "uc001aaa.3"; 
chr1    hg19_knownGene  exon    12613   12721   0.000000    +   .   gene_id "uc001aaa.3"; transcript_id "uc001aaa.3"; 
chr1    hg19_knownGene  exon    13221   14409   0.000000    +   .   gene_id "uc001aaa.3"; transcript_id "uc001aaa.3";

Does anybody know where I can find a proper file?

gtf ucsc gene • 1.6k views
7.0 years ago
Tm ★ 1.1k

Hello,

Looking at the expression file, it seems you obtained it from a reference-based transcriptome pipeline such as TopHat-Cufflinks. Your expression file also has gene symbols in the tracking_id column, which means you supplied a genome annotation file (.GTF) along with the reference genome FASTA file during mapping.

So why not use the annotation file (.GTF) from the mapping step to extract the gene coordinates (based on tracking_id), and then use those coordinates to fetch the gene sequences from the reference genome with bedtools?
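
The coordinate-extraction step could be sketched roughly as below. This is only a minimal illustration, not a tested pipeline: it assumes the annotation is tab-separated GTF, that the gene symbol sits in an attribute such as `gene_id` or `gene_name` (which one depends on your annotation file), and that collapsing all exons of a symbol on one chromosome and strand into a single span is acceptable. The function names `gene_spans` and `to_bed` and the sample records are made up for the example.

```python
import re

# Matches attribute pairs such as: gene_id "DDX11L1";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def gene_spans(gtf_lines, wanted, attr="gene_name"):
    """Collapse exon records into one span per (gene, chrom, strand).

    `wanted` is the set of tracking IDs from the expression file.
    `attr` is the GTF attribute holding those IDs (an assumption;
    check which attribute your annotation actually uses).
    """
    spans = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        gene = attrs.get(attr)
        if gene not in wanted:
            continue
        chrom, strand = fields[0], fields[6]
        # GTF is 1-based inclusive; BED is 0-based half-open.
        start, end = int(fields[3]) - 1, int(fields[4])
        key = (gene, chrom, strand)
        if key in spans:
            old_start, old_end = spans[key]
            spans[key] = (min(old_start, start), max(old_end, end))
        else:
            spans[key] = (start, end)
    return spans


def to_bed(spans):
    """Format the spans as 6-column BED rows, sorted by chrom and start."""
    ordered = sorted(spans.items(),
                     key=lambda kv: (kv[0][1], kv[1][0], kv[0][0]))
    return ["\t".join([chrom, str(start), str(end), gene, "0", strand])
            for (gene, chrom, strand), (start, end) in ordered]


# Hypothetical records: coordinates taken from the question,
# gene symbol placed in gene_id for illustration only.
sample = [
    'chr1\tdemo\texon\t11874\t12227\t.\t+\t.\tgene_id "DDX11L1"; transcript_id "uc001aaa.3";',
    'chr1\tdemo\texon\t12613\t12721\t.\t+\t.\tgene_id "DDX11L1"; transcript_id "uc001aaa.3";',
    'chr1\tdemo\texon\t13221\t14409\t.\t+\t.\tgene_id "DDX11L1"; transcript_id "uc001aaa.3";',
]

for row in to_bed(gene_spans(sample, {"DDX11L1"}, attr="gene_id")):
    print(row)
```

If the rows are saved to a file, something like `bedtools getfasta -fi hg19.fa -bed genes.bed -s -name -fo genes.fa` should then pull the sequences (the file names here are placeholders). Note that symbols appearing more than once on the same chromosome and strand, which may be the case for OR4F29 in your table, would be merged into one long span by this sketch and need separate handling.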

7.0 years ago
GenoMax 148k

You can generate an annotation file in GTF format for the UCSC hg19 genome by following the directions in this post.
