gtf file for canFam2 genome version
I'm trying to find a GTF file for this version of the canine genome. I looked at both NCBI and UCSC, but I could not find one.
When I try to download from NCBI here, I don't see an option to download a GTF file:
https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000002285.2/
Normally UCSC has a folder called genes, as in the case of hg19 (https://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/genes/),
which contains the GTF files, but that folder is not present for canFam2 at UCSC:
https://hgdownload.cse.ucsc.edu/goldenPath/canFam2/bigZips/
Is there a way to get an already-built GTF for this same version, canFam2, from either NCBI or UCSC?
It would be helpful to know whether such a download is possible.
gtffile
I wrote https://jvarkit.readthedocs.io/en/latest/KgToGff/
It was just a one-shot and I haven't used it much, so please check the results.
$ wget -qO - "https://hgdownload.cse.ucsc.edu/goldenPath/canFam2/database/ensGene.txt.gz" | gunzip -c | \
java -jar dist/jvarkit.jar kg2gff --gtf | head -n 20
chr29 ucsc gene 31843577 31869014 . + . ID "GENE2" ; Name "ENSCAFG00000008510" ; biotype "protein_coding" ; gene_id "GENE2" ; gene_name "ENSCAFG00000008510" ; gene_type "protein_coding" ;
chr29 ucsc transcript 31843577 31869014 . + . ID "ENSCAFT00000013501.3" ; Parent "GENE2" ; Name "ENSCAFT00000013501.3" ; biotype "protein_coding" ; gene_id "GENE2" ; gene_name "ENSCAFG00000008510" ; transcript_id "ENSCAFT00000013501.3" ; transcript_name "ENSCAFT00000013501" ;
chr29 ucsc exon 31843577 31843766 . + . ID "ENSCAFT00000013501%3AE0" ; Parent "ENSCAFT00000013501.3" ; Name "ENSCAFT00000013501" ; biotype "protein_coding" ; gene_id "GENE2" ; gene_name "ENSCAFG00000008510" ; transcript_id "ENSCAFT00000013501.3" ; exon_id "ENSCAFT00000013501%3AE0" ;
chr29 ucsc exon 31862157 31862334 . + . ID "ENSCAFT00000013501%3AE1" ; Parent "ENSCAFT00000013501.3" ; Name "ENSCAFT00000013501" ; biotype "protein_coding" ; gene_id "GENE2" ; gene_name "ENSCAFG00000008510" ; transcript_id "ENSCAFT00000013501.3" ; exon_id "ENSCAFT00000013501%3AE1" ;
chr29 ucsc exon 31865271 31865385 . + . ID "ENSCAFT00000013501%3AE2" ; Parent "ENSCAFT00000013501.3" ; Name "ENSCAFT00000013501" ; biotype "protein_coding" ; gene_id "GENE2" ; gene_name "ENSCAFG00000008510" ; transcript_id "ENSCAFT00000013501.3" ; exon_id "ENSCAFT00000013501%3AE2" ;
chr29 ucsc exon 31868495 31868660 . + . ID "ENSCAFT00000013501%3AE3" ; Parent "ENSCAFT00000013501.3" ; Name "ENSCAFT00000013501" ; biotype "protein_coding" ; gene_id "GENE2" ; gene_name "ENSCAFG00000008510" ; transcript_id "ENSCAFT00000013501.3" ; exon_id "ENSCAFT00000013501%3AE3" ;
chr29 ucsc exon 31868849 31869014 . + . ID "ENSCAFT00000013501%3AE4" ; Parent "ENSCAFT00000013501.3" ; Name "ENSCAFT00000013501" ; biotype "protein_coding" ; gene_id "GENE2" ; gene_name "ENSCAFG00000008510" ; transcript_id "ENSCAFT00000013501.3" ; exon_id "ENSCAFT00000013501%3AE4" ;
chr29 ucsc CDS 31843577 31843766 . + 0 ID "CDS4" ; Parent "ENSCAFT00000013501.3" ; biotype "protein_coding" ; gene_id "GENE2" ; gene_name "ENSCAFG00000008510" ; transcript_id "ENSCAFT00000013501.3" ;
chr29 ucsc CDS 31862157 31862334 . + 2 ID "CDS5" ; Parent "ENSCAFT00000013501.3" ; biotype "protein_coding" ; gene_id "GENE2" ; gene_name "ENSCAFG00000008510" ; transcript_id "ENSCAFT00000013501.3" ;
chr29 ucsc CDS 31865271 31865385 . + 1 ID "CDS6" ; Parent "ENSCAFT00000013501.3" ; biotype "protein_coding" ; gene_id "GENE2" ; gene_name "ENSCAFG00000008510" ; transcript_id "ENSCAFT00000013501.3" ;
chr29 ucsc CDS 31868495 31868660 . + 0 ID "CDS7" ; Parent "ENSCAFT00000013501.3" ; biotype "protein_coding" ; gene_id "GENE2" ; gene_name "ENSCAFG00000008510" ; transcript_id "ENSCAFT00000013501.3" ;
chr29 ucsc CDS 31868849 31868910 . + 2 ID "CDS8" ; Parent "ENSCAFT00000013501.3" ; biotype "protein_coding" ; gene_id "GENE2" ; gene_name "ENSCAFG00000008510" ; transcript_id "ENSCAFT00000013501.3" ;
chr29 ucsc three_prime_utr 31868911 31869014 . + . ID "UTR9" ; Parent "ENSCAFT00000013501.3" ; biotype "protein_coding" ; gene_id "GENE2" ; gene_name "ENSCAFG00000008510" ; transcript_id "ENSCAFT00000013501.3" ;
chr29 ucsc start_codon 31843577 31843579 . + . ID "codon10" ; Parent "ENSCAFT00000013501.3" ; biotype "protein_coding" ; gene_id "GENE2" ; gene_name "ENSCAFG00000008510" ; transcript_id "ENSCAFT00000013501.3" ;
chr29 ucsc stop_codon 31868908 31868910 . + . ID "codon11" ; Parent "ENSCAFT00000013501.3" ; biotype "protein_coding" ; gene_id "GENE2" ; gene_name "ENSCAFG00000008510" ; transcript_id "ENSCAFT00000013501.3" ;
chr3 ucsc gene 72230308 72416756 . + . ID "GENE13" ; Name "ENSCAFG00000015634" ; biotype "protein_coding" ; gene_id "GENE13" ; gene_name "ENSCAFG00000015634" ; gene_type "protein_coding" ;
chr3 ucsc transcript 72230308 72416756 . + . ID "ENSCAFT00000024802.14" ; Parent "GENE13" ; Name "ENSCAFT00000024802.14" ; biotype "protein_coding" ; gene_id "GENE13" ; gene_name "ENSCAFG00000015634" ; transcript_id "ENSCAFT00000024802.14" ; transcript_name "ENSCAFT00000024802" ;
chr3 ucsc exon 72230308 72230403 . + . ID "ENSCAFT00000024802%3AE0" ; Parent "ENSCAFT00000024802.14" ; Name "ENSCAFT00000024802" ; biotype "protein_coding" ; gene_id "GENE13" ; gene_name "ENSCAFG00000015634" ; transcript_id "ENSCAFT00000024802.14" ; exon_id "ENSCAFT00000024802%3AE0" ;
chr3 ucsc exon 72257459 72257619 . + . ID "ENSCAFT00000024802%3AE1" ; Parent "ENSCAFT00000024802.14" ; Name "ENSCAFT00000024802" ; biotype "protein_coding" ; gene_id "GENE13" ; gene_name "ENSCAFG00000015634" ; transcript_id "ENSCAFT00000024802.14" ; exon_id "ENSCAFT00000024802%3AE1" ;
chr3 ucsc exon 72272584 72272708 . + . ID "ENSCAFT00000024802%3AE2" ; Parent "ENSCAFT00000024802.14" ; Name "ENSCAFT00000024802" ; biotype "protein_coding" ; gene_id "GENE13" ; gene_name "ENSCAFG00000015634" ; transcript_id "ENSCAFT00000024802.14" ; exon_id "ENSCAFT00000024802%3AE2" ;
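Since the answer asks for the results to be checked, one way to cross-check the exon coordinates above is to recompute them straight from the same ensGene table. This is only a minimal sketch, not what kg2gff does internally. It assumes the usual UCSC layout for ensGene.txt (a leading bin column followed by genePred fields: name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, score, name2, ...). The function name `genepred_exons_to_gtf`, the source label `ensGene`, and the demo row with its `_DEMO` identifiers are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: print one GTF exon line per exon from ensGene.txt rows read on stdin.
# Assumed columns: 1 bin, 2 name, 3 chrom, 4 strand, 9 exonCount,
# 10 exonStarts, 11 exonEnds, 13 name2 (gene identifier).
genepred_exons_to_gtf() {
  awk -F'\t' '{
    # exonStarts/exonEnds are comma-separated lists with a trailing comma.
    split($10, starts, ",")
    split($11, ends, ",")
    for (i = 1; i <= $9; i++) {
      # genePred starts are 0-based, GTF starts are 1-based; ends are unchanged.
      printf "%s\tensGene\texon\t%d\t%d\t.\t%s\t.\tgene_id \"%s\"; transcript_id \"%s\";\n",
             $3, starts[i] + 1, ends[i], $4, $13, $2
    }
  }'
}

# Hypothetical two-exon row (not taken from the real table), just to show the mapping.
printf '585\tENSCAFT_DEMO\tchr29\t+\t99\t400\t99\t400\t2\t99,299,\t200,400,\t0\tENSCAFG_DEMO\tcmpl\tcmpl\t0,0,\n' \
  | genepred_exons_to_gtf
```

For the demo row this gives exons 100-200 and 300-400. On the real file, something like `gunzip -c ensGene.txt.gz | genepred_exons_to_gtf | head` should give exon start and end positions that can be compared with the exon lines in the kg2gff output. Note that this sketch writes the Ensembl gene ID as `gene_id`, whereas the output above uses generated IDs such as "GENE2" and keeps the Ensembl ID in `gene_name`.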
Are you specifically looking for canFam2? Asking because newer versions are available now: https://www.ncbi.nlm.nih.gov/datasets/genome/?taxon=9612
Yes, I'm looking for canFam2 only. I have to use it for some specific cases, even though I also have the newer version.