7 weeks ago
Muhammad
Dear all, I want to do some comparative genomics analysis using cattle and buffalo genome GTF and GFF files. Can somebody share the URL for them? I searched NCBI, but I found that the RefSeq files for all three reference genomes in NCBI use contig/assembly accession numbers as sequence names, something like this:
NC_037328.1 RefSeq region 1 158534110 . + . ID=NC_037328.1:1..158534110;Dbxref=taxon:9913;Name=1;breed=Hereford;chromosome=1;gbkey=Src;genome=chromosome;isolate=L1 Dominette 01449 registration number 42190680;mol_type=genomic DNA;sex=female;tissue-type=left lung
NC_037328.1 Gnomon pseudogene 207933 217580 . - . ID=gene-LOC112
But I want GTF files with chromosome numbers, something like this:
chr1 ncbiRefSeq transcript 210759 214966 . - . gene_id "LOC112447072"; transcript_id "XR_003035142.1"; gene_name "LOC112447072";
chr1 ncbiRefSeq exon 210759 212235 . - . gene_id "LOC112447072"; transcript_id "XR_003035142.1"; exon_number "1"; exon_id "XR_003035142.1.1"; gene_name "LOC112447072";
chr1 ncbiRefSeq exon 212941 213154 . - . gene_id "LOC112447072"; transcript_id "XR_003035142.1"; exon_number "2"; exon_id "XR_003035142.1.2"; gene_name "LOC112447072";
chr1 ncbiRefSeq exon 214935 214966 . - . gene_id "LOC112447072"; transcript_id "XR_003035142.1"; exon_number "3"; exon_id "XR_003035142.1.3"; gene_name "LOC112447072";
chr1 ncbiRefSeq transcript 217517 257046 . - . gene_id "LOC101903639"; transcript_id "XR_003035135.1"; gene_name "LOC101903639";
chr1 ncbiRefSeq exon 217517 219285 . - . gene_id "LOC101903639"; transcript_id "XR_003035135.1"; exon_number "1"; exon_id "XR_003035135.1.1"; gene_name "LOC101903639";
chr1 ncbiRefSeq exon 229250 229332 . -
Where can I find this? Please share an NCBI URL or another URL, or some solution for it. Also, an additional question: if I want to work with a genome whose file doesn't exist, how can I find or make it?
One solution is to find and use tables to convert the RefSeq identifiers to standard chromosome numbers. For example, there's a table here you could use to convert the identifiers in your GTF file to standard chromosome names prefixed with “chr”:
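Once a conversion table is in hand, the renaming itself is a small scripting job. Below is a minimal sketch in Python, assuming the table has been saved as a two-column, tab-separated file (RefSeq accession in the first column, desired name such as "chr1" in the second) and that the annotation file is tab-delimited as GFF/GTF normally is. The function names `load_mapping` and `rename_chromosomes` and the file names are hypothetical, chosen only for illustration; a real table may need its columns rearranged first.

```python
import csv


def load_mapping(handle):
    """Read a two-column TSV (accession, new name) into a dict.

    Assumed layout, not an official NCBI format: column 1 is the RefSeq
    accession (e.g. NC_037328.1), column 2 is the name to use (e.g. chr1).
    """
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines, comment lines, and rows without two columns.
        if len(row) < 2 or row[0].startswith("#"):
            continue
        mapping[row[0]] = row[1]
    return mapping


def rename_chromosomes(lines, mapping):
    """Yield GFF/GTF lines with the first column renamed via `mapping`.

    Header/comment lines and sequences missing from the table are
    passed through unchanged.
    """
    for line in lines:
        if line.startswith("#") or not line.strip():
            yield line
            continue
        seqid, sep, rest = line.partition("\t")
        yield mapping.get(seqid, seqid) + sep + rest


if __name__ == "__main__":
    # Hypothetical file names; replace with your own paths.
    with open("accession_to_chr.tsv", newline="") as table:
        mapping = load_mapping(table)
    with open("annotation.gff") as src, open("annotation.chr.gff", "w") as dst:
        dst.writelines(rename_chromosomes(src, mapping))
```

Note that this only changes the sequence name in the first column. Attributes in the ninth column that embed the accession (for example `ID=NC_037328.1:1..158534110`) are left as they are, and the script does not convert GFF to GTF.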
I’m not entirely sure what you’re asking. If a genome file doesn’t exist (for example, if it’s for an organism that hasn’t been sequenced), then there won’t be a way for you to find or generate it.
Also, it looks like this bison release only includes scaffolds and lacks assembled chromosomes (except for the mitochondrial chromosome).