How do I create a GFF(3) file from a GenBank file?
1
1
Entering edit mode
3.6 years ago
torkel.loman ▴ 10

Hello

Background: I'm trying to analyse an RNA-seq experiment on Bacillus subtilis PY79. As part of that, I need to create an Ensembl-style annotation database (EnsDb) using ensembldb (https://bioconductor.org/packages/release/bioc/html/ensembldb.html). For this I need a GFF file of the genome. I tried downloading one from NCBI, but I get an error because that GFF file lacks "gene_id". Since I cannot find any other GFF file for that strain, I am now trying to generate one from the GenBank (gb) file.

The Problem: I have a GenBank (gb) file that I downloaded from NCBI (https://www.ncbi.nlm.nih.gov/nuccore/NC_022898.1?report=genbank then Send to -> File -> GenBank (full)). I want to convert it to a GFF3 file. I have tried several approaches, but none succeeded.

What I've Tried


The GFF file for this strain does have a gene identifier. You should be able to use it for counting with featureCounts. This is bacterial RNA-seq, so things are simpler. Align with the aligner of your choice and then run featureCounts with the -g gene option. If you choose this file, be sure to get the corresponding genome FASTA file to build your indexes. That way all identifiers will match.
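To decide which attribute name to pass to -g, it can help to list the attribute keys that actually occur in column 9 of the GFF file. The following is a minimal sketch using only the Python standard library. The function name attribute_keys_by_type and the two example records (including the EXAMPLE_0001 locus tag) are made up for illustration and are not taken from the real NC_022898.1 annotation.

```python
from collections import Counter, defaultdict


def attribute_keys_by_type(gff_lines):
    """Count which attribute keys occur for each feature type (column 3)."""
    keys = defaultdict(Counter)
    for line in gff_lines:
        line = line.rstrip("\n")
        # Skip blank lines, directives and comments.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # A GFF3 feature line has exactly nine tab-separated columns.
        if len(cols) != 9:
            continue
        feature_type, attributes = cols[2], cols[8]
        for pair in attributes.split(";"):
            if "=" in pair:
                key, _value = pair.split("=", 1)
                keys[feature_type][key.strip()] += 1
    return keys


# Hypothetical records, for illustration only.
example = [
    "##gff-version 3",
    "seq1\tsrc\tgene\t410\t1750\t.\t+\t.\tID=gene0;gene=dnaA;locus_tag=EXAMPLE_0001",
    "seq1\tsrc\tCDS\t410\t1750\t.\t+\t0\tID=cds0;Parent=gene0;gene=dnaA",
]

for feature_type, counts in attribute_keys_by_type(example).items():
    print(feature_type, dict(counts))
```

For a real file, pass an open file handle instead of the example list. If "gene" shows up for the feature type you are counting on, -g gene should find it. If only something like "locus_tag" is present, that key would be the one to try instead.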

3.6 years ago

This use case is one of the many for which I wrote the bio package. Get the file:

bio fetch NC_022898

Convert the GenBank file to GFF like so:

bio convert NC_022898 --gff > annotations.gff

See more here:

https://www.bioinfo.help/bio-gff.html

Note how much nicer the gene models made with bio are (they are also fully compatible with featureCounts).

Disclaimer: the package is still under heavy development and has not been sufficiently tested.
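If a downstream tool still complains about a missing "gene_id" attribute (the error described in the question), one possible workaround is to copy an existing identifier into a gene_id attribute. This is only a sketch: the helper name add_gene_id, the choice of locus_tag as the source key, and the example record are assumptions for illustration, and I have not verified that this alone is enough for ensembldb.

```python
def add_gene_id(gff_line, source_key="locus_tag"):
    """Append gene_id=<value of source_key> to a GFF3 feature line.

    The line is returned unchanged if it is a comment, is not a
    nine-column feature line, already has gene_id, or lacks source_key.
    """
    line = gff_line.rstrip("\n")
    if not line or line.startswith("#"):
        return line
    cols = line.split("\t")
    if len(cols) != 9:
        return line
    # Parse column 9 into a key -> value mapping.
    attrs = dict(
        pair.split("=", 1) for pair in cols[8].split(";") if "=" in pair
    )
    if "gene_id" in attrs or source_key not in attrs:
        return line
    cols[8] = cols[8].rstrip(";") + ";gene_id=" + attrs[source_key]
    return "\t".join(cols)


# Hypothetical record, for illustration only.
record = "seq1\tsrc\tgene\t1\t90\t.\t+\t.\tID=gene0;locus_tag=EXAMPLE_0001"
print(add_gene_id(record))
```

Applying the function to every line of annotations.gff and writing the result to a new file leaves the original untouched, so it is easy to compare the two.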


Istvan Albert, I tried to use bio gff to convert a GenBank file into a GFF, and it produced this error: TypeError: '>' not supported between instances of 'NoneType' and 'int'

The GenBank accession number is EF489041.1. Thanks!


Open an issue or a discussion on the source code website so that this can be fixed there:

https://github.com/ialbert/bio/discussions
