How to convert GenBank (.gbk) files to Roary-compatible GFF3 format?
4.2 years ago
Kumar ▴ 120

I would like to carry out a pan-genome analysis using Roary, so I need to convert GenBank (.gbk) files to Roary-compatible GFF/GFF3 files, which must have the FASTA sequence at the end of each GFF file (see the example file for the Roary GFF format). I am aware that Prokka can generate Roary-compatible GFF3 files, but I have a large number of datasets downloaded from NCBI, so I would like to run the pan-genome analysis with the NCBI annotations. Could anyone please suggest a tool for this? Thanks in advance.

linux bash python perl • 3.3k views

Roary homepage says to use a perl script...

On NCBI's website, GFF3 files only contain annotation and not the nucleotide sequence so cannot be used. You need to download the GenBank files plus nucleotide sequence and convert them. When downloading, click on the show sequence option, Update View then Send to a File of type GenBank. You can then use the Bio::Perl script bp_genbank2gff3.pl to convert to GFF3. Just be aware that mixing different gene prediction methods and annotation pipelines can give noisier results. Alternatively, you can use ncbi-genome-download to pull down the FASTA files and convert them to GFF3 with Prokka.

I prefer the prokka way...
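If you do want to stay with the NCBI annotation, the layout Roary asks for (annotation lines first, then a `##FASTA` line, then the nucleotide sequences) can be sketched in a few lines of plain Python. This is only a rough illustration, not something from the Roary docs: the function names `gff3_with_fasta` and `missing_sequences` are made up here, and I have not tested the output against Roary, which may still reject an NCBI GFF3 whose feature attributes differ from what Prokka writes.

```python
import sys
from pathlib import Path


def gff3_with_fasta(gff_text: str, fasta_text: str) -> str:
    """Return GFF3 text with the sequences appended after a ##FASTA line.

    Hypothetical helper for illustration; it only arranges the file layout
    and does not rewrite any feature attributes.
    """
    kept = []
    for line in gff_text.splitlines():
        # Stop at an existing ##FASTA directive so we never add a second one.
        if line.strip() == "##FASTA":
            break
        kept.append(line)
    # Drop trailing blank lines before the sequence section.
    while kept and not kept[-1].strip():
        kept.pop()
    # GFF3 files are expected to start with a version directive.
    if not kept or not kept[0].startswith("##gff-version"):
        kept.insert(0, "##gff-version 3")
    fasta_lines = [ln for ln in fasta_text.splitlines() if ln.strip()]
    if not fasta_lines or not fasta_lines[0].startswith(">"):
        raise ValueError("FASTA text must start with a '>' header line")
    return "\n".join(kept + ["##FASTA"] + fasta_lines) + "\n"


def missing_sequences(gff_text: str, fasta_text: str) -> set:
    """Sequence IDs used in GFF column 1 that have no FASTA record."""
    gff_ids = {
        line.split("\t")[0]
        for line in gff_text.splitlines()
        if line.strip() and not line.startswith("#")
    }
    fasta_ids = {
        line[1:].split()[0]
        for line in fasta_text.splitlines()
        if line.startswith(">") and line[1:].strip()
    }
    return gff_ids - fasta_ids


if __name__ == "__main__" and len(sys.argv) == 4:
    # Assumed usage: python merge_gff.py genome.gff genome.fna combined.gff
    gff = Path(sys.argv[1]).read_text()
    fasta = Path(sys.argv[2]).read_text()
    missing = missing_sequences(gff, fasta)
    if missing:
        sys.exit(f"No FASTA record for: {sorted(missing)}")
    Path(sys.argv[3]).write_text(gff3_with_fasta(gff, fasta))
```

The ID check is there because the sequence names in column 1 of the GFF3 have to match the FASTA headers; if the two downloads use different accessions, the merged file will not be usable.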


See if this helps...
