Question

Using eggNOG-mapper to categorize genes by orthologous group (OG), without gene prediction

5.5 years ago

Hansen_869 ▴ 80

Hi! I recently used Prokka on a set of bins to annotate their genes. To make the dataset of annotated genes more manageable, I'd like to categorize all the genes according to their function. I haven't yet decided how narrow the categorization should be. I have been told that eggNOG-mapper should be able to do this. However, when I read about EggNOG, it looks like it also annotates genes. As that would be double work, I'd like to skip that function and only categorize the already annotated genes. Is this possible? Or am I better off not running Prokka and just using EggNOG? (Maybe I'm wrong, but I have had a hard time interpreting their GitHub page.)

Thanks!

eggnog prokka annotation • 5.9k views

ADD COMMENT • link updated 5.5 years ago by Carambakaracho ★ 3.3k • written 5.5 years ago by Hansen_869 ▴ 80

score 0 · Answer 1 · 2019-10-29

5.5 years ago

Carambakaracho ★ 3.3k

eggNOG-mapper does no ab initio gene prediction; it annotates the genes/proteins you provide. You'll get annotation in addition to what Prokka offers, and you can categorise based on that.
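
As a rough illustration of categorising from that output, here is a minimal Python sketch that tallies COG functional-category letters from an eggNOG-mapper annotations table. It assumes a tab-separated file whose header line starts with #query and contains a column named COG_category, which is how recent 2.x releases appear to lay it out; older releases use different column names, so check the header of your own file first. The function name and the sample rows are made up for illustration.

```python
from collections import Counter


def tally_cog_categories(lines, column="COG_category"):
    """Count COG category letters in an eggNOG-mapper annotations table.

    `lines` is any iterable of text lines (an open file works). The column
    name is an assumption; adjust it to match the header of your file.
    """
    counts = Counter()
    col_index = None
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # The header row starts with "#query"; other "#" lines are comments.
            if line.startswith("#query"):
                header = line.lstrip("#").split("\t")
                col_index = header.index(column)
            continue
        if col_index is None:
            raise ValueError("no '#query' header line found before data rows")
        fields = line.split("\t")
        if col_index >= len(fields):
            continue  # skip short or malformed rows
        category = fields[col_index]
        if category in ("", "-"):
            counts["unassigned"] += 1
        else:
            # A gene can carry several one-letter categories, e.g. "EG".
            counts.update(category)
    return counts


if __name__ == "__main__":
    # Made-up rows, trimmed to three columns for the example.
    sample = [
        "## example comment line",
        "#query\tseed_ortholog\tCOG_category",
        "gene_1\t1000.X\tC",
        "gene_2\t1000.Y\tEG",
        "gene_3\t1000.Z\t-",
    ]
    for category, n in sorted(tally_cog_categories(sample).items()):
        print(category, n)
```

Counting single letters is the coarsest grouping; a narrower categorization could group on other annotation columns in the same way.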

ADD COMMENT • link 5.5 years ago by Carambakaracho ★ 3.3k

Ah, that makes more sense! So all I have to provide to EggNOG is the protein sequence file (.faa) from the Prokka output? I'm asking rather than just trying it because of the vast databases I would have to download. If EggNOG is not the right tool, that would be a lot of work for nothing.

ADD REPLY • link 5.5 years ago by Hansen_869 ▴ 80

I used the DIAMOND database for a quick first pass and then refined the results with more specific databases.

ADD REPLY • link 5.5 years ago by Carambakaracho ★ 3.3k

What's the difference between HMMER and DIAMOND?

ADD REPLY • link 5.5 years ago by Hansen_869 ▴ 80

DIAMOND - accelerated BLAST-compatible local sequence aligner: https://github.com/bbuchfink/diamond
HMMER - biosequence analysis using profile hidden Markov models: http://hmmer.org/

ADD REPLY • link 5.5 years ago by GenoMax 151k

Thanks!! So they seem kind of similar. Why would I use one over the other?

ADD REPLY • link 5.5 years ago by Hansen_869 ▴ 80

As far as I recall, the HMMER version gave me more results. DIAMOND was way faster, though.

ADD REPLY • link 5.5 years ago by Carambakaracho ★ 3.3k

I am currently running EggNOG with DIAMOND. My input file is about 2 MB. So far DIAMOND has been running for 50 minutes on 4 cores with no results. Do you recall whether this is a usual run time, or might there be an issue?
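
To put the 2 MB figure in context, it can help to count how many protein sequences and residues the file actually holds, since the workload depends on the number and length of the queries rather than on the file size as such. A minimal Python sketch, assuming a plain protein FASTA file; the function name and the file name in the usage comment are placeholders.

```python
import sys


def fasta_stats(lines):
    """Return (number_of_sequences, total_residues) for FASTA text lines."""
    n_seqs = 0
    n_residues = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            n_seqs += 1
        else:
            # A trailing "*" (stop marker in some protein FASTA files)
            # is not counted as a residue.
            n_residues += len(line.rstrip("*"))
    return n_seqs, n_residues


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python fasta_stats.py proteins.faa  (file name is a placeholder)
    with open(sys.argv[1]) as handle:
        seqs, residues = fasta_stats(handle)
    print(f"{seqs} sequences, {residues} residues")
```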

ADD REPLY • link 5.5 years ago by Hansen_869 ▴ 80

That depends on which database you are running against.

ADD REPLY • link 5.5 years ago by GenoMax 151k

My shell command looks like this:

python emapper.py -i input-file --output output-file -m diamond

According to EggNOG's GitHub page, no database needs to be specified when using DIAMOND.
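
For reference, the same call can be assembled and launched from Python. This is only a sketch under assumptions: it reuses the flags from the command above and adds --cpu, which to my knowledge is eggNOG-mapper's option for the number of CPUs to use (worth checking against python emapper.py --help for your version). The file names are the same placeholders as above, and the helper name build_emapper_command is made up.

```python
import subprocess


def build_emapper_command(faa_path, output_prefix, cpus=4, mode="diamond"):
    """Assemble the emapper.py call as an argument list (no shell quoting needed)."""
    return [
        "python", "emapper.py",
        "-i", str(faa_path),             # protein FASTA, e.g. the .faa from Prokka
        "--output", str(output_prefix),  # prefix for the output files
        "-m", mode,                      # "diamond", as in the command above
        "--cpu", str(cpus),              # assumed flag for the number of CPUs
    ]


if __name__ == "__main__":
    cmd = build_emapper_command("input-file", "output-file")
    print(" ".join(cmd))
    # Uncomment to actually run it (requires eggNOG-mapper and its databases):
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell-quoting problems if the file paths contain spaces.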

ADD REPLY • link 5.5 years ago by Hansen_869 ▴ 80