But I can't figure out what the columns (% in sequence), (% in genome), (% in genus), (% in Cyanobacteria) and (% in Bacteria) in this table refer to. I can't go on with the statistical analysis without knowing the exact meaning of those data. I hope someone can help me with this one.
Unfortunately, there is very little help or documentation available for the COG database. We are reduced to educated guess-work.
Taking the top row, COG category J, I'd guess that:
% in genus = percentage of proteins from Acaryochloris that are assigned to COG category J
% in Cyanobacteria = percentage of proteins from phylum Cyanobacteria that are assigned to COG category J
% in Bacteria = percentage of proteins from domain Bacteria that are assigned to COG category J
The first two columns are less obvious. I'd guess that "% in sequence" might be based on a sum of sequence lengths (coding?) and "% in genome" is the percentage of proteins from that genome, but it is not clear at all.
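To make those two guesses concrete, here is a minimal Python sketch of what each reading would compute. It only illustrates my interpretation and is not how the COG table is actually built. The function names and the toy data are made up for the example, and it assumes each protein has exactly one category letter.

```python
from collections import Counter, defaultdict


def percent_by_count(categories):
    """Guess 1: percentage of proteins assigned to each category."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cat: 100.0 * n / total for cat, n in counts.items()}


def percent_by_length(proteins):
    """Guess 2: percentage of total sequence length in each category.

    `proteins` is an iterable of (category, length) pairs.
    """
    lengths = defaultdict(int)
    for cat, length in proteins:
        lengths[cat] += length
    total = sum(lengths.values())
    if total == 0:
        return {}
    return {cat: 100.0 * n / total for cat, n in lengths.items()}


# Hypothetical toy genome: one category letter per protein.
toy_categories = ["J", "J", "K", "L", "J", "C", "K", "J"]
by_count = percent_by_count(toy_categories)

# Hypothetical toy genome with protein lengths in residues.
toy_proteins = [("J", 300), ("K", 100)]
by_length = percent_by_length(toy_proteins)

print(by_count["J"])   # 50.0
print(by_length["J"])  # 75.0
```

If the numbers you get from a real genome with one of these calculations match a column in the table, that would be reasonable evidence for what the column means.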
Having said all that: I would not use COG - it is a very old database and is no longer maintained by the NCBI. You can get similar information from KEGG or the IMG (Integrated Microbial Genomes).
Have a look at this page for Acaryochloris marina MBIC11017. If you scroll down there are COGs, KEGG, MetaCyc and a bunch of other useful and exciting stuff.
Thanks. KEGG and IMG may provide much more information. As I'm still not familiar with these databases, I wonder if there's a tool that can do functional clustering of all genes from a specific complete microbial genome, just like the COG table. I ask because I'm focusing on the statistical analysis of the functional clusters among different species, rather than the functional content of a specific genome.
I've just read that page and found a bunch of useful tools. Thanks a lot for helping me.