Hi, I have a set of genes with GO IDs. I obtained this data using the CAMERA workflow "Metagenomic data annotation and clustering". Now I want to get a rough idea of how these annotated sequences are distributed across the different gene categories. CateGOrizer was good for this, but it reports all 3 categories in a single pie chart; it would be better to have a separate pie chart for each category. I tried REViGO, but it does not give a visual representation of all available sequences, only of a few selected ones. I have around 8000 sequences. Can anybody please suggest a way to visualize the data the way I want? You can see the data below.
Thank you very much, Raghul
#query         GO          reference DB  reference family  e-value  description
contig00965.4  GO:0006412  TIGRFAM       TIGR00001         2e-29    translation
contig00965.4  GO:0022625  TIGRFAM       TIGR00001         2e-29    cytosolic large ribosomal subunit
contig00965.4  GO:0000315  TIGRFAM       TIGR00001         2e-29    organellar large ribosomal subunit
contig00965.4  GO:0003735  TIGRFAM       TIGR00001         2e-29    structural constituent of ribosome
contig37137.2  GO:0006412  TIGRFAM       TIGR00001         2e-21    translation
contig37137.2  GO:0022625  TIGRFAM       TIGR00001         2e-21    cytosolic large ribosomal subunit
contig37137.2  GO:0000315  TIGRFAM       TIGR00001         2e-21    organellar large ribosomal subunit
contig37137.2  GO:0003735  TIGRFAM       TIGR00001         2e-21    structural constituent of ribosome
contig00611.6  GO:0006412  TIGRFAM       TIGR00001         9.5e-20  translation
contig00611.6  GO:0022625  TIGRFAM       TIGR00001         9.5e-20  cytosolic large ribosomal subunit
contig00611.6  GO:0000315  TIGRFAM       TIGR00001         9.5e-20  organellar large ribosomal subunit
So you want to make a pie chart where each slice is one of the unique GO terms (GO:0006412, GO:0000315), the size of each slice corresponds to the number of contigs annotated with that GO term, and then make such a pie chart for each unique reference family? In your example table there is only one reference family, so making a single pie chart seems logical here ...
Or do you mean each GO category: Biological Process, Molecular Function and Cellular Component?
I have shown only a few lines of the data here, out of over 8000 lines. I just want viewers to get an idea of the data I have. The 3 categories I mean are Biological process, Cellular component, and Molecular function.
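One possible way to get one pie per category, sketched in Python under a few assumptions: the annotation table is tab-separated with the six columns shown above, and the category of each GO ID is looked up from a local copy of the Gene Ontology in OBO format (for example go-basic.obo), whose [Term] stanzas carry a "namespace:" line. The function names, the top_n cutoff and the output file name are hypothetical choices for illustration, and matplotlib is assumed to be installed for the plotting step.

```python
from collections import Counter, defaultdict

NAMESPACES = ["biological_process", "cellular_component", "molecular_function"]


def parse_obo_namespaces(lines):
    """Map GO IDs (including alt_ids) to their namespace from OBO-format lines."""
    ns_map = {}
    ids, namespace, in_term = [], None, False
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas describe GO terms.
            in_term = line == "[Term]"
            ids, namespace = [], None
            continue
        if not in_term:
            continue
        if line.startswith("id: ") or line.startswith("alt_id: "):
            ids.append(line.split(": ", 1)[1])
        elif line.startswith("namespace: "):
            namespace = line.split(": ", 1)[1]
        if namespace:
            for go_id in ids:
                ns_map[go_id] = namespace
    return ns_map


def read_annotation_table(lines):
    """Yield (contig, go_id, description); assumes tab-separated columns."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        yield fields[0], fields[1], fields[-1]


def count_contigs(rows, ns_map):
    """Count distinct contigs per GO term, grouped by namespace."""
    seen = set()
    counts = defaultdict(Counter)
    for contig, go_id, description in rows:
        if (contig, go_id) in seen:
            continue  # count each contig once per GO term
        seen.add((contig, go_id))
        namespace = ns_map.get(go_id, "unknown")
        counts[namespace][f"{go_id} {description}"] += 1
    return counts


def plot_pies(counts, top_n=10, out_file="go_pies.png"):
    """Draw one pie per namespace; terms beyond top_n are merged into 'other'."""
    import matplotlib.pyplot as plt  # imported here so counting works without it

    fig, axes = plt.subplots(1, 3, figsize=(18, 6))
    for ax, namespace in zip(axes, NAMESPACES):
        ranked = counts.get(namespace, Counter()).most_common()
        top, rest = ranked[:top_n], ranked[top_n:]
        labels = [label for label, _ in top]
        sizes = [n for _, n in top]
        if rest:
            labels.append("other")
            sizes.append(sum(n for _, n in rest))
        if sizes:
            ax.pie(sizes, labels=labels, autopct="%1.0f%%")
        ax.set_title(namespace)
    fig.savefig(out_file, bbox_inches="tight")


def main(table_path, obo_path):
    """Hypothetical entry point: both paths are supplied by the caller."""
    with open(obo_path) as handle:
        ns_map = parse_obo_namespaces(handle)
    with open(table_path) as handle:
        counts = count_contigs(read_annotation_table(handle), ns_map)
    plot_pies(counts)
```

Calling main("annotations.tsv", "go-basic.obo") (both file names are placeholders) would write a single image with three pies. With around 8000 sequences there will probably be far more GO terms than a pie can show legibly, which is why the sketch merges everything past the top_n most frequent terms into one "other" slice; mapping the terms to a GO slim first, as CateGOrizer does, is an alternative way to keep the number of slices manageable.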