Question

How to produce "publication quality figures" for MAGs from environmental samples, and visualize functional annotation results?

0

11 months ago

jway • 0

I have been using the metagenome-atlas (https://metagenome-atlas.readthedocs.io/en/latest/index.html#) workflow to QC, assemble, bin, and annotate 12 different metagenome samples. Our lab collected some environmental samples and enriched them for certain types of bacteria/archaea. Each type of bacteria/archaea received one treatment and one control. We are trying to see if there's any difference between the treatment and control groups and, more generally, to learn more about the metabolism and genetics of the organisms in the enriched microbial community. The project is government sponsored, so I can't reveal too many details. I apologize for the lack of specifics.

After running the workflow, I now have the .fna and .faa files of the 57 MAGs that were produced from these 12 samples. These are the predicted genes and their translated protein sequences, respectively. I also have the functional annotations for each genome, in the form of a table of all the KEGG modules (the workflow used DRAM to annotate the genomes and infer the KEGG modules) and a table of all the annotations.

My lab has no prior experience in bioinformatics, and my supervisor has told me to produce "publication quality figures" to visualize the metagenome analysis data. I've been provided with some papers that did similar studies, and have been told to produce similar figures with very little direction or guidance on how to do this. I am not familiar with R, Python, or MATLAB. What are some visualization tools or software I could use to make sense of the functional annotation results and the MAGs that were produced? What do "publication quality figures" entail, and how do I go about producing these?

figures visualization publication • 1.3k views

updated 11 months ago by young_bioinformatician ▴ 240 • written 11 months ago by jway • 0

1

No matter what we tell you about it, your supervisor may or may not be happy with it. They should be telling you what the expectation is. They should also be providing support for you to do this work, because it is not easy to summarize large amounts of metagenomic data without prior experience.

11 months ago by Mensur Dlakic ★ 28k

0

I just clarified with them again, and they said they want the taxa of the bacteria/archaea, the genome completeness of abundant microbes, and the difference in expression levels of metabolic genes between the treatment and control groups. They also want metabolic pathway identification for those genes. Unfortunately, I don't have much support for this: the lab is spread too thin for anyone to work with me on it, and our company doesn't have another bioinformatics group (or any other scientific group, actually) for us to collaborate with.

11 months ago by jway • 0

score 1 · Answer 1 · 2024-01-09

Unfortunately, your question is very broad, and your PI would be well advised to strike up a collaboration with another (bioinformatics) group if nobody in the lab has any experience in analysing such data. Without prior experience, the risk of misinterpreting the data, or of wasting a lot of time and money validating leads from a poor-quality dataset or a wrong analysis, increases sharply.

Broadly speaking, a publication quality figure needs to meet the journal's technical requirements and also convey the message and highlight the findings that you wish to share with the research community. As far as I understand from your post, you are still very much in the exploratory phase and still need to decide what to report in a future publication.

As far as the technical requirements are concerned, you will find the required information on the journal's website. Usually, there are requirements on the minimum print resolution, the font face and size, colour palettes that remain readable for people with colour vision deficiency, etc.
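To make the resolution requirement concrete, here is a small sketch of the arithmetic that converts a printed size and a minimum resolution into pixel dimensions. The numbers in the example (an 89 mm by 60 mm panel at 300 dpi) are placeholders I picked for illustration, not the requirements of any particular journal, and the function names are my own.

```python
MM_PER_INCH = 25.4


def required_pixels(width_mm, height_mm, dpi):
    """Pixel dimensions needed to print a figure at the given size and resolution."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px


def meets_resolution(width_px, height_px, width_mm, height_mm, min_dpi):
    """True if an image of width_px x height_px is large enough for the printed size."""
    need_w, need_h = required_pixels(width_mm, height_mm, min_dpi)
    return width_px >= need_w and height_px >= need_h


if __name__ == "__main__":
    # Placeholder values: replace them with the numbers from the journal's author guidelines.
    print(required_pixels(89, 60, 300))
    print(meets_resolution(1000, 700, 89, 60, 300))
```

Whatever tool ends up producing the figure, exporting to a vector format (PDF or SVG) where the journal accepts it usually sidesteps the pixel question for everything except embedded raster images.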

Microbiology is unfortunately not my field of expertise, but I think anvi'o might be a suitable tool to create most figures in this domain. Take for example this Nature article, whose authors published all the data required to reproduce the figures. That way, readers can recreate a figure, zoom in interactively, or choose a different display style.
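For the exploratory phase, one possible first step (independent of any particular tool) is to pivot the per-genome module table into a genome-by-module matrix, which is the shape most heatmap-style figures start from. The sketch below uses only the Python standard library. The column names `genome`, `module`, and `completeness` are hypothetical and will not match the actual DRAM output headers, the toy data is invented, and filling missing genome/module pairs with 0.0 is an assumption that you should check against your data.

```python
import csv
import io
from collections import defaultdict


def pivot_modules(tsv_text, genome_col="genome", module_col="module",
                  value_col="completeness"):
    """Pivot a long-format TSV (one row per genome/module pair) into a matrix.

    Returns (module_order, matrix), where matrix maps each genome name to a
    list of values in the same order as module_order.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    table = defaultdict(dict)
    modules = set()
    for row in reader:
        table[row[genome_col]][row[module_col]] = float(row[value_col])
        modules.add(row[module_col])
    module_order = sorted(modules)
    # Assumption: a genome/module pair absent from the table counts as 0.0.
    matrix = {
        genome: [table[genome].get(module, 0.0) for module in module_order]
        for genome in sorted(table)
    }
    return module_order, matrix


# Invented toy data with hypothetical column names, for illustration only.
TOY_TSV = (
    "genome\tmodule\tcompleteness\n"
    "MAG01\tmodule_A\t1.0\n"
    "MAG01\tmodule_B\t0.5\n"
    "MAG02\tmodule_A\t0.25\n"
)

if __name__ == "__main__":
    order, matrix = pivot_modules(TOY_TSV)
    print(order)
    for genome, values in matrix.items():
        print(genome, values)
```

With the real table, you would read the file contents into a string and pass the actual column names as arguments. The resulting matrix can then be written back out as a table for whichever plotting tool you settle on.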

score 0 · Answer 2 · 2024-01-19

I really do not like this kind of supervisor. Your supervisor probably does not even know what they want, probably thinks bioinformatics work is easy to implement, and is putting you under unnecessary stress. Sorry, I have had a similar type of supervisor, and your post just reminded me of her.

Here are my suggestions:

In addition to anvi'o, you can check out the MGnify pipeline.
Check out some bioinformatics workshops, such as the Physalia courses, where you can learn visualization methods and functional metagenomics -> Metagenomic for microbial communities workshop
Look for metagenomics publications that share their code on GitHub; you can then follow their pipelines.

Nevertheless, you need a bioinformatics mentor; otherwise, it will be like crossing the ocean in a small boat.