Hello! I am working on a project that involves state annotation of mouse genomic data using ChromHMM. The goal is to identify distinct chromatin states across various tissues or cell types, based on histone modification profiles. My dataset includes CUT&Tag data for histone marks such as H3K27ac, H3K4me3, and H3K36me3, across multiple conditions.
After performing the BinarizeBed and LearnModel steps for 12 chromatin states, I would like to know how to effectively annotate these states. Specifically, how can we assign biological meanings to these states, such as promoters, enhancers, or transcribed regions? What approaches can I take to ensure that the annotations are biologically relevant and consistent with existing knowledge or experimental data?
Any advice on interpreting these states in the context of chromatin architecture, and how to refine these annotations for different tissues or conditions, would be greatly appreciated.
Could you please guide me on how I can use the ChromHMM output files for 12 states as input for this tool?
The idea is two-fold:
First, you could demonstrate some common dynamics across your developmental process; see the image from the paper by the developer of ChromHMM himself:
Here we see that the chromatin states for H3K4me2 and H3K4me1 from MEF to the late ESC (shown as yellow-to-red squares on the top) follow a certain "staircase" pattern.
Secondly, if you happened to include H3K9ac or some other 'active transcription' type of mark (as seen here on the left), you can also conclude that at the later stages of dedifferentiation (ESC columns) your chromatin is very similar to the 'active transcription' type of chromatin accessibility.
In the paper titled 'H4K16ac activates the transcription of transposable elements and contributes to their cis-regulatory function,' the authors present a bar plot in which they annotate various chromatin states. I have used uniquely aligned reads in .bam files, called peaks under both stringent and relaxed conditions, and then used the stringent peaks for the ChromHMM analysis. Could you please guide me on how to generate a similar bar plot as shown in the paper, with appropriate annotations for the chromatin states?
Well, what they clearly did is run ChromHMM against a set of bed files (names in the columns), with at least 12 states as the number-of-states parameter. The annotation of individual states is a bit tricky, as I am not sure how they did that from a quick glance over the paper, but one thing is for sure: after running ChromHMM, they took the *_segments.bed output file from ChromHMM and just counted how many peaks in the original bed files fall within the segments of each state, giving the percentages shown. The annotation of the states you can figure out yourself, I hope.
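The counting step described above could be sketched roughly as follows. This is a minimal illustration, not the method from the paper: it assumes the segments file has four whitespace-separated columns (chromosome, start, end, state), and it assigns each peak to the state covering the peak's midpoint, which is only one of several reasonable overlap rules. The function names and the tiny in-memory example are made up for illustration.

```python
import bisect
from collections import Counter, defaultdict

SKIP_PREFIXES = ("#", "track", "browser")

def load_segments(lines):
    """Index segment lines (chrom, start, end, state) by chromosome."""
    by_chrom = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith(SKIP_PREFIXES):
            continue
        chrom, start, end, state = line.split()[:4]
        by_chrom[chrom].append((int(start), int(end), state))
    index = {}
    for chrom, segs in by_chrom.items():
        segs.sort()
        # Keep a parallel list of start coordinates for binary search.
        index[chrom] = ([s[0] for s in segs], segs)
    return index

def state_at(index, chrom, pos):
    """Return the state whose segment covers pos, or None if uncovered."""
    if chrom not in index:
        return None
    starts, segs = index[chrom]
    i = bisect.bisect_right(starts, pos) - 1
    if i >= 0 and segs[i][0] <= pos < segs[i][1]:
        return segs[i][2]
    return None

def peak_state_percentages(index, peak_lines):
    """Percentage of peaks per state, assigning each peak by its midpoint.

    Peaks that fall outside every segment are ignored (an assumption).
    """
    counts = Counter()
    for line in peak_lines:
        if not line.strip() or line.startswith(SKIP_PREFIXES):
            continue
        chrom, start, end = line.split()[:3]
        midpoint = (int(start) + int(end)) // 2
        state = state_at(index, chrom, midpoint)
        if state is not None:
            counts[state] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {state: 100.0 * n / total for state, n in counts.items()}

# Toy data standing in for a *_segments.bed file and one peak file.
example_segments = [
    "chr1\t0\t200\tE1",
    "chr1\t200\t400\tE2",
    "chr1\t400\t1000\tE1",
]
example_peaks = [
    "chr1\t50\t150",
    "chr1\t190\t250",
    "chr1\t500\t600",
    "chr1\t700\t800",
]
example_index = load_segments(example_segments)
example_result = peak_state_percentages(example_index, example_peaks)
print(example_result)
```

For real data, the same functions can be fed open file handles instead of lists, once per peak file, and the resulting percentages plotted as one bar per mark. A dedicated interval tool would be a more robust choice for large files.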
Thank you for the detailed and informative suggestions. I followed your instructions and successfully generated the desired plot. However, I noticed that the annotation states differ from those in the original reference graph. Could you provide guidance on how to align the annotations to match the original graph more accurately?
It looks like you are using a different color palette for the state labels. It might help to copy the color palette from the original plot you are trying to recapitulate.
The figure doesn't load for some reason.
Here's the output figure.
You're using a different color scheme, which makes it difficult to compare figures at a glance. For that same reason, it may also help to order the columns the same way.
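One way to keep colors and column order fixed between figures is to define them once and reuse them for every plot. The sketch below is only an illustration: the state labels and hex colors are hypothetical placeholders, to be replaced with the labels and colors read off the reference figure.

```python
# Hypothetical state labels and hex colors, for illustration only.
REFERENCE_COLORS = {
    "Promoter": "#d62728",
    "Enhancer": "#ff7f0e",
    "Transcribed": "#2ca02c",
    "Quiescent": "#c7c7c7",
}

def order_like_reference(names, reference_order):
    """Sort names by their position in reference_order.

    Names missing from reference_order go to the end, keeping their
    original relative order (sorted() is stable).
    """
    rank = {name: i for i, name in enumerate(reference_order)}
    return sorted(names, key=lambda name: rank.get(name, len(rank)))

def colors_for(states, palette, fallback="#000000"):
    """Look up one color per state, using fallback for unknown states."""
    return [palette.get(state, fallback) for state in states]

# Column order as it appears in the reference figure (assumed here).
reference_columns = ["H3K4me3", "H3K27ac", "H3K36me3"]
my_columns = ["H3K36me3", "H3K4me3", "H3K27ac"]
ordered_columns = order_like_reference(my_columns, reference_columns)
state_colors = colors_for(["Promoter", "Enhancer"], REFERENCE_COLORS)
```

The ordered column list and the color list can then be passed to whatever plotting library is in use, so that both figures share the same layout and legend.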
Regardless, they do look like two different figures. You might post a second question with your code and a more detailed outline of your process, because there may not be enough information here to help, and the comments section is not well suited to an extended discussion.