This is a follow-up to this question.
I have created a plot using the ggplot2 package. But as the matrix is very large (almost 146000 rows), the single cells of the image are quite small.
I would like to know how to make the individual cells bigger, so that the image gives a better overview of the differences between cells.
I would also like to know whether it is possible to make a bigger (longer) legend with more than just five elements. I would like at least 20 clearly distinguishable colour groups.
(A logic question: why is it plotted as a triangle?)
This is how I create the plot:
pl1 <- ggplot(data, aes(y = partner1, x = partner2)) + geom_tile(aes(fill = Substract)) + scale_fill_continuous(low = "blue", high = "green") + scale_size(range = c(1, 20000))
By "legend", do you instead mean tick labels? The actual legend is the color bar on the right.
BTW, the answer to your logic question is that your data is in a triangular shape (well,
is). This is likely due to how you generated those values. Edit: You can test what I mentioned above regarding shape with
with(data, table(partner1 > partner2))
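To illustrate what this check reports, here is a minimal sketch on made-up toy data (the data frame `toy` is hypothetical; only the column names come from the question). If only one triangle of the matrix is present, the table has a single column.

```r
# Hypothetical toy data: all pairs of 5 positions, keeping only one triangle.
toy <- subset(expand.grid(partner1 = 1:5, partner2 = 1:5),
              partner1 > partner2)

# Count how many rows lie on each side of the diagonal.
tab <- with(toy, table(partner1 > partner2))
print(tab)  # only a TRUE column appears, so the data fill a single triangle
```

If both `TRUE` and `FALSE` show up with similar counts, the data cover both triangles and the shape of the plot has another cause.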
Actually, I do mean the bar at the side of the image :-)
If you just want more colors then try
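One possible way to get more colours and a longer colour bar, offered as a sketch and not necessarily what was meant here, is `scale_fill_gradientn()` with a longer palette, explicit `breaks`, and a taller legend key. The toy data frame, the `rainbow(20)` palette and the break spacing are all assumptions for illustration.

```r
library(ggplot2)

# Hypothetical toy data; the column names follow the question.
set.seed(1)
toy <- expand.grid(partner1 = 1:30, partner2 = 1:30)
toy$Substract <- runif(nrow(toy), min = 0, max = 100)

pl <- ggplot(toy, aes(x = partner2, y = partner1)) +
  geom_tile(aes(fill = Substract)) +
  scale_fill_gradientn(
    colours = rainbow(20),          # 20 colours instead of a two-colour ramp
    breaks  = seq(0, 100, by = 5)   # more labelled ticks on the colour bar
  ) +
  theme(legend.key.height = unit(1.5, "cm"))  # stretches the colour bar
```

Printing `pl` draws the plot; the break positions would need to be adapted to the range of the real values.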
I wonder whether making such a heatmap of your data is the best way to visualize the results in the first place. In the heatmap, genomic positions are converted to non-numeric values, which makes it hard to see the relative distances along the genome. Furthermore, I think there is too much data per pixel, and the distribution of the subtraction values is heavily skewed towards the lower numbers, which makes it hard to tell the colours apart (maybe you should log-transform the subtraction values).
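As a rough sketch of the log-transform idea, the fill values can be transformed before plotting. The toy data and the `+ 1` offset (used so that zeros stay finite) are assumptions, not part of the original data.

```r
library(ggplot2)

# Hypothetical toy data with values skewed towards low numbers.
set.seed(1)
toy <- expand.grid(partner1 = 1:30, partner2 = 1:30)
toy$Substract <- rexp(nrow(toy), rate = 0.1)

# log10(x + 1) spreads out the low values; the +1 offset is an assumption.
toy$logSubstract <- log10(toy$Substract + 1)

pl_log <- ggplot(toy, aes(x = partner2, y = partner1)) +
  geom_tile(aes(fill = logSubstract)) +
  scale_fill_gradient(low = "blue", high = "green")
```

The colour bar then shows log-scaled values, so the legend title or labels should say so.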
But what do you really want to show to the viewer? The things I can think of are:
In any case, it is very hard to answer with the tile-plot you are trying to make.
In case of 1 and 2, make a karygram overview of the distribution (histogram) of interactions along the chromosomes. Then use
geom_histogram() + facet_wrap(~chr,...)
In case of 3, compute the pairwise correlation coefficients between all positions and make a heatmap (see "How Do I Draw A Heatmap In R With Both A Color Key And Multiple Color Side Bars?").
These are not really continuous genomic positions, but bins of 1000 (or 5000) positions summarized into one value each, and I have only one chromosome. I thought a heatmap would be better because of the (coloured) overview. (Is there a way to draw histograms with different colours for specific value ranges?) I will try the histogram. BTW, did you mean karyogram? (Is it a GRanges object?)
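On colouring a histogram by value range: one option, sketched here on made-up data, is to bin the values with `cut()` and map the resulting factor to `fill`. The range boundaries, labels and colours below are hypothetical.

```r
library(ggplot2)

# Hypothetical toy data: positions on one chromosome, one value per position.
set.seed(1)
toy <- data.frame(position  = sample(1:146000, 2000),
                  Substract = rexp(2000, rate = 0.1))

# Assign each value to a range; these boundaries are made up.
toy$range <- cut(toy$Substract, breaks = c(0, 5, 20, Inf),
                 labels = c("low", "medium", "high"), include.lowest = TRUE)

# Stacked histogram along the chromosome, coloured by value range.
pl_hist <- ggplot(toy, aes(x = position, fill = range)) +
  geom_histogram(binwidth = 5000) +
  scale_fill_manual(values = c(low = "blue", medium = "green", high = "red"))
```

Each bar is then split by range, so regions dominated by high values stand out by colour.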