Hello,
I'm new to R and I'm trying to make an MA plot from my DESeq2 results using ggplot2.
I have figured out how to make an MA plot using the following code:
plot_poly <-
all_counts.poly.results %>%
as.data.frame() %>%
ggplot(aes(log2(baseMean), log2FoldChange)) +
geom_point(aes(color = pvalue < 0.05), cex = 0.1) +
labs(title = "Poly Torin Treated vs Untreated")
all_counts.poly.results contains the EnsGeneIDs I'm interested in labeling on the graph, but there are too many to label them all, so I want to filter them against an Excel file with specific EnsGeneIDs.
For example, I was thinking about setting it up like this with dplyr, but I'm not sure if this is correct.
# contains EnsGeneIDs I want to be plotted
EnsGeneIDs <- read_excel("/Users/kylestangline/Desktop/geneIDs.xls")
filtered_all_counts.poly.results <- all_counts.poly.results %>%
filter(all_counts.poly.results$EnsGeneIDs %in% EnsGeneIDs) # filter only specific EnsGeneIDs
Could I then use these filtered EnsGeneIDs as labels on the MA plot I made above?
Have you tried running that and it didn’t work? I can’t tell what exactly you’re asking. Do you want to only plot those genes, or do you want to plot all but only label those ones?
Side note: you can just put the bare column name in the filter call without the df$, although I would recommend changing the name of the EnsGeneIDs object so it differs from the column name. Also, if EnsGeneIDs reads in as a dataframe (rather than a vector) you may need to say %in% EnsGeneIDs$V1 (if that’s the column’s name) or convert it to a vector.
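Here is a minimal sketch of that side note. The toy data frames stand in for the real objects, and the names results_df, genes_of_interest, ids_to_label and the Excel column name V1 are assumptions for illustration, not anything from your files.

```r
library(dplyr)

# Toy stand-in for as.data.frame(all_counts.poly.results); values are made up.
results_df <- data.frame(
  EnsGeneIDs     = c("ENSG_A", "ENSG_B", "ENSG_C", "ENSG_D"),
  baseMean       = c(10, 200, 35, 1500),
  log2FoldChange = c(0.5, -1.2, 2.1, 0.1),
  stringsAsFactors = FALSE
)

# Toy stand-in for what read_excel() returns (a data frame, not a vector).
# The column name V1 is an assumption; check names() on your own object.
genes_of_interest <- data.frame(
  V1 = c("ENSG_B", "ENSG_D"),
  stringsAsFactors = FALSE
)

# Pull the ID column out as a plain character vector so %in% compares IDs.
ids_to_label <- genes_of_interest$V1

# Bare column name inside filter(); no results_df$ prefix is needed.
label_df <- results_df %>%
  filter(EnsGeneIDs %in% ids_to_label)
```

Giving the vector a different name from the column (ids_to_label vs EnsGeneIDs) avoids any ambiguity about which one filter() is looking at.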
Thanks for the reply! I want to plot all the genes but label only a few (about 10 genes out of the thousands that ggplot2 plots), which is why I wanted to filter my all_counts.poly.results dataframe with the Excel file I read in.
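One way to do that, sketched with made-up data: draw every gene with geom_point(), then hand only the filtered rows to a geom_text() layer through its data argument. The names results_df, ids_to_label and label_df and all the values are hypothetical, and this assumes the gene IDs sit in a column called EnsGeneIDs.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for as.data.frame(all_counts.poly.results); values are made up.
results_df <- data.frame(
  EnsGeneIDs     = c("ENSG_A", "ENSG_B", "ENSG_C", "ENSG_D", "ENSG_E"),
  baseMean       = c(10, 200, 35, 1500, 80),
  log2FoldChange = c(0.5, -1.2, 2.1, 0.1, -0.4),
  pvalue         = c(0.20, 0.01, 0.03, 0.70, 0.50),
  stringsAsFactors = FALSE
)

# Hypothetical IDs to label; in practice this vector would come from the Excel file.
ids_to_label <- c("ENSG_B", "ENSG_D")

# Only the rows that should get a text label.
label_df <- results_df %>%
  filter(EnsGeneIDs %in% ids_to_label)

plot_poly <- ggplot(results_df, aes(log2(baseMean), log2FoldChange)) +
  # All genes are drawn as points.
  geom_point(aes(color = pvalue < 0.05), size = 0.1) +
  # Only the filtered subset is labeled; x and y are inherited from ggplot().
  geom_text(data = label_df, aes(label = EnsGeneIDs), size = 3, vjust = -0.5) +
  labs(title = "Poly Torin Treated vs Untreated")
```

If the labels overlap, geom_text_repel() from the ggrepel package takes the same data and aes(label = ...) arguments and nudges the labels apart.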