Entering edit mode
2.5 years ago
Rob
▴
170
Hey friends, The following code gives me a scatter plot of the correlation between gene expression (y-axis) & my continuous variable (x-axis). It gives me the R coefficient and p-value. I have two problems: 1) I need also to have a q-value (corrected p-value). what should I add to my code to have this? 2) I also want to rank the genes based on R and q-value? I mean I want the plots to be in the order of descending/ascending R & q-values. how is it possible?
ggscatter(ex, x = "variable", y = "Expression",
add = "reg.line",
add.params = list(color = "blue", fill = "lightgray"),
color = "black", palette = "jco", fill = "lightgray",
#shape = "cyl",
fullrange = TRUE,
rug = TRUE, facet.by = "gene", cor.coef = T,
title = "Correlation Plot",
conf.int = TRUE,
cor.coeff.args = list(),
cor.method = "spearman",
cor.coef.coord = c(NULL, NULL),
cor.coef.size = 4,
)+
geom_vline(xintercept = 157, colour="red", linetype = "longdash")
here is the plot I get (with R & p-value for each gene) :
I think you're reaching the limits of what ggscatter can do. I'd recommend calculating your own q vals, using pivot_longer, factor your gene names in whatever order you want, then switching to ggplot+geom_scatter. Maybe start with one gene until you optimize your plot then add in the facet_wrap.
Thanks @Trivas
What I understadn, I should calculate q-values first, then find R coefficients, then order the genes based on these two, then make the correlation scatter plots. right? as firs step, I tried to get q-value first using the following code:
my data looks like this: the first column (SMA) is my continuous variable.
I didn't get any result.
Two things:
t()
then calculating your qvalues rowwise, egYou might have to do some optimization after transposing (your colnames might become their own column so use
column_to_rownames
as an example) so I'd recommend running each line individually so you understand what each is doing.You're shooting for a dataframe that has 6 columns after pivoting:
The last 3 are going to be redundant within each gene, but ggplot knows how to handle that.
Thank you @Trivas I tried doing this but I got this error:
Error in
mutate()
: ! Problem while computingpval = t.test(1:nrow(data))
. xpval
must be a vector, not ahtest
object. i Did you mean:pval = list(t.test(1:nrow(data)))
? i The error occurred in row 1. Runrlang::last_error()
to see where the error occurred.then I change the line 5 to :
and I got this error:
is there any way to get rid of this error?
This is where I'm returning the reigns to you. At this point you have your data in the correct orientation to perform rowwise calculations. Once finished, you have the framework to alter your data into a format that ggplot2 likes (via
pivot_longer
) and you know how to factor your genes into the order you like. I've also given you a suggestion to optimize your code for a single gene then expand it to work for all of your genes.The error you're getting now is that the results of
t.test
andqvalue
aren't returning a singular numeric value and mutate cannot add a list of values into a cell of a dataframe. You can get around that by doing something liket.test() %>% .[["p.value"]]
. The.
acts as a placeholder for the list and the double brackets are similar to using$
to access an object of a list. In other words, this is equivalent to doing the following:I implore you to think about what you're trying to do because the default function of t.test is a one sample t test which might not be super meaningful for your analysis.