I'm wondering if there's a way we can recover the mean gene expression used for calculating the fold change of scran::pairwiseTTests.
I tried the following code used for calculating fold change using scran::pairwiseTTests
:
library(scuttle)
library(scran)
library(tidyverse)
sce <- mockSCE()
sce <- logNormCounts(sce)
cell_groupings <- colData(sce)$Mutation_Status
names(cell_groupings) <- rownames(sce)
out <- pairwiseTTests(scuttle::logNormCounts(sce), groups=cell_groupings)
out$statistics[[1]] %>% # negative / positive
as.data.frame() %>%
rownames_to_column(var = "genes") %>%
as_tibble() %>%
arrange(desc(logFC))
It produces
genes logFC p.value FDR
<chr> <dbl> <dbl> <dbl>
1 Gene_1740 1.20 0.000617 0.733
2 Gene_0691 0.968 0.00435 0.733
3 Gene_1755 0.932 0.00690 0.733
4 Gene_1460 0.916 0.0107 0.733
5 Gene_0894 0.890 0.0129 0.733
6 Gene_1910 0.847 0.0208 0.761
7 Gene_1839 0.844 0.0168 0.733
8 Gene_1147 0.839 0.00304 0.733
9 Gene_0477 0.832 0.0134 0.733
10 Gene_1201 0.816 0.00643 0.733
Notice that the logFC for Gene_1740 is 1.20 and Gene_0691 is 0.968.
But when I compute manually to get mean expression:
meta_data_df <- colData(sce) %>%
as.matrix() %>%
as.data.frame() %>%
rownames_to_column(var = "barcodes") %>%
as_tibble()
scuttle::normalizeCounts(sce) %>%
as.matrix() %>%
as.data.frame() %>%
rownames_to_column(var = "genes") %>%
as_tibble() %>%
pivot_longer(-genes, names_to = "barcodes", values_to = "gexp") %>%
left_join(meta_data_df, by = "barcodes") %>%
filter(genes %in% c("Gene_1740", "Gene_0691")) %>%
group_by(genes, Mutation_Status) %>%
summarise(mean_gexp = mean(gexp)) %>%
pivot_wider(names_from = "Mutation_Status", values_from = "mean_gexp") %>%
mutate(logFC = log2(negative/positive)) %>%
arrange(desc(logFC))
I get
genes negative positive logFC
<chr> <dbl> <dbl> <dbl>
1 Gene_1740 4.74 3.54 0.421
2 Gene_0691 4.28 3.32 0.370
Notice the difference in logFC. Is there a way I can easily recover the actual expression so that the logFC match
calculated with scran::pairwiseTTests
?
Thanks so much. I thought
logNormCounts
function already included log in it.