Question

What is the best way to confirm whether a mutated gene is more frequent in the recurrence group than at baseline?

12 days ago

Lila M ★ 1.3k

Good morning,

I have a data set of patients and mutated genes under two conditions (recurrence and T0). I want to test whether each gene is mutated more frequently at recurrence or at T0. I am not considering expression, type of mutation, etc. For example:

# Example data frame
gene_counts <- data.frame(
  gene = c("A", "B", "C", "N", "T", "X", "Y", "Z"),
  Recurrence = c(10, 5, 0, 100, 1, 0, 1, 0),
  T0 = c(50, 10, 4, 150, 0, 1, 0, 1)
)

My question is: to assess this comparison, should I choose the type of test (Fisher's exact or chi-square) based on the counts, as follows?

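# Function to perform the appropriate test based on counts
perform_test <- function(recurrence, t0) {
  # 2 x 2 table: rows = condition (Recurrence, T0),
  # columns = this gene vs. all other genes (totals taken from gene_counts)
  contingency_table <- matrix(
    c(recurrence, t0,
      sum(gene_counts$Recurrence) - recurrence,
      sum(gene_counts$T0) - t0),
    nrow = 2
  )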

  if (any(contingency_table < 5)) {
    # Use Fisher's test if counts are low
    p.value <- fisher.test(contingency_table)$p.value
  } else {
    # Use Chi-square test otherwise
    p.value <- chisq.test(contingency_table)$p.value
  }

  return(p.value)
}

# Apply the test to each gene, then adjust for multiple testing (Benjamini-Hochberg)
library(dplyr)  # provides %>%, rowwise(), mutate(), ungroup()

gene_counts <- gene_counts %>%
  rowwise() %>%
  mutate(p_value = perform_test(Recurrence, T0)) %>%
  ungroup() %>%
  mutate(adj_p_value = p.adjust(p_value, method = "BH"))

Or is there a more straightforward way to do this? For example, is there a good practice or a standardised statistical test for this kind of comparison, rather than checking every case?
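One possible simplification, offered only as a sketch: Fisher's exact test is valid for any cell counts, so it could be applied to every gene and the branching on low counts dropped. The helper name `fisher_p` and the variables `total_rec` and `total_t0` are made up for illustration, and the column totals are used as denominators only because the code above does so; the number of patients per condition may be the more natural denominator.

```r
# Example counts copied from the question
gene_counts <- data.frame(
  gene = c("A", "B", "C", "N", "T", "X", "Y", "Z"),
  Recurrence = c(10, 5, 0, 100, 1, 0, 1, 0),
  T0 = c(50, 10, 4, 150, 0, 1, 0, 1)
)

# Denominators: total counts per condition (assumption carried over from
# the question; patient counts per condition may be more appropriate)
total_rec <- sum(gene_counts$Recurrence)
total_t0 <- sum(gene_counts$T0)

# Hypothetical helper: Fisher's exact test for one gene vs. all other genes
fisher_p <- function(recurrence, t0) {
  tab <- matrix(
    c(recurrence, t0,
      total_rec - recurrence, total_t0 - t0),
    nrow = 2
  )
  fisher.test(tab)$p.value
}

# One p-value per gene, then Benjamini-Hochberg adjustment
gene_counts$p_value <- mapply(fisher_p, gene_counts$Recurrence, gene_counts$T0)
gene_counts$adj_p_value <- p.adjust(gene_counts$p_value, method = "BH")

print(gene_counts)
```

This uses base R only, so no dplyr is needed. Genes with very few counts (such as T, X, Y and Z here) carry little information whichever test is used.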

P.S. Not all genes are necessarily present in all samples.

Thank you!!!

statistics compare_groups gene_count • 562 views

ADD COMMENT • link 11 days ago by Lila M ★ 1.3k

rvtest ? http://zhanxw.github.io/rvtests/

skat ? skat-o ? https://cran.r-project.org/web/packages/SKAT/SKAT.pdf

regenie ? https://rgcgithub.github.io/regenie/

ADD REPLY • link 11 days ago by Pierre Lindenbaum 164k

Those are really good tools; however, I am not sure they are suitable for my approach, as what I have is the counts per gene per condition within a population, not data on specific mutations. Would a permutation test be beneficial in this case?

  gene Recurrence  T0
1    A         10  50
2    B          5  10
3    C          0   4
4    N        100 150
5    T          1   0
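On the permutation question, here is a rough sketch of one related option that works with aggregate counts only. Base R's `chisq.test()` can compute a Monte Carlo p-value (`simulate.p.value = TRUE`) by sampling tables with the same row and column totals, which is similar in spirit to a permutation test. Applied to the whole gene-by-condition table, it gives a single omnibus answer to "does the distribution of mutated genes differ between conditions?" rather than one answer per gene. The object names `counts` and `mc` are made up for illustration, and a true permutation of patient labels would need per-patient data, which is not shown here.

```r
# Counts from the table above, as a gene x condition matrix
counts <- matrix(
  c(10, 5, 0, 100, 1,    # Recurrence
    50, 10, 4, 150, 0),  # T0
  ncol = 2,
  dimnames = list(
    gene = c("A", "B", "C", "N", "T"),
    condition = c("Recurrence", "T0")
  )
)

set.seed(1)  # the simulated p-value depends on the random seed

# Monte Carlo test of independence: tables are resampled with the
# observed row and column totals held fixed (B = number of replicates)
mc <- chisq.test(counts, simulate.p.value = TRUE, B = 10000)

print(mc$p.value)
```

Simulation avoids relying on the chi-square approximation when some cells are small, but it does not replace the per-gene tests if the aim is to identify which genes differ.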

ADD REPLY • link 11 days ago by Lila M ★ 1.3k

As what I have is the counts per gene per condition within a population,

Yes. I'm not a specialist, but I think the tools above don't use the specific variants; they just look at how many people in the case/control populations have at least one rare allele (missense...) in a gene.

ADD REPLY • link 11 days ago by Pierre Lindenbaum 164k

Thank you. The idea is just to test whether the frequency of a mutated gene in a population differs between two time points. I am struggling to find the best model for this comparison... Hopefully someone here has more experience!
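If the two time points come from the same patients (an assumption; the thread does not say so), the samples are paired, and a paired test on per-patient mutation status may fit better than a test on pooled counts. Below is a minimal sketch for a single gene using an exact McNemar-type test, that is, a binomial test on the discordant pairs. The `paired` data frame is entirely made up for illustration.

```r
# Hypothetical per-patient mutation status for ONE gene (made-up data)
paired <- data.frame(
  patient = paste0("P", 1:10),
  T0 = c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE),
  Recurrence = c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE)
)

# Cross-tabulate status at T0 against status at recurrence;
# fixing the factor levels guarantees a full 2 x 2 table
status <- c(FALSE, TRUE)
tab <- table(
  T0 = factor(paired$T0, levels = status),
  Recurrence = factor(paired$Recurrence, levels = status)
)

# Only discordant pairs are informative about a change over time
lost <- tab["TRUE", "FALSE"]    # mutated at T0, not at recurrence
gained <- tab["FALSE", "TRUE"]  # not mutated at T0, mutated at recurrence

# Exact McNemar-type test: under no change, a discordant pair is equally
# likely to go either way (requires at least one discordant pair)
exact <- binom.test(lost, lost + gained, p = 0.5)

print(tab)
print(exact$p.value)
```

Repeating this per gene and adjusting the p-values with `p.adjust(..., method = "BH")` would mirror the multiple-testing step in the question. If the patients at T0 and at recurrence are different people, the unpaired tests discussed above remain the relevant ones.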

ADD REPLY • link 11 days ago by Lila M ★ 1.3k