What is the best way to confirm whether a mutated gene is more frequent in the recurrence group than at baseline?
13 days ago
Lila M ★ 1.3k

Good morning,

I have a data set of mutated genes across different patients in two conditions (recurrence and T0). I want to test whether each gene is mutated more frequently at recurrence or at T0. I am not considering expression, type of mutation, etc. E.g.:

# Example data frame
gene_counts <- data.frame(
  gene = c("A", "B", "C", "N", "T", "X", "Y", "Z"),
  Recurrence = c(10, 5, 0, 100, 1, 0, 1, 0),
  T0 = c(50, 10, 4, 150, 0, 1, 0, 1)
)

My question is how to assess this comparison: should I choose the type of test (Fisher's exact or chi-square) based on the counts, as follows?

# Function to perform the appropriate test based on counts
perform_test <- function(recurrence, t0) {
  contingency_table <- matrix(c(recurrence, t0, sum(gene_counts$Recurrence) - recurrence, sum(gene_counts$T0) - t0), nrow = 2)

  if (any(contingency_table < 5)) {
    # Use Fisher's test if counts are low
    p.value <- fisher.test(contingency_table)$p.value
  } else {
    # Use Chi-square test otherwise
    p.value <- chisq.test(contingency_table)$p.value
  }

  return(p.value)
}

# Apply test for each gene (dplyr provides %>%, rowwise() and mutate())
library(dplyr)
gene_counts <- gene_counts %>%
  rowwise() %>%
  mutate(p_value = perform_test(Recurrence, T0)) %>%
  ungroup()
gene_counts <- gene_counts %>%
  mutate(adj_p_value = p.adjust(p_value, method = "BH"))

Or is there a more straightforward way to do this? E.g., is there a good practice or a standardised statistical test for this kind of comparison, rather than checking every case?

P.S. Not all genes are necessarily present in all samples.

Thank you!!!
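
A minimal sketch of one simpler variant, assuming the counts in the example above: Fisher's exact test is valid for any cell counts, so it can be applied to every gene without switching tests case by case. The helper name `fisher_one_gene` is hypothetical, and the denominators are the summed counts per condition, as in the function above; if the counts are numbers of patients, the cohort sizes may be the more appropriate denominators.

```r
# Example counts copied from the question
gene_counts <- data.frame(
  gene = c("A", "B", "C", "N", "T", "X", "Y", "Z"),
  Recurrence = c(10, 5, 0, 100, 1, 0, 1, 0),
  T0 = c(50, 10, 4, 150, 0, 1, 0, 1)
)

# Denominators: total counts per condition (same choice as in the question)
total_rec <- sum(gene_counts$Recurrence)
total_t0 <- sum(gene_counts$T0)

# Hypothetical helper: 2x2 Fisher's exact test for one gene versus all others
fisher_one_gene <- function(rec, t0) {
  tab <- matrix(
    c(rec, total_rec - rec, t0, total_t0 - t0),
    nrow = 2,
    dimnames = list(c("gene", "other"), c("Recurrence", "T0"))
  )
  ft <- fisher.test(tab)
  # Odds ratio > 1 suggests the gene is relatively more frequent at recurrence
  c(odds_ratio = unname(ft$estimate), p_value = ft$p.value)
}

# One test per gene, then Benjamini-Hochberg correction across genes
res <- t(mapply(fisher_one_gene, gene_counts$Recurrence, gene_counts$T0))
gene_counts$odds_ratio <- res[, "odds_ratio"]
gene_counts$p_value <- res[, "p_value"]
gene_counts$adj_p_value <- p.adjust(gene_counts$p_value, method = "BH")

print(gene_counts)
```

If switching between tests is still preferred, note that the usual rule of thumb for the chi-square test refers to the expected counts (available as `chisq.test(tab)$expected`), not the observed ones.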

statistics compare_groups gene_count • 577 views

Those are really good tools; however, I am not sure they are suitable for my approach, as what I have is the counts per gene per condition within a population, not directly related to specific mutations. Would a permutation test be beneficial in this case?

  gene Recurrence  T0
1    A         10  50
2    B          5  10
3    C          0   4
4    N        100 150
5    T          1   0
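
For illustration, a permutation test on these counts could be sketched as below. This assumes each count is a number of independent mutation events, expands the table into one row per event, and shuffles the condition labels; the function name `permutation_p`, the seed, and the number of permutations are arbitrary choices. With only aggregate counts, shuffling labels this way reproduces the same null distribution that Fisher's exact test uses, so the two p-values should be close; a permutation at the patient level would need per-patient data.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Counts from the table above
gene_counts <- data.frame(
  gene = c("A", "B", "C", "N", "T"),
  Recurrence = c(10, 5, 0, 100, 1),
  T0 = c(50, 10, 4, 150, 0)
)

# Expand to one entry per mutation event: its gene and its condition
genes <- as.character(gene_counts$gene)
events_gene <- rep(c(genes, genes),
                   times = c(gene_counts$Recurrence, gene_counts$T0))
events_cond <- rep(c("Recurrence", "T0"),
                   times = c(sum(gene_counts$Recurrence), sum(gene_counts$T0)))

# Hypothetical helper: two-sided permutation p-value for one gene
permutation_p <- function(target, genes, cond, n_perm = 2000) {
  is_target <- genes == target
  # Statistic: absolute difference in the gene's proportion between conditions
  stat <- function(labels) {
    abs(mean(is_target[labels == "Recurrence"]) - mean(is_target[labels == "T0"]))
  }
  observed <- stat(cond)
  permuted <- replicate(n_perm, stat(sample(cond)))
  # Add-one correction so the estimate is never exactly zero
  (sum(permuted >= observed - 1e-12) + 1) / (n_perm + 1)
}

p_perm <- permutation_p("A", events_gene, events_cond)

# Fisher's exact test on the same 2x2 table, for comparison
p_fisher <- fisher.test(matrix(c(10, 116 - 10, 50, 214 - 50), nrow = 2))$p.value

print(c(permutation = p_perm, fisher = p_fisher))
```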

As what I have is the counts per gene per condition within a population,

Yes. I'm not a specialist, but I think the tools above don't use the specific variants; they just look at how many people in the case/control population have at least one rare allele (missense...) in a gene.
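
A rough sketch of that carrier-counting idea, assuming each count is the number of patients with at least one mutation in the gene and that the two groups are independent. The cohort sizes `n_recurrence` and `n_t0` are hypothetical placeholders, not values from this thread, and `carrier_test` is an illustrative name.

```r
# Hypothetical cohort sizes (NOT from the thread): replace with real patient numbers
n_recurrence <- 120
n_t0 <- 200

# Carriers per gene and condition (counts from the table above)
carriers <- data.frame(
  gene = c("A", "B", "C", "N", "T"),
  Recurrence = c(10, 5, 0, 100, 1),
  T0 = c(50, 10, 4, 150, 0)
)

# Hypothetical helper: carriers vs non-carriers in each group, Fisher's exact test
carrier_test <- function(k_rec, k_t0, n_rec, n_t0) {
  tab <- matrix(
    c(k_rec, n_rec - k_rec, k_t0, n_t0 - k_t0),
    nrow = 2,
    dimnames = list(c("carrier", "non_carrier"), c("Recurrence", "T0"))
  )
  fisher.test(tab)$p.value
}

carriers$p_value <- mapply(
  carrier_test, carriers$Recurrence, carriers$T0,
  MoreArgs = list(n_rec = n_recurrence, n_t0 = n_t0)
)
carriers$adj_p_value <- p.adjust(carriers$p_value, method = "BH")

print(carriers)
```

If the same patients were sampled at both time points, the groups are paired rather than independent, and a paired test such as `mcnemar.test()` on per-patient mutation status would probably be a better fit.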


Thank you. The idea is just to compare whether the frequency of a mutated gene in a population differs between two time points. I am struggling to find the best model for this comparison... Hopefully someone here has more experience!
