Statistical technique for mutation to gene expression link
3.2 years ago
SUMIT ▴ 30

I grouped approximately 1000 individuals into p53 wildtype and p53 mutated samples. I have gene expression data (logFC) for each individual in both the mutated and non-mutated groups. My aim is to identify the genes that are strongly upregulated under p53 mutation, so I want to analyze the link between the mutation and the expression of each gene:

  1. What are the appropriate statistical tests for analyzing such a relationship?

  2. Should I perform a grouped analysis (all p53 wildtype vs. all p53 mutated), or a pairwise analysis (a single mutated case vs. a single non-mutated case) and then average the significance values of the pairs to find the associated genes?

Please note that my mutation data is in binary format (-1: mutation and 0: wildtype) and my gene expression data is logFC. Rows represent genes and columns represent samples.

Any advice or pointers would be greatly appreciated.

Thanks in advance.

expression gene statistics mutation association tumor • 1.2k views

In a similar situation I would do a DE analysis between wild-type and mutant samples. There is a lot to consider, such as how to handle samples with mutations in genes that co-occur with, or are mutually exclusive with, TP53 mutations in this kind of analysis.

Update: 2021-10-13

See (this and this). They used the limma package with a design matrix that accounted for all variables of interest, such as mutations, to assess the effect of mutations on expression profiles.
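To make the design-matrix idea concrete, here is a minimal sketch for a single gene, in plain Python rather than limma. It fits an ordinary least-squares model with an intercept, TP53 status, and one extra mutation as covariates, so the TP53 coefficient is adjusted for the other mutation. The helper `fit_ols`, the `other_mut` covariate, and the toy values are all assumptions for illustration; limma additionally applies empirical Bayes moderation across genes, which this sketch does not do.

```python
def fit_ols(design, y):
    """Least-squares coefficients for y ~ design via the normal equations.

    Assumes the design matrix has full column rank (otherwise a pivot is zero).
    """
    k = len(design[0])
    # Build the augmented system [X^T X | X^T y].
    a = [[sum(row[i] * row[j] for row in design) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(design, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Hypothetical per-sample covariates (1 = mutated, 0 = wildtype).
tp53_mut = [0, 0, 1, 1, 0, 1, 0, 1]
other_mut = [0, 1, 0, 1, 0, 0, 1, 1]  # a co-occurring mutation, name assumed

# Design matrix columns: intercept, TP53 status, other mutation status.
design = [[1.0, float(t), float(o)] for t, o in zip(tp53_mut, other_mut)]

# Toy expression for one gene, built as 1.0 + 2.0*TP53 + 0.5*other.
expression = [1.0 + 2.0 * t + 0.5 * o for t, o in zip(tp53_mut, other_mut)]

intercept, tp53_effect, other_effect = fit_ols(design, expression)
print(f"TP53 effect adjusted for the other mutation: {tp53_effect:.2f}")
```

In a real analysis the same design matrix would be fitted to every gene (every row of the expression table), and the TP53 coefficient and its significance would be read off per gene.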


Can you say which type of data you have?

"I have gene expression data (logFC) of each individual present in both mutated and non-mutated groups"

This is what I do not really understand.

3.2 years ago

Differential expression analysis with Bioconductor tools such as limma, DESeq2, and edgeR would directly answer the question you are asking. DESeq2 and edgeR are the better fit for RNA-seq data, while limma is the usual approach for microarray data (limma's voom method also handles RNA-seq counts).

You do not have paired samples (where paired means arising from the same individual), so you cannot use a paired-sample analysis.
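As a rough illustration of the grouped (unpaired) comparison, the sketch below tests each gene for a difference in mean expression between mutated and wildtype samples, then corrects for multiple testing with Benjamini-Hochberg. It is not a substitute for limma, DESeq2, or edgeR: it uses a plain Welch statistic with a normal approximation, which is an assumption that only holds reasonably for large groups such as the roughly 1000 samples here. The function names, gene names, and toy values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_z_test(mutated, wildtype):
    """Return (mean difference, two-sided p-value) for one gene.

    Uses the Welch statistic with a normal approximation to its null
    distribution (an assumption suited to large groups only).
    """
    diff = mean(mutated) - mean(wildtype)
    se = sqrt(variance(mutated) / len(mutated)
              + variance(wildtype) / len(wildtype))
    z = diff / se
    p = 2.0 * NormalDist().cdf(-abs(z))
    return diff, p


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical toy data: gene -> expression values, one per sample.
expression = {
    "GENE_A": [2.1, 2.4, 1.9, 2.6, 0.2, 0.1, -0.3, 0.4],
    "GENE_B": [0.5, 0.3, 0.6, 0.4, 0.5, 0.2, 0.6, 0.4],
}
# Mutation status per sample, in the same order as the expression values.
is_mutated = [True, True, True, True, False, False, False, False]

results = {}
for gene, values in expression.items():
    mut = [v for v, flag in zip(values, is_mutated) if flag]
    wt = [v for v, flag in zip(values, is_mutated) if not flag]
    results[gene] = welch_z_test(mut, wt)

genes = list(results)
adj = benjamini_hochberg([results[g][1] for g in genes])
for gene, q in zip(genes, adj):
    diff, p = results[gene]
    print(f"{gene}: mean difference={diff:.2f}, p={p:.3g}, BH-adjusted p={q:.3g}")
```

Genes that are upregulated under the mutation would be those with a positive mean difference and a small adjusted p-value. All mutated samples are compared against all wildtype samples at once, so no per-pair averaging is involved.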
