DESeq2 design matrix
2
1
Entering edit mode
5.1 years ago
mickey_95 ▴ 110

Hi,

I am trying to use DESeq2 for differential analysis of ATAC-seq data. I have performed peak calling, obtained a consensus peak data set and, using featureCounts, generated a count matrix with my samples as columns and the peak coordinates as rows. I have two conditions, 4 biological and 2 technical replicates each; so 16 samples in total. I am interested in the differences between my two conditions, so I was using DESeq2::DESeqDataSetFromMatrix with

design = ~condition

Now I am wondering about two things: 1. Should I collapse my technical replicates before running the DE pipeline? 2. Should I also include my biological replicates in the design formula? And also an interaction term (condition:BioRep)?

I am very new to this, so apologies if this is a very basic question. Appreciate input and help!

DESeq2 ATAC-seq DE analysis • 2.7k views
ADD COMMENT
1
Entering edit mode

For collapsing technical replicates in DESeq2 workflow, use collapseReplicates (https://www.rdocumentation.org/packages/DESeq2/versions/1.12.3/topics/collapseReplicates) function.

ADD REPLY
1
Entering edit mode
5.1 years ago
dsull ★ 6.9k

Yes, I would pool technical replicates together since you really are just interested in the biological reproducibility.

Your biological replicates don't need to be in the design formula. You have two conditions and each biological replicate will be assigned to one of the two conditions (e.g. 4 treated and 4 control).

ADD COMMENT
1
Entering edit mode
5.1 years ago
ATpoint 85k
  1. Should I collapse my technical replicates before running the DE pipeline

If technical means sequencing the exact same DNA on different lanes or flow cells then yes, collapse them. You can run plotPCA after normalizing the count matrix with vst to see if they indeed cluster close together (or are more or less identical) to confirm there is nothing odd going on.

  1. Should I also include my biological replicates in the design formula? And also an interaction term (condition:BioRep)? No, the replication itself is listed in the colData that describes your experiment like:
Sample  Condition
foo_rep1    condition1
foo_rep2    condition1
foo_rep3    condition1
bar_rep1    condition2
bar_rep2    condition2
bar_rep3    condition2

The design would be ~Condition and this already contains the replication information. I suggest you go through the DESeq2 (and maybe the edgeR) manual as these have extensive examples on how to make designs plus some useful background infmrmation on differential analysis.

ADD COMMENT
0
Entering edit mode

Thank you for the quick reply! My technical replicates cluster closely together in a PCA plot, so I will collapse them. I ran de DESeq2 pipeline with the design you suggested and am very surprised to see that no differentially accessible regions pop up. This seems very strange to me and I do not understand what I might be doing wrong. I have been through the manuals of both DESeq2 and edgeR and understand the basics of the theoretical concepts behind them.

ADD REPLY
0
Entering edit mode

Well, the reason might be that either there are indeed no DEGs or 2) there is not enough power (=not enough replicates). Can you show the outputs of plotPCA and plotMA? What kind of samples are this, primary or cell line, species, setup?

ADD REPLY
0
Entering edit mode

These are primary cells from mice, control and drug-treated. The samples are all over the place in the PCA plot, there is no clear clustering of the two groups (ctrl and treated). I guess this variability could be the explanation?

ADD REPLY
0
Entering edit mode

Yes, that sounds reasonable towards why there are no DEGs. A look at the MA-plot could help decide if there might be a chance to get DEGs at higher replicate numbers.

ADD REPLY
0
Entering edit mode

I don't have access to the data at the moment, I will try to upload a picture later during the week. But in the meantime: How would I assess whether the experiment is underpowered using the MA plot?

ADD REPLY

Login before adding your answer.

Traffic: 2172 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6