Understanding the ColData Matrix
3.6 years ago

I am a student running a differential gene expression analysis between two stimuli using DESeq2, and I want to understand how the colData matrix interacts with the raw counts data. What is the logic behind it, and what are the criteria for building an accurate colData matrix?

RNA-SEQ R Micheal_Love DESEQ2 • 3.7k views
3.6 years ago

The row names of colData should match the column names of the counts matrix, in the same order, so that each row of colData and the corresponding column of the counts describe the same sample. colData itself is just a data frame, so each of its columns is a field associated with each sample (e.g. age, gender, genotype, treatment_condition, etc.). These fields can then be used in the design formula and for grouping or annotating samples in visualizations.
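
To make the matching concrete, here is a minimal base R sketch that checks whether colData and the counts line up and reorders colData if they do not. The object names (`counts`, `col_data`), the sample names, and all count values are made up for illustration; only the row-name/column-name relationship comes from the answer above.

```r
# Toy counts matrix: genes in rows, samples in columns (values are made up).
counts <- matrix(
  c(10L, 0L, 25L, 12L, 3L, 30L, 8L, 1L, 22L, 15L, 2L, 28L),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("ctrl_1", "ctrl_2", "trt_1", "trt_2"))
)

# Hypothetical colData with the same samples, deliberately in a different order.
col_data <- data.frame(
  condition = factor(c("treated", "control", "treated", "control")),
  row.names = c("trt_1", "ctrl_1", "trt_2", "ctrl_2")
)

# Same set of samples on both sides?
stopifnot(setequal(rownames(col_data), colnames(counts)))

# Reorder the colData rows to follow the count columns; drop = FALSE keeps a
# one-column data frame from collapsing into a plain vector.
col_data <- col_data[colnames(counts), , drop = FALSE]

# Now the names line up position by position.
stopifnot(identical(rownames(col_data), colnames(counts)))
```

Indexing colData by `colnames(counts)` is one simple way to get the order right; it fails loudly (rows of NA) if a sample is missing, which is why the `setequal()` check comes first.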


What if colData has just one column?


If you have 5 control samples and 5 treated samples, that's fine. You just need the sample names (as row names) and a column for treatment.


Like so:

> # colnames() works whether readcounts is a matrix or a data frame
> sample_info <- data.frame(condition = c(rep("SNF2", 5), rep("WT", 5)), row.names = colnames(readcounts))
> sample_info
       condition
SNF2_1      SNF2
SNF2_2      SNF2
SNF2_3      SNF2
SNF2_4      SNF2
SNF2_5      SNF2
WT_1          WT
WT_2          WT
WT_3          WT
WT_4          WT
WT_5          WT
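
As a rough sketch of where such a one-column colData ends up, the snippet below passes it to `DESeqDataSetFromMatrix()` together with a counts matrix. The counts here are simulated placeholders, not real data, and setting WT as the reference level is an assumption made for illustration rather than something stated in this thread. The DESeq2 call is guarded so that the rest runs even without the package installed.

```r
# Made-up counts for illustration only: 4 genes x 10 samples.
sample_names <- c(paste0("SNF2_", 1:5), paste0("WT_", 1:5))
set.seed(1)
readcounts <- matrix(
  rpois(40, lambda = 50), nrow = 4,
  dimnames = list(paste0("gene", 1:4), sample_names)
)

# One-column colData; row names are taken from the count columns so the two
# are aligned by construction. WT as reference level is an assumption.
sample_info <- data.frame(
  condition = factor(c(rep("SNF2", 5), rep("WT", 5)), levels = c("WT", "SNF2")),
  row.names = colnames(readcounts)
)

# Build the DESeq2 object only if the package is available.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(
    countData = readcounts,
    colData   = sample_info,
    design    = ~ condition  # must name a column of colData
  )
  print(SummarizedExperiment::colData(dds))
}
```

The design formula refers to colData columns by name, which is why even a single `condition` column is enough for a simple two-group comparison.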