I am trying to analyze data with three timepoints (0, 6, and 12 hr) and two conditions (treated and control), but every attempt I make to run this through DESeq2 ends in an error.
My latest attempt, which follows this brief tutorial, is below:
counts<-read.csv("E:/Mac_data/macs_counts_11_3_21.csv", header=TRUE)
head(counts)
X Merged_KO_0hr.bam Merged_KO_12hr.bam Merged_KO_6hr.bam Merged_WT_0hr.bam
1 0610005C13Rik 4 3 2 6
2 0610006L08Rik 0 0 0 0
3 0610009B22Rik 1040 509 663 1082
4 0610009E02Rik 14 9 10 8
5 0610009L18Rik 151 62 99 125
6 0610010F05Rik 2524 699 1604 2121
Merged_WT_12hr.bam Merged_WT_6hr.bam
1 1 2
2 0 0
3 540 644
4 15 8
5 45 50
6 683 1318
counts_matrix <- data.matrix(counts)
head(counts_matrix)
X Merged_KO_0hr.bam Merged_KO_12hr.bam Merged_KO_6hr.bam Merged_WT_0hr.bam
[1,] 1 4 3 2 6
[2,] 2 0 0 0 0
[3,] 3 1040 509 663 1082
[4,] 4 14 9 10 8
[5,] 5 151 62 99 125
[6,] 6 2524 699 1604 2121
Merged_WT_12hr.bam Merged_WT_6hr.bam
[1,] 1 2
[2,] 0 0
[3,] 540 644
[4,] 15 8
[5,] 45 50
[6,] 683 1318
#set exp design and coldata
exp_design_file <- file.path("mac_exp_design_11_3.csv")
exp_design <- read.csv(exp_design_file, stringsAsFactors = FALSE)
head(exp_design)
sample Condition Timepoint
1 Merged_WT_0hr.bam treated 0h
2 Merged_WT_6hr.bam treated 6h
3 Merged_WT_12hr.bam treated 12h
4 Merged_KO_0hr.bam control 0h
5 Merged_KO_6hr.bam control 6h
6 Merged_KO_12hr.bam control 12h
head(coldata)
DataFrame with 6 rows and 3 columns
sample Condition Timepoint
<character> <character> <character>
1 Merged_WT_0hr.bam treated 0h
2 Merged_WT_6hr.bam treated 6h
3 Merged_WT_12hr.bam treated 12h
4 Merged_KO_0hr.bam control 0h
5 Merged_KO_6hr.bam control 6h
6 Merged_KO_12hr.bam control 12h
#DESeq2
full_model <- ~ sample + Condition + Timepoint + Condition:Timepoint
reduced_model <- ~ sample + Condition + Timepoint
dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
+ design = ~ sample + Condition +
+ Timepoint + Condition:Timepoint)
Error in DESeqDataSetFromMatrix(countData = counts, colData = coldata, :
ncol(countData) == nrow(colData) is not TRUE
I would appreciate any help. I understand WHAT the error message is saying, but I don't know how to fix it.
Additionally, if there is a better way to handle this kind of data I would also appreciate feedback in that regard. I'm here to learn!
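One possible way to line up the two inputs, sketched under assumptions: `read.csv` kept the gene IDs as a data column named `X`, so `counts` has 7 columns against 6 rows in `coldata`, and `data.matrix()` then turned those IDs into integer codes instead of removing them. The toy data frames below only copy the first rows of the output printed above; in the real script they would come from the two CSV files. The additive design at the end is an assumption for illustration, not the thread's confirmed solution. A per-sample term cannot be estimated when every column is its own sample, and with a single library per Condition/Timepoint combination the interaction model would have as many coefficients as samples.

```r
# Toy stand-ins copied from the head() output above (assumption: the real
# objects come from read.csv on the two files).
counts <- data.frame(
  X = c("0610005C13Rik", "0610006L08Rik", "0610009B22Rik"),
  Merged_KO_0hr.bam  = c(4, 0, 1040),
  Merged_KO_12hr.bam = c(3, 0, 509),
  Merged_KO_6hr.bam  = c(2, 0, 663),
  Merged_WT_0hr.bam  = c(6, 0, 1082),
  Merged_WT_12hr.bam = c(1, 0, 540),
  Merged_WT_6hr.bam  = c(2, 0, 644)
)
exp_design <- data.frame(
  sample = c("Merged_WT_0hr.bam", "Merged_WT_6hr.bam", "Merged_WT_12hr.bam",
             "Merged_KO_0hr.bam", "Merged_KO_6hr.bam", "Merged_KO_12hr.bam"),
  Condition = c("treated", "treated", "treated",
                "control", "control", "control"),
  Timepoint = c("0h", "6h", "12h", "0h", "6h", "12h"),
  stringsAsFactors = FALSE
)

# Move the gene IDs into row names so only the 6 sample columns remain.
counts_matrix <- as.matrix(counts[, -1])
rownames(counts_matrix) <- counts$X
storage.mode(counts_matrix) <- "integer"

# Put the count columns in the same order as the rows of the sample sheet.
counts_matrix <- counts_matrix[, exp_design$sample]
rownames(exp_design) <- exp_design$sample

# Explicit factor levels, so 0h and "control" act as reference levels.
exp_design$Condition <- factor(exp_design$Condition,
                               levels = c("control", "treated"))
exp_design$Timepoint <- factor(exp_design$Timepoint,
                               levels = c("0h", "6h", "12h"))

# The check that failed in the original call should now pass.
stopifnot(ncol(counts_matrix) == nrow(exp_design))
stopifnot(all(colnames(counts_matrix) == rownames(exp_design)))

# Only attempted if DESeq2 is installed; the additive design is an assumption.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts_matrix,
                                        colData = exp_design,
                                        design = ~ Condition + Timepoint)
}
```

Passing `row.names = 1` to `read.csv` would be another way to keep the gene IDs out of the data columns from the start.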
Fair enough. Do you think this is a good place to start?
http://master.bioconductor.org/packages/release/workflows/html/rnaseqGene.html
Sure. That one can be a bit overwhelming at first. Just note that it covers many different ways to import data, and you'll only need one of those methods at a time.
I figured it out :) Thanks for reminding me to slow down and start with the basics instead of jumping right in with my own data.