'Model matrix not full rank'
12 weeks ago
Rish • 0

Dear community,

I am new to R and transcriptomics, so please be patient with me. I am having a problem creating my DESeq2 object: it returns an error about modelMatrix. My metadata is also attached (with FT1 as the control). Can someone help me with this?

Thank you in advance.

[Two attached images, the error message and the sample metadata, are not shown.]

deseq2 design matrix • 675 views

Have you read the section of the vignette titled 'Model matrix not full rank'?

Can you share the code that you are trying to run and provide some more information on what you are doing?


Hi, I am currently reading it.

Here's my code.

library(DESeq2)

countdata <- read.csv("count.csv",
                      header = TRUE,
                      row.names = 1)
head(countdata)

expdesign <- read.csv("design_2.csv",
                      header = TRUE,
                      row.names = 1)
head(expdesign)

all(colnames(countdata) %in% rownames(expdesign))
all(colnames(countdata) == rownames(expdesign))

expdesign$sample_type <- factor(expdesign$sample_type)
expdesign$dpi <- factor(expdesign$dpi)

dds <- DESeqDataSetFromMatrix(countData = countdata,
                              colData = expdesign,
                              design = ~sample_type + dpi + sample_type:dpi)
12 weeks ago

You've made dpi a factor. That will almost certainly produce this error, because group C has only '0' as a dpi value and no other group has that value, so the effect of group C cannot be separated from the effect of dpi 0.

See if it works when you keep dpi as a number instead of a factor. That's probably what you want anyway; I don't think you want dpi to be four named groups.
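One way to test this before calling DESeq2 is to build the model matrix with base R and compare its rank with its number of columns. This is only a sketch: the sample table below is invented (group names, dpi values and replicate counts are assumptions, not the poster's metadata), and `rank_report` is a hypothetical helper, not a DESeq2 function.

```r
# Hypothetical sample table (an assumption, not the poster's metadata):
# control group C only at dpi 0; groups R and S at dpi 1, 3 and 7.
coldata <- data.frame(
  sample_type = factor(c(rep("C", 3), rep("R", 6), rep("S", 6))),
  dpi = c(rep(0, 3), rep(c(1, 3, 7), each = 2, times = 2))
)

# Compare the number of model-matrix columns with the matrix rank.
rank_report <- function(design, data) {
  mm <- model.matrix(design, data = data)
  c(columns = ncol(mm), rank = qr(mm)$rank)
}

# Same table, but with dpi converted to a factor as in the original code.
coldata_factor <- transform(coldata, dpi = factor(dpi))

factor_full  <- rank_report(~ sample_type + dpi + sample_type:dpi, coldata_factor)
numeric_full <- rank_report(~ sample_type + dpi + sample_type:dpi, coldata)
numeric_add  <- rank_report(~ sample_type + dpi, coldata)

print(rbind(factor_full, numeric_full, numeric_add))
```

A design is full rank when the two numbers match. In this invented layout the factor design with the interaction is rank deficient and the additive design with numeric dpi is full rank. The numeric design with the interaction is also deficient here, because group C sits at dpi 0 only, so it is worth running the same check on the real metadata.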

12 weeks ago

The terms "model matrix" and "rank" come from linear algebra. In simple terms, a model matrix that does not have full rank contains redundant columns: at least one column can be written as a linear combination of the others. That column carries no unique information, so the effects it represents cannot be separated from one another. For example, if the column for one variable is always an exact multiple or sum of the columns for other variables, that variable is a linear combination of the others, and it becomes impossible to tell their effects apart or to calculate differential expression involving them. To resolve this, you could remove the offending samples from the data or drop the redundant term from the design. Also note that newer versions of DESeq2 require at least two replicates for dispersion estimation and will not run without them.
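The idea of dependent columns can be made concrete with base R. The sketch below again uses an invented sample table (group names, dpi values and replicate counts are assumptions, not the poster's metadata) and shows that, with dpi as a factor, the main-effect column for group R is exactly the sum of its interaction columns.

```r
# Hypothetical layout: control group C sampled only at dpi 0,
# groups R and S sampled at dpi 1, 3 and 7 (two replicates each).
coldata <- data.frame(
  sample_type = factor(c(rep("C", 3), rep("R", 6), rep("S", 6))),
  dpi = factor(c(rep(0, 3), rep(c(1, 3, 7), each = 2, times = 2)))
)

mm <- model.matrix(~ sample_type + dpi + sample_type:dpi, data = coldata)

# Cross-tabulate to see which sample_type x dpi cells have no samples.
cells <- table(coldata$sample_type, coldata$dpi)
print(cells)

# Every R sample falls in exactly one R:dpi cell, so the main-effect
# column for R equals the sum of the R interaction columns.
r_inter <- grep("^sample_typeR:", colnames(mm))
r_dependent <- all(mm[, "sample_typeR"] == rowSums(mm[, r_inter]))
print(r_dependent)
```

The empty cells in the cross-tabulation are the usual warning sign: when some group and time-point combinations have no samples, a design with main effects plus an interaction asks for more coefficients than the data can support.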
