Question

Tale as old as time: lmFit dimension issues

1

4.1 years ago

leonmcswain ▴ 10

I have seen some other posts on here, but after going through them I'm not finding a solution. I am using limma to do DE analysis on 763 patient microarrays, 4 groups total. My expression object is a matrix with genes as rownames and patient IDs as colnames.

When I try to run the code I get the error:

> fit <- lmFit(as.numeric(Merge_DF_Avg2), design)
Error in lmFit(as.numeric(Merge_DF_Avg2), design) : 
  row dimension of design doesn't match column dimension of data object

The dimensions seem correct:

> dim(design)
[1] 763   4

> dim(Merge_DF_Avg2)
[1] 20341   763

> class(Merge_DF_Avg2)
[1] "matrix" "array"

Here is the code:

# Data wrangling for limma
Merge_DF_Avg <- readRDS("C:/Users/12298/Desktop/Data_Analytics/Taylor_2017/Finalized_Expression_Array_avgs.rds")
Merge_DF_Avg2 <- Merge_DF_Avg %>%
  na.omit() %>%
  pivot_wider(values_from = avg, names_from = Gene_Name) %>%
  t() %>%
  janitor::row_to_names(row_number = 1)
Patient_Cat <- as.vector(as.numeric(Merge_DF_Avg2[1, ]))
Merge_DF_Avg2 <- Merge_DF_Avg2[-c(1, 2), ]  # Taking out patient cat and unidentified gene rows

# limma design
design <- model.matrix(~ 0 + factor(Patient_Cat))
colnames(design) <- c("SHH", "Group3", "Group4", "WNT")
fit <- lmFit(as.numeric(Merge_DF_Avg2), design)

I am using the example code provided by the limma package pdf to guide me.

R • 2.2k views

ADD COMMENT • link updated 4.1 years ago by dariober 15k • written 4.1 years ago by leonmcswain ▴ 10

1

In general, please make it a habit in the future to provide the data using dput (at least a small chunk of it) to allow reproduction of the problem.

ADD REPLY • link 4.1 years ago by ATpoint 85k

0

I already have an answer to the post, but I would like to know more about this. When I post, should I just copy and paste the output from dput? For this matrix it's a bit chaotic since there are 763 columns. The output gets cut off in the console because it's so long.

ADD REPLY • link 4.1 years ago by leonmcswain ▴ 10

1

A small subset of the data is a good idea so that others can actually run the code you provide. Yes, just copy and paste the dput output; anyone can then easily paste it back into R.
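As a rough illustration of the dput suggestion above, here is a minimal sketch using a made-up toy matrix. The object names `expr`, `small` and `rebuilt` and all the values are assumptions for the example, not the poster's data.

```r
# Toy stand-in for a large expression matrix (values are invented for illustration).
expr <- matrix(seq(0.5, by = 0.5, length.out = 20 * 8), nrow = 20,
               dimnames = list(paste0("gene", 1:20), paste0("patient", 1:8)))

# Take a small corner so the dput() output is short enough to paste into a post.
small <- expr[1:3, 1:4]
dput(small)

# A reader rebuilds the object by pasting that text back into R;
# here the round trip is simulated with deparse() and parse().
rebuilt <- eval(parse(text = paste(deparse(small), collapse = "\n")))
identical(rebuilt, small)
```

Subsetting before calling dput keeps the output from being cut off in the console, while still preserving the class, dimensions and dimnames that matter for reproducing the problem.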

ADD REPLY • link 4.1 years ago by ATpoint 85k

3

4.1 years ago

Gordon Smyth ★ 7.7k

I'm rewriting my answer from yesterday because I've had a closer look at your data wrangling code. Initially, I was unclear why you would not just run lmFit in the usual way with:

fit <- lmFit(Merge_DF_Avg2, design)

On closer look, I see now that your Merge_DF_Avg2 object is almost certainly a data.frame where every column is a character vector and that presumably is why you're running as.numeric. My original answer was to point out that as.numeric returns a dimensionless vector, which obviously causes the dimension error.
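To make the point about as.numeric concrete, here is a small sketch on a toy character matrix. The names `m` and `v` and the values are made up for illustration; the real object has 20341 rows and 763 columns.

```r
# Toy character matrix standing in for Merge_DF_Avg2 (invented values).
m <- matrix(c("1.5", "2.0", "3.5", "4.0", "5.5", "6.0"), nrow = 2,
            dimnames = list(c("geneA", "geneB"), c("p1", "p2", "p3")))

dim(m)            # 2 3

# as.numeric() converts the values but strips all attributes, including dim.
v <- as.numeric(m)
dim(v)            # NULL
length(v)         # 6
```

Once the dimensions are gone, there is no column structure left for lmFit to match against the rows of the design matrix, which is consistent with the error in the question.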

If it was me, I would revisit your earlier code by which the expression data was stored and wrangled so that conversion to character didn't occur in the first place. That should be easy to avoid, for example by not making the column names the first row of the data.frame. Character strings are a really poor way to store numeric expression values. But it's up to you. The code from dariober will work, but to me it's repairing a problem that shouldn't have been introduced in the first place.
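One possible way to keep the values numeric throughout is sketched below in base R. It assumes the data starts in long format with one row per gene and patient. The column names `Patient_ID` and `Patient_Cat` are hypothetical (only `avg` and `Gene_Name` appear in the question), and the values are invented.

```r
# Hypothetical long-format data: one row per (patient, gene) pair.
long <- data.frame(
  Patient_ID  = rep(c("p1", "p2", "p3", "p4"), each = 3),
  Patient_Cat = rep(c(1, 2, 3, 4), each = 3),
  Gene_Name   = rep(c("geneA", "geneB", "geneC"), times = 4),
  avg         = c(5.1, 7.2, 3.3, 5.4, 7.5, 3.6, 5.7, 7.8, 3.9, 6.0, 8.1, 4.2)
)

# Genes in rows, patients in columns. The names go into dimnames,
# never into the data itself, so the matrix stays numeric.
expr <- tapply(long$avg, list(long$Gene_Name, long$Patient_ID), mean)

# Keep the sample annotation in a separate table, ordered like the columns of expr.
pheno <- unique(long[, c("Patient_ID", "Patient_Cat")])
pheno <- pheno[match(colnames(expr), pheno$Patient_ID), ]

is.numeric(expr)
dim(expr)
```

With this layout the group labels come from `pheno$Patient_Cat` rather than from a row of the expression matrix, so mixing labels and numbers never forces a conversion to character.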

ADD COMMENT • link 4.1 years ago by Gordon Smyth ★ 7.7k

1

I think the as.vector bit works as it should. The problem should be with as.numeric(Merge_DF_Avg2)

ADD REPLY • link 4.1 years ago by dariober 15k

0

dariober was correct about this.

ADD REPLY • link 4.1 years ago by leonmcswain ▴ 10

1

OK, my original answer was confusing because I wrote as.vector where I meant to write as.numeric. It was as.numeric(Merge_DF_Avg2) that I was referring to.

ADD REPLY • link 4.1 years ago by Gordon Smyth ★ 7.7k

0

I am a cancer biologist by training, so I still have a bit of work to do on R basics (so that I don't introduce these issues to begin with).

ADD REPLY • link 4.1 years ago by leonmcswain ▴ 10

score 2 · Accepted Answer · 2020-10-13

2

4.1 years ago

dariober 15k

I think as.numeric(Merge_DF_Avg2) in lmFit converts a character matrix to a vector. To convert the numeric vector back to a matrix you could do (check it's ok!):

matrix(as.numeric(as.matrix(Merge_DF_Avg2)), ncol= ncol(Merge_DF_Avg2))
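A small sketch of what "check it's ok" might involve, on a toy character matrix with invented names and values. Note that matrix() builds a fresh object, so the gene and patient names are not carried over unless they are restored. Changing the storage mode is shown as one alternative that keeps them.

```r
# Toy character matrix standing in for Merge_DF_Avg2 (invented values).
m <- matrix(c("1.5", "2.0", "3.5", "4.0", "5.5", "6.0"), nrow = 2,
            dimnames = list(c("geneA", "geneB"), c("p1", "p2", "p3")))

# Conversion as in the answer: shape and values are right, dimnames are dropped.
num1 <- matrix(as.numeric(as.matrix(m)), ncol = ncol(m))
is.null(dimnames(num1))
dimnames(num1) <- dimnames(m)   # put gene and patient names back

# Alternative: change the storage mode in place, which keeps dim and dimnames.
num2 <- m
storage.mode(num2) <- "double"

# Checks: both routes agree, and no value failed to parse as a number.
identical(num1, num2)
anyNA(num2)
```

Strings that are not valid numbers become NA (with a warning) under either route, so an anyNA check after conversion is a cheap safeguard before passing the matrix to lmFit.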

ADD COMMENT • link 4.1 years ago by dariober 15k

0

This worked perfectly! Thank you! You were also correct about as.vector: it doesn't matter whether I use it, because that vector goes into the design matrix, not directly into lmFit. -Leon

ADD REPLY • link 4.1 years ago by leonmcswain ▴ 10