Hi,
When I use the DESeqDataSetFromMatrix function from the DESeq2 package in the following way:
DESeqDataSetFromMatrix(df, colData = metadata, design = ~ location + time + age + sex + group)
I get the following error:
Error in checkFullRank(modelMatrix): the model matrix is not full rank, so the model cannot be fit as specified.
One or more variables or interaction terms in the design formula are linear
combinations of the others and must be removed.
My metadata object looks like this:
sample, group (cancer or control), location (of sampling), time (of sampling), age, sex
Sample 1,0,first_location,<12h,4,M
Sample 2,0,first_location,<12h,7,M
Sample 3,0,first_location,<12h,6,F
Sample 4,0,first_location,<12h,6,M
Sample 5,0,first_location,<12h,2,M
Sample 6,0,first_location,<12h,5,M
Sample 7,0,first_location,<12h,2,M
Sample 7,1,second_location,>24h,2,M
Sample 8,1,second_location,>24h,2,M
Sample 9,1,second_location,>24h,4,F
Sample 10,1,second_location,>24h,2,M
Sample 11,1,second_location,>24h,3,M
Sample 12,1,second_location,<12h,2,F
Sample 13,1,second_location,<12h,5,F
I converted the ages into factors by decade: for example, a 70-year-old patient gets factor level 7, a 60-year-old gets level 6, and so on. I also encoded the groups as a factor: cancer is 0 and control is 1. All other columns are factors as well.
I think the problem is that some samples have identical values across the variables in their rows, such as samples 5 and 7. I have seen multiple posts similar to mine, but I still don't understand how to solve this problem.
Can anyone help?
Hi, thanks for responding. I actually have more samples (88 cancer samples and 88 control samples, from different locations, with different ages, etc.). Do you have any idea how I should proceed? From what I understand from the vignette, I have to make a balanced design. But is it allowed for samples to share some column values, as long as not all of their column values are the same? For example, sample 1 and sample 2 are from patients with different ages but otherwise have the same column values. Is this allowed?
No, I don't think that should be a problem, but you may want to check that the design matrix doesn't contain any columns that are entirely zero.
For example:
design_matrix <- model.matrix(~ location + time + age + sex + group, metadata)
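To make that check concrete, here is a rough sketch in base R. It rebuilds only the 14 rows posted in the question (the full table has more samples, so the result may differ there), looks for all-zero columns, and compares the matrix rank with its number of columns. The cross-tabulation at the end is an extra check I am adding, not something from the vignette: in the rows shown, every group 0 sample comes from first_location and every group 1 sample from second_location, which on its own would make the design rank deficient. The names `all_zero`, `full_rank` and `confound_table` are just illustrative.

```r
# Only the 14 rows shown in the question; the real metadata has more samples.
metadata <- data.frame(
  group    = factor(c(rep(0, 7), rep(1, 7))),
  location = factor(c(rep("first_location", 7), rep("second_location", 7))),
  time     = factor(c(rep("<12h", 7), rep(">24h", 5), rep("<12h", 2))),
  age      = factor(c(4, 7, 6, 6, 2, 5, 2, 2, 2, 4, 2, 3, 2, 5)),
  sex      = factor(c("M", "M", "F", "M", "M", "M", "M",
                      "M", "M", "F", "M", "M", "F", "F"))
)

design_matrix <- model.matrix(~ location + time + age + sex + group, metadata)

# Names of any columns that are zero for every sample
all_zero <- colnames(design_matrix)[colSums(design_matrix != 0) == 0]

# The design is full rank only if the rank equals the number of columns
full_rank <- qr(design_matrix)$rank == ncol(design_matrix)

# Cross-tabulate two factors to spot perfect confounding:
# empty cells on the off-diagonal mean one variable determines the other
confound_table <- table(metadata$group, metadata$location)

print(all_zero)
print(full_rank)
print(confound_table)
```

If the same table on the full 176 samples still has empty cells, group and location cannot both be in the design, and one of them would have to be dropped or the comparison restricted to locations that contain both groups.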