Hey,
I want to analyze Differential Transcript Usage (DTU) in TCGA isoform/transcript expression data, using GTEx as the "normal" reference. In other words, TCGA "tumor" samples will be compared against GTEx "normal" samples.
In order to remove variation caused by batch effects, I have to perform some batch correction. Here is a subset of my design matrix:
sample  condition  batch
s1      tumor      TCGA
s2      tumor      TCGA
s3      tumor      TCGA
s4      normal     GTEx
s5      normal     GTEx
s6      normal     GTEx
I get the following error when I try to run DEXSeq: "The supplied design matrix will result in a model matrix that is not full rank"
I know the error is caused by redundancy in my design matrix: every tumor sample comes from TCGA and every normal sample comes from GTEx, so condition and batch are perfectly confounded. How can I tackle this? Is there any way to modify my design matrix to avoid the error? Can I use TCGA and GTEx data without batch correction?
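To make the redundancy concrete, here is a minimal sketch in plain Python (not DEXSeq itself, and not DEXSeq's actual model matrix). It assumes a simplified additive design with an intercept, a tumor indicator, and a TCGA indicator; the helper names `model_matrix` and `matrix_rank` are hypothetical and only for illustration. With the six samples above, the two indicator columns are identical, so the matrix has three columns but only rank two.

```python
from fractions import Fraction

# Subset of the design from the post: (sample, condition, batch).
samples = [
    ("s1", "tumor", "TCGA"),
    ("s2", "tumor", "TCGA"),
    ("s3", "tumor", "TCGA"),
    ("s4", "normal", "GTEx"),
    ("s5", "normal", "GTEx"),
    ("s6", "normal", "GTEx"),
]


def model_matrix(rows):
    """Simplified design: intercept, condition == tumor, batch == TCGA."""
    return [
        [1, int(condition == "tumor"), int(batch == "TCGA")]
        for _, condition, batch in rows
    ]


def matrix_rank(matrix):
    """Rank via Gaussian elimination using exact fractions."""
    m = [[Fraction(value) for value in row] for row in matrix]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current pivot position with a nonzero entry.
        pivot = next((r for r in range(rank, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue  # this column depends on the earlier ones
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(n_rows):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


X = model_matrix(samples)
print(f"columns: {len(X[0])}, rank: {matrix_rank(X)}")  # columns: 3, rank: 2
```

Because the rank is lower than the number of columns, the condition effect and the batch effect cannot be estimated separately from these samples alone.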
Your help will be much appreciated.
Why not just use the TCGA normal sample from the same individual?
Firstly, those samples come from tissue "adjacent" to the tumor (taken from cancer patients), so they are not truly "normal". Secondly, I want to see the differences across three levels: normal adjacent tissue (NAT), normal, and tumor.