Background: I am mining some data using the Limma package to look at differentially expressed genes.
I have imported the expression data such that values of 0 are NA.
y <- read_excel("exprs.xlsx", na="0")
My phenotypic data looks like this:
> pData
tissue donor
AV3 AV 3
AV4 AV 4
AV8 AV 8
MV3 MV 3
MV4 MV 4
MV8 MV 8
PV3 PV 3
PV4 PV 4
PV8 PV 8
TV3 TV 3
TV4 TV 4
TV8 TV 8
So 3 donors (3,4,8), 4 tissue types (AV,MV,PV,TV).
group <- factor(pData$tissue)
design <- model.matrix(~0+group)
contrast.matrix <- makeContrasts(
mv_vs_tv = MV-TV,levels=design)
fit <- lmFit(y, design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2, trend=T)
This will give me differential expressed genes between MV and TV.
I would now like to pull out the genes that are expressed exclusively in MV and not TV, AV or PV. By this I mean there is 0 or "NA" in all groups except MV, in any donor.
For example if donor 3 shows expression of "geneX" in MV but donor 4 and 8 do not, and all donors have NA expression in AV, PV or TV, I would like this to be reported. As well as if 2 or 3 donors show geneX expression in MV but not any other tissues.
Can anyone suggest some code that would do this, within or outside f Limma? I tried a few code chunks but am struggling to get it to work.
Many thanks.
I understand, I guess this is more of a coding question. I am not sure what is wrong with the data, but it seems to be working fine in this case. This is the only format I can get the data in for the moment. I will look into getting the data further upstream so I can QC it.