Hi,
I'm working with a data set in which many values are missing because of quality issues, so many of the transcript FPKM values are scored as 0. This appears to confound the significance matrix, and I end up with thousands of genes marked as significant at alpha=0.05.
What I would like to do is filter the cuffset to exclude genes (rows) whose FPKM is 0 across all samples, or 0 in the query sample.
My current approach is a roundabout way of generating a filtered cuffgeneset, but the sigMatrix() function is only implemented for cuffset objects, so I cannot generate the matrix from the cuffgeneset.
My strategy is as follows:
> library(cummeRbund)
> cuff<-readCufflinks()
#get gene matrix for all
> gene.matrix<-fpkmMatrix(genes(cuff))
#flag any row where columns 1-5 are all 0, or where either query sample (columns 7 and 11) is 0
> test <- apply(gene.matrix, 1, function(x) all(x[1:5]==0) | x[7] == 0 | x[11] == 0)
#keep only the rows that were not flagged
> test1 <- gene.matrix[!test,]
#get significantly regulated genes
> mySigGeneIds<-getSig(cuff,alpha=0.05,level='genes')
#get common list of gene names that are significant and where value of query is not 0
> test4 <- intersect(mySigGeneIds, rownames(test1))
#build a gene set of those that are significantly regulated and for which we have a value for query
> myGenesSig<-getGenes(cuff,test4)
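To make the filtering and intersection steps above concrete, here is a minimal sketch in base R on a toy matrix. The gene IDs, sample names, and values are made up for illustration, and `sig.ids` only stands in for the getSig() output; the sketch does not touch the cuffset itself.

```r
# Toy FPKM matrix; gene IDs, sample names and values are hypothetical.
fpkm <- matrix(
  c(0, 0, 0,
    5, 0, 2,
    3, 4, 0,
    1, 2, 3),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD"),
                  c("ctrl1", "ctrl2", "query"))
)

# Hypothetical name(s) of the query sample column(s)
query.cols <- "query"

# Flag rows that are 0 in every sample, or 0 in any query sample
all.zero   <- rowSums(fpkm != 0) == 0
query.zero <- rowSums(fpkm[, query.cols, drop = FALSE] == 0) > 0

# Keep only the rows that were not flagged
kept <- fpkm[!(all.zero | query.zero), , drop = FALSE]

# Hypothetical significant gene IDs, standing in for the getSig() result
sig.ids <- c("geneA", "geneB", "geneC")

# Significant genes that also survive the zero filter
sig.kept <- intersect(sig.ids, rownames(kept))
```

Selecting the query columns by name rather than by position means the filter does not depend on the column order of the matrix.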
I'm wondering if you know of a way to apply a filter to the cuffset object to get a subset cuffset instead. Or is there a way to generate a sigmatrix from a cuffgeneset?
Thanks for your time,
-David
Was this ever answered? I would also like to know how to do this. Thank you.