I am analyzing RNA-Seq data in a time course dependent manner (5-time points). I set up three replicates for each time point, so total samples are 15. I want to apply this data to fuzzy soft clustering analysis to figure out the cluster genes change according to the time points and explore the enriched GO to each cluster. For doing that, I want to filter the data to focus the analysis on the highest expressed genes. I have removed the genes with zero counts as well as with low expression level.
to remove the genes with zero counts
raw_counts_wn1 <- raw_counts1[rowSums(raw_counts1) > 0, ]
to remove the low expressed genes
library(edgeR) library(HTSFilter) dge <- DGEList(raw_counts_wn1) dge <- calcNormFactors(dge, method = "TMM") dge <- estimateCommonDisp(dge) dge <- estimateTagwiseDisp(dge) filtData1 <- HTSFilter(dge)$filteredData
Before applying the filtered data to soft clustering analysis, I would like to filter the counts more to focus on the genes have ≥ 2 fold change and FDR (0.001). I don't know the right command to accomplish this target on Rstudio. I will be thankful for anyone offer me good suggestions to perform this filteration.
Hi, At first, I'd like to thank you so much for your suggestion. As far as I know, exactTest () need only two groups for making the comparison, but I still know in RNA-Seq analysis and R language handling. I have tried to perform the test for fold change as following exactResults <- exactTest(filtData1) but I got this error message Error in exactTest(filtData1) : At least one element of given pair is not a group. Groups are: 1 In this case, what is the problem? I may miss considering your comment regarding making the comparison to t=0 but I did not exactly know how to add it in exactTest() command.