Hi,
I have a data sheet with gene expression for 11 samples. For each sample, the expression of each gene is reported for both tumor and normal tissue (22 columns in total per gene). For each gene, I calculated the fold change of tumor vs. normal within each sample. My question is: how can I select genes that are significantly overexpressed or underexpressed across all samples?
One simple approach is to calculate the mean and variance of the fold changes across all samples, then select the genes that have a large mean and a small variance. But I think there should be better approaches, perhaps based on p-values.
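To make the mean-and-variance idea concrete, here is a minimal sketch using only the Python standard library. It works on log2 fold changes and also computes a one-sample t statistic, which combines "large mean, small variance" into a single number. The gene names, the values, and the helper `summarize_gene` are all made up for illustration; they are not from the actual data.

```python
import math
import statistics

# Hypothetical paired expression values (4 patients per gene, made-up numbers).
tumor = {
    "GENE_UP": [40.0, 90.0, 16.0, 70.0],
    "GENE_MIXED": [40.0, 5.0, 16.0, 10.0],
}
normal = {
    "GENE_UP": [10.0, 20.0, 4.0, 20.0],
    "GENE_MIXED": [10.0, 20.0, 4.0, 40.0],
}

def summarize_gene(tumor_values, normal_values):
    """Return (mean, sample variance, t statistic) of per-patient log2 fold changes."""
    log_fcs = [math.log2(t / n) for t, n in zip(tumor_values, normal_values)]
    mean_lfc = statistics.mean(log_fcs)
    var_lfc = statistics.variance(log_fcs)  # sample variance (n - 1 denominator)
    se = math.sqrt(var_lfc / len(log_fcs))  # standard error of the mean
    # The t statistic is undefined when all log fold changes are identical.
    t_stat = mean_lfc / se if se > 0 else float("nan")
    return mean_lfc, var_lfc, t_stat

results = {gene: summarize_gene(tumor[gene], normal[gene]) for gene in tumor}

for gene, (mean_lfc, var_lfc, t_stat) in results.items():
    print(f"{gene}: mean log2FC={mean_lfc:.2f}, variance={var_lfc:.2f}, t={t_stat:.2f}")
```

A p-value would come from comparing the t statistic to a t distribution with n - 1 degrees of freedom (this is equivalent to a paired t-test on the log values), followed by a multiple-testing correction across genes. This sketch assumes strictly positive, already normalized expression values; it does not model count data.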
Once again, my question is: how can I scientifically report a gene as differentially expressed in the tumor samples vs. the normal samples?
If this is RNA-seq, why aren't you using common tools such as edgeR or DESeq2?
Yes, I have bulk RNA-seq data. I started using edgeR and it worked, thank you.
So for a given normal tissue, you have no idea what the natural variance is?
Is this RNA-seq? If so, I would use DESeq2 or edgeR and then apply a cutoff such as log2FC > 1 and adjusted p-value < 0.05 to select upregulated genes, or log2FC < -1 and adjusted p-value < 0.05 to select downregulated genes.
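The cutoff described above could be applied to an exported results table roughly like this. This is only a sketch: the field names `gene`, `log2fc`, and `padj` and all the numbers are hypothetical, and the real column names depend on the tool and on how the table was exported. Treating a missing adjusted p-value as not significant is also an assumption.

```python
LFC_CUTOFF = 1.0    # absolute log2 fold change threshold
PADJ_CUTOFF = 0.05  # adjusted p-value threshold

# Hypothetical rows from a differential expression results table (made-up values).
rows = [
    {"gene": "GENE_A", "log2fc": 2.3, "padj": 0.001},
    {"gene": "GENE_B", "log2fc": -1.8, "padj": 0.02},
    {"gene": "GENE_C", "log2fc": 3.0, "padj": 0.40},   # large change, not significant
    {"gene": "GENE_D", "log2fc": 0.4, "padj": 0.001},  # significant, small change
    {"gene": "GENE_E", "log2fc": 1.5, "padj": None},   # no adjusted p-value reported
]

def classify(row):
    """Label a row 'up', 'down', or 'not significant' using both cutoffs."""
    padj = row["padj"]
    # Assumption: a missing adjusted p-value is treated as not significant.
    if padj is None or padj >= PADJ_CUTOFF:
        return "not significant"
    if row["log2fc"] > LFC_CUTOFF:
        return "up"
    if row["log2fc"] < -LFC_CUTOFF:
        return "down"
    return "not significant"

up = [r["gene"] for r in rows if classify(r) == "up"]
down = [r["gene"] for r in rows if classify(r) == "down"]

print("Upregulated:", up)
print("Downregulated:", down)
```

Both conditions have to hold at once: a gene with a large fold change but a high adjusted p-value, or a tiny but significant change, is left out.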
It's not clear to me that the poster's experimental design is suitable for DESeq2 or edgeR.