I have RNA-Seq data for 116 tumor and 67 normal samples, with approx. 56k genes. Differential analysis between tumor and normal can be done with edgeR/DESeq2.
Among those 56k genes, I'm particularly interested in the expression of 30 genes between tumor and normal samples.
So, do I need to subset the counts data to those 30 genes and then do the differential analysis?
(OR)
Should the differential analysis be done for all 56k genes, and then I check whether the 30 genes of interest are differentially expressed?
I tried it both ways. When I did the differential analysis with only those 30 genes, some of the genes I'm interested in came out as differentially expressed. But when I did the differential analysis with all 56k genes, I didn't find those genes among the differentially expressed ones.
What is the right way to do this if I want to look at the expression of some specific genes?
And what if I want to look at the expression of a single gene between tumor and normal: do I still need to go with edgeR/DESeq2, or is a t-test better?
Is it the same when I want to check the expression of a single gene between tumor and normal? Would you recommend a t-test in that case?
Yes, it is the same. You should always run the analysis on all genes and then check the genes of interest.
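To illustrate one reason the two runs above disagree, here is a minimal sketch of Benjamini-Hochberg adjustment in plain Python. The p-values are made up for illustration and are not from the data in this thread. It only shows the multiple-testing side: the same raw p-value gets a very different adjusted p-value depending on how many genes were tested. In a real edgeR/DESeq2 run the raw p-values would also shift, because normalization factors and dispersion estimates are computed from whichever genes are supplied.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: one gene of interest with raw p = 0.001,
# every other tested gene with an unremarkable p = 0.5.
subset_run = bh_adjust([0.001] + [0.5] * 29)        # 30 genes tested
full_run = bh_adjust([0.001] + [0.5] * 55999)       # 56k genes tested

print("adjusted p, 30 genes: ", subset_run[0])
print("adjusted p, 56k genes:", full_run[0])
```

With these made-up numbers the gene passes a 0.05 cutoff in the 30-gene run and fails it in the 56k-gene run, which is the same pattern as described in the question.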
Thanks a lot for the quick answer.
Hi Nicolas,
One general question. If wet lab scientists ask for the expression of specific genes in tumor and normal samples, should I give them the counts data for those genes, or RPKM, FPKM, CPM, or logCPM?
Normalized read counts should be OK. Best would be to plot the results, for example as a boxplot. Check the DESeq2 vignette for more information: https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#differential-expression-analysis
Sure. How do I normalize count data in edgeR? And can I use logCPM for the boxplot?
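In edgeR itself, the usual route is `calcNormFactors()` on the DGEList followed by `cpm(y, log=TRUE)`. As a rough picture of what CPM and logCPM mean, here is a simplified sketch in plain Python. The gene names, counts, and library sizes are hypothetical, the pseudocount of 1 is an assumption for illustration, and this uses raw library sizes only (no TMM factors), so it will not reproduce edgeR's numbers exactly. The point to note is that the library sizes are totals over all genes in each sample, not over the genes of interest.

```python
import math


def cpm(counts, lib_sizes):
    """Counts per million for each gene.

    counts: dict mapping gene name -> list of raw counts, one per sample.
    lib_sizes: total reads per sample, summed over ALL genes.
    """
    return {
        gene: [c * 1e6 / n for c, n in zip(row, lib_sizes)]
        for gene, row in counts.items()
    }


def log_cpm(counts, lib_sizes, pseudo=1.0):
    """log2(CPM + pseudo); the pseudocount avoids log2(0)."""
    return {
        gene: [math.log2(v + pseudo) for v in row]
        for gene, row in cpm(counts, lib_sizes).items()
    }


# Hypothetical counts for two genes in two samples.
counts = {"GENE_A": [50, 200], "GENE_B": [10, 0]}
lib_sizes = [1_000_000, 2_000_000]

for gene, values in log_cpm(counts, lib_sizes).items():
    print(gene, [round(v, 2) for v in values])
```

Values like these, one per sample, are what would go into a tumor vs. normal boxplot for each gene.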