I have raw TCGA RNA-seq counts (HTSeq) for tumor samples. I want to do survival analysis based on gene expression for a particular gene set.
I know that I can download normalized TCGA data with z-scores etc. However, since I have a specific set of genes, I would like to split the samples, for each gene, by that gene's median expression into "UP" (greater than the median) and "DOWN" (lower than the median).
In all similar tutorials and answers here, I see that people do differential expression analysis with DESeq or edgeR and rely on the built-in normalization. I do not need differential analysis; I just need to find associations between the expression of particular genes and overall survival.
Can I just apply a simple normalization such as CPM or a log transformation to the raw RNA-seq counts for the survival analysis I described above?
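
To make concrete what I have in mind, here is a minimal sketch in Python with NumPy. The function names (`log_cpm`, `median_split`), the pseudocount of 1, and the toy count matrix are my own assumptions for illustration, not real TCGA values or an established pipeline. Library sizes are computed from all genes in the matrix, not only from my gene set, and samples exactly at the median are put in "DOWN".

```python
import numpy as np

def log_cpm(counts, pseudocount=1.0):
    """Convert a genes x samples matrix of raw counts to log2(CPM + pseudocount).

    Assumption: `counts` holds all genes, so column sums are library sizes.
    """
    counts = np.asarray(counts, dtype=float)
    lib_sizes = counts.sum(axis=0)        # total counts per sample (column)
    cpm = counts / lib_sizes * 1e6        # counts per million
    return np.log2(cpm + pseudocount)

def median_split(expr):
    """Label each sample "UP" if strictly above the gene's median, else "DOWN"."""
    expr = np.asarray(expr, dtype=float)
    return np.where(expr > np.median(expr), "UP", "DOWN")

# Toy matrix (made-up numbers): 3 genes (rows) x 4 samples (columns).
toy_counts = np.array([
    [10, 200, 30, 400],
    [500, 500, 500, 500],
    [490, 300, 470, 100],
])

expr = log_cpm(toy_counts)
groups = median_split(expr[0])            # split samples by the first gene
print(groups)
```

The resulting "UP"/"DOWN" labels would then be the grouping variable for a Kaplan-Meier or log-rank comparison against overall survival.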