Hi all, I've analyzed the TCGA HTSeq count data and used DESeq2 for normalization. As you know, TCGA has only a small number of normal samples for each cancer type. Is it acceptable to perform differential expression analysis (DEA) when the tumor and normal groups are this unbalanced in size?
Which of the following approaches do you recommend?
- Using all of the TCGA data (e.g. 533 tumor samples and 59 normal samples).
- Selecting equal numbers of tumor and normal samples from TCGA and analyzing only those.
- Using normal samples from another database (e.g. GTEx), which has a larger number of samples.
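
To make the second option concrete, this is roughly what I have in mind. It is only a minimal Python sketch: the sample IDs are placeholders and the function name `balanced_subsample` is made up for illustration. It only picks which samples to keep; it does not run DESeq2 or say anything about whether subsampling is a good idea.

```python
import random


def balanced_subsample(tumor_ids, normal_ids, seed=0):
    """Randomly pick equal numbers of tumor and normal sample IDs.

    The group size is the size of the smaller group, so with 533 tumor
    and 59 normal samples this keeps 59 of each.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    n = min(len(tumor_ids), len(normal_ids))
    tumor_subset = rng.sample(list(tumor_ids), n)
    normal_subset = rng.sample(list(normal_ids), n)
    return sorted(tumor_subset), sorted(normal_subset)


# Placeholder IDs (assumption): real TCGA barcodes would go here.
tumor_ids = [f"tumor_{i}" for i in range(533)]
normal_ids = [f"normal_{i}" for i in range(59)]

tumor_subset, normal_subset = balanced_subsample(tumor_ids, normal_ids)
print(len(tumor_subset), len(normal_subset))  # prints "59 59"
```

The selected IDs would then be used to subset the count matrix before building the DESeq2 dataset.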
Thanks for any help.