3.4 years ago
fifty_fifty
I want to do survival analysis on the expression of some genes using TCGA data. I want to find genes whose up- or downregulation is associated with survival. I am doing this as the validation part of my work, which included both LUAD and LUSC cancers. What is the best way to combine two datasets for two different diseases, e.g. LUAD and LUSC? I got my data from cBioPortal.
thank you
Thank you! I used your tutorial for survival analysis; it was very helpful, thanks a lot. I performed the survival analysis separately for both datasets. I am wondering whether it is worth combining the data. What would be the biggest potential criticism if I decide to integrate the datasets?
Well, LUAD (adenocarcinoma) and LUSC (squamous cell carcinoma) are different diseases.
I understand that, but they are both non-small cell lung cancers (NSCLC), which is what I am working with. Since my initial analysis was done for NSCLC, shouldn't my validation data include both LUAD and LUSC?
Sure, why not just proceed and do it both ways? That's how we would do research. Once it's done both ways, we will know whether survival differs between LUAD and LUSC. It likely will differ for the expression of certain genes.
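To make "both ways" concrete, here is a minimal sketch in plain Python, not anything taken from the tutorial mentioned above. It runs a log-rank test for high vs. low expression within each cohort, then a stratified log-rank test that pools LUAD and LUSC while only ever comparing patients within the same cancer type. Stratifying by cancer type is my assumption about one reasonable way to combine the cohorts. The function names and all numbers are made up for illustration; real input would be the clinical and expression tables from cBioPortal, with "high" defined by whatever cut-off you choose (e.g. a median split).

```python
import math


def logrank_components(times, events, high):
    """Return (observed - expected, variance) for the high-expression group.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if censored
    high   : 1 if the patient is in the high-expression group, else 0
    """
    o_minus_e = 0.0
    var = 0.0
    # Walk through each distinct time at which at least one death occurred.
    for t in sorted({ti for ti, ei in zip(times, events) if ei == 1}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)                      # patients still at risk
        n1 = sum(high[i] for i in at_risk)    # of those, high expression
        dead = [i for i in at_risk if times[i] == t and events[i] == 1]
        d = len(dead)                         # deaths at time t
        d1 = sum(high[i] for i in dead)       # deaths in the high group
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return o_minus_e, var


def logrank_test(components):
    """Chi-square statistic (1 df) and p-value from a list of (O-E, V) pairs.

    One pair gives the ordinary log-rank test; several pairs (one per
    stratum) give the stratified log-rank test.
    """
    total = sum(c[0] for c in components)
    var = sum(c[1] for c in components)
    if var == 0:
        return 0.0, 1.0
    chi2 = total ** 2 / var
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical toy data: (times in months, event flags, high-expression flags).
cohorts = {
    "LUAD": ([5, 8, 12, 14, 20, 21, 30, 34],
             [1, 1, 1, 0, 1, 0, 1, 0],
             [1, 1, 1, 1, 0, 0, 0, 0]),
    "LUSC": ([3, 6, 9, 15, 18, 25, 28, 40],
             [1, 0, 1, 1, 1, 0, 1, 0],
             [1, 0, 1, 0, 1, 0, 1, 0]),
}

# Way 1: each cancer type on its own.
per_cohort = {name: logrank_components(*data) for name, data in cohorts.items()}
for name, comp in per_cohort.items():
    chi2, p = logrank_test([comp])
    print(f"{name}: chi2={chi2:.3f}, p={p:.3f}")

# Way 2: combined, stratified by cancer type.
stratified_chi2, stratified_p = logrank_test(list(per_cohort.values()))
print(f"Stratified NSCLC: chi2={stratified_chi2:.3f}, p={stratified_p:.3f}")
```

If a gene looks significant in the stratified test but only one cohort drives it, that is the kind of difference between LUAD and LUSC that the separate analyses will reveal. A Cox model with cancer type as a stratum or covariate would be the natural next step if you want to adjust for age, stage, and so on.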