Advice needed on survival analysis using TCGA
23 months ago
CaroH ▴ 10

Hello everyone,

I would like to analyze the following TCGA dataset: TCGA-LUAD. I would like to see if several genes of interest are correlated with a better chance of survival in patients with LUAD.

I managed to retrieve the data I need using TCGAbiolinks. However, after running GDCdownload and GDCprepare, I am not sure how to proceed.

I looked at this interesting post: "Survival analysis of TCGA patients integrating gene expression (RNASeq) data". Unlike its author, though, I had to retrieve the dataset using the STAR - Counts workflow, and I only have one condition: samples from the Primary Tumor.

Do I need to normalize my data? If so, how should I normalize it?

Thanks for your advice!

Transcriptomics Gene-expression TCGA • 958 views
23 months ago

Do I need to normalize my data?

Yes, you do.

If so, how should I normalize it?

There are several approaches you could follow. One option is the voom function from the limma package, which converts raw counts to log2 counts per million and estimates precision weights. Check out this repo; it may help you with your analysis.
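To make the normalization idea concrete, here is a minimal sketch of a log2 counts-per-million transform, the kind of library-size correction that voom applies before it estimates its weights. It is written in plain Python purely for illustration and is not the limma API: the function name `log2_cpm`, the sample and gene names, and the counts are all made up, and the sketch leaves out the precision weights and any TMM scaling factors that a real limma/edgeR workflow would add.

```python
import math

def log2_cpm(counts, prior=0.5):
    """Convert raw counts to log2 counts per million, sample by sample.

    counts: dict mapping sample -> dict mapping gene -> raw count.
    A small prior is added so that zero counts do not produce log2(0).
    """
    out = {}
    for sample, gene_counts in counts.items():
        lib_size = sum(gene_counts.values())  # total reads in this sample
        out[sample] = {
            gene: math.log2((c + prior) / (lib_size + 1.0) * 1e6)
            for gene, c in gene_counts.items()
        }
    return out

# Toy raw counts (made-up numbers, not TCGA data): sample_B was sequenced
# ten times deeper, but GENE1 makes up the same fraction of both libraries.
raw = {
    "sample_A": {"GENE1": 10, "GENE2": 990},
    "sample_B": {"GENE1": 100, "GENE2": 9900},
}
normalized = log2_cpm(raw)
for sample, values in normalized.items():
    print(sample, round(values["GENE1"], 2))
```

The raw GENE1 counts differ tenfold between the two samples, but the normalized values end up close together, which is the point of correcting for sequencing depth before comparing expression across patients.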

With regard to the type of data, you can start with the counts coming from STAR, then proceed with normalization as described in the GitHub repository mentioned above.

In terms of conditions, my understanding is that you want to do a survival analysis between groups of samples showing "high" and "low" levels of expression for a gene of interest. So this can be a column in your data frame and would be the grouping variable in your analysis.
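The grouping described above can be sketched as follows, assuming a median split on normalized expression (one common choice, not the only one) followed by a Kaplan-Meier estimate per group. Everything here is illustrative: the patient IDs, expression values, follow-up times, and the helper names `split_by_median` and `kaplan_meier` are hypothetical. In R, the survival package (survfit, survdiff) would normally do this work and also give you a log-rank test.

```python
from statistics import median

def split_by_median(expression):
    """Label each patient 'high' or 'low' relative to the median expression."""
    cutoff = median(expression.values())
    return {p: ("high" if v > cutoff else "low") for p, v in expression.items()}

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with an observed death.

    times: follow-up durations; events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve

# Toy data: patient -> (normalized expression, follow-up in days, death observed)
patients = {
    "P1": (5.1, 200, 1), "P2": (7.9, 900, 0), "P3": (4.2, 150, 1),
    "P4": (8.3, 1200, 1), "P5": (6.0, 400, 0), "P6": (9.0, 1500, 0),
}

groups = split_by_median({p: v[0] for p, v in patients.items()})
curves = {}
for label in ("high", "low"):
    members = [p for p, g in groups.items() if g == label]
    curves[label] = kaplan_meier(
        [patients[p][1] for p in members],
        [patients[p][2] for p in members],
    )
    print(label, curves[label])
```

The `groups` mapping plays the role of the extra column in your data frame: one "high"/"low" label per patient, which then serves as the grouping variable for the survival curves.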
