Analysing gene expression across clinical parameters
8.4 years ago
elizabethR ▴ 70

Hi

I have been doing some bioinformatic analysis of TCGA data as an adjunct to my PhD. I would like to take this a little further and also analyse RNA-Seq expression data according to clinical parameters such as disease stage, age, sex and histological markers of invasion. Has anyone done this kind of analysis? What sort of statistical tests do you use for it? I am assuming that ANOVA is not appropriate for such a large dataset, given how many comparisons are being made across it? Bioinformatic statistics is a very new area to me.

Also, could anyone recommend packages for this type of analysis? I have started using the amazing RStudio and TCGAbiolinks, but there doesn't appear to be a package in the Bioconductor guide that is suitable for this.

Really grateful for any advice guys :-)

RNA-Seq TCGA Clinical • 2.2k views

Hi elizabethR, I am not very familiar with TCGA data, but if you want to do a class comparison test between two or more phenotypes, you should first preprocess your data (including log transformation, summarization and normalization). If you want to use GEO data, I suggest InsilicoDB (https://insilicodb.com) to retrieve preprocessed data; it also provides a pipeline that makes the job simple. In addition, limma is a robust package for analyzing gene expression data. It can fit a linear model to find markers of each phenotype: https://bioconductor.org/packages/release/bioc/html/limma.html
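As a rough illustration of the limma suggestion above, here is a minimal sketch of a limma-voom analysis with clinical covariates in the design. Everything about the data is an assumption: the counts are simulated rather than real TCGA counts, and the `clinical` table with its `stage`, `sex` and `age` columns is hypothetical. For RNA-Seq counts, `voom()` takes care of the log transformation, so the counts going in should be raw.

```r
library(limma)
library(edgeR)  # provides DGEList, filterByExpr and calcNormFactors

set.seed(42)
n_genes   <- 500
n_samples <- 24

# Simulated raw counts standing in for a real genes x samples count matrix
counts <- matrix(rnbinom(n_genes * n_samples, mu = 100, size = 5),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)),
                                 paste0("sample", seq_len(n_samples))))

# Hypothetical clinical table: one row per sample, same order as the count columns
clinical <- data.frame(
  stage = factor(rep(c("I", "II", "III", "IV"), each = 6)),
  sex   = factor(rep(c("F", "M"), times = 12)),
  age   = round(runif(n_samples, 40, 80))
)

# Design columns: (Intercept), stageII, stageIII, stageIV, sexM, age
design <- model.matrix(~ stage + sex + age, data = clinical)

dge  <- DGEList(counts = counts)
keep <- filterByExpr(dge, design)           # drop genes with too few counts
dge  <- dge[keep, , keep.lib.sizes = FALSE]
dge  <- calcNormFactors(dge)                # TMM normalisation factors

v   <- voom(dge, design)                    # log2-CPM values with precision weights
fit <- eBayes(lmFit(v, design))

# F-test for any difference between stages (coefficients 2 to 4),
# and a moderated t-test for sex (coefficient 5), both adjusted for the other covariates
stage_results <- topTable(fit, coef = 2:4, number = Inf)
sex_results   <- topTable(fit, coef = 5, number = Inf)
head(stage_results)
```

The `adj.P.Val` column holds Benjamini-Hochberg adjusted p-values, which is the usual way the multiple-comparison concern is handled when every gene is tested.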

8.4 years ago
elizabethR ▴ 70

Thank you guys. EagleEye, that link was very useful. The edgeR manual says you should use raw counts that haven't been normalised, because normalisation is handled as part of its statistical modelling. I've used edgeR to perform differential expression analysis across clinical parameters.
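For anyone looking for a starting point, below is a minimal sketch of an edgeR analysis across tumour stage. It is not the code used for the analysis described above. The counts are simulated and the `stage` factor is a hypothetical stand-in for a clinical annotation; with real data, the raw count matrix and a stage vector in matching sample order would take their place.

```r
library(edgeR)

set.seed(1)
n_genes <- 500

# Hypothetical stage annotation, one entry per sample
stage <- factor(rep(c("I", "II", "III", "IV"), each = 5))

# Simulated raw (un-normalised) counts, genes x samples
counts <- matrix(rnbinom(n_genes * length(stage), mu = 80, size = 4),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)), NULL))

y    <- DGEList(counts = counts, group = stage)
keep <- filterByExpr(y)                    # drop genes with too few counts
y    <- y[keep, , keep.lib.sizes = FALSE]
y    <- calcNormFactors(y)                 # TMM factors, used inside the model

# Design columns: (Intercept), stageII, stageIII, stageIV (stage I is the baseline)
design <- model.matrix(~ stage)
y   <- estimateDisp(y, design)
fit <- glmQLFit(y, design)

# ANOVA-like test: any difference between the four stages
qlf_any <- glmQLFTest(fit, coef = 2:4)

# Pairwise test: stage IV versus stage I
qlf_IV_vs_I <- glmQLFTest(fit, coef = 4)

top_any <- topTags(qlf_any, n = Inf)$table
top_IV  <- topTags(qlf_IV_vs_I, n = 10)$table
head(top_any)
```

The `FDR` column in the `topTags()` output is the Benjamini-Hochberg adjusted p-value. Other clinical variables such as age or sex could be added as extra terms in `model.matrix()`, in which case the `coef` indices need to be checked against `colnames(design)`.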


Hi elizabethR,

I am working on TCGA lung adenocarcinoma data. Referring to your question and the comments above: you said you used edgeR to perform differential expression analysis across clinical parameters, and I am working on the same type of analysis. I want to find the DEGs between tumor stages (Stage I to IV), but I'm not able to find the right way to do so. Here is my post on Biostars, where I have provided the R code I used and how I got the samples from the FireBrowse data:

Not able to get DEG from FireBrowse data using R

I think I'm going wrong at some point in the analysis. If you have sample R code, please share it with me, or please help me find what I am doing wrong in the DEG analysis.

Thank you in advance
