Question

Using TidyEstimate scores in the Design Matrix of DESeq2

9 months ago

Abhishek ▴ 10

Hi!

I am trying to compare bulk RNA-seq data from brain metastases to the primary tumors. I performed differential gene expression analysis (DGEA) using DESeq2 and shrunk the LFCs using apeglm.

While investigating the results table, I observed that genes such as GFAP (glial fibrillary acidic protein), which are specific to the brain/CNS, have a high LFC in brain compared to primary. I think this might be because some of the brain tissue samples had low tumor purity (i.e., they contained more of the surrounding normal brain). I then used TidyEstimate to predict the stromal and immune infiltration scores for each of the samples to get insight into each sample's purity.

My question is whether it is reasonable to now add these two scores to my design in DESeq2.

e.g., DESeqDataSetFromTximport(genes_results, sampleTable, ~0+batch+immune+stromal+biopsy), where sampleTable holds the sample-level information and genes_results holds the gene-level abundance estimates from RSEM, imported using tximport.
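To make the design concrete, here is a minimal base-R sketch of how that formula expands into a model matrix. The sample table is made up (8 hypothetical samples, invented score values); only the column names follow the post. Centering and scaling the scores is an assumption on my part, not something I have settled on.

```r
# Hypothetical sample table: column names follow the post, values are invented.
set.seed(1)
sampleTable <- data.frame(
  batch   = factor(rep(c("b1", "b2"), times = 4)),
  biopsy  = factor(rep(c("Primary", "Brain"), each = 4),
                   levels = c("Primary", "Brain")),
  immune  = rnorm(8, mean = 500, sd = 300),
  stromal = rnorm(8, mean = -200, sd = 400)
)

# Assumption: center and scale the continuous scores before modelling.
sampleTable$immune  <- as.numeric(scale(sampleTable$immune))
sampleTable$stromal <- as.numeric(scale(sampleTable$stromal))

# Expand the formula from the post into a model matrix.
design <- model.matrix(~ 0 + batch + immune + stromal + biopsy,
                       data = sampleTable)
print(colnames(design))

# The coefficients are only estimable if the matrix has full column rank.
full_rank <- qr(design)$rank == ncol(design)
print(full_rank)
```

With real data, the same sampleTable and formula would go to DESeqDataSetFromTximport as the colData and design arguments.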

I am worried that, since the scores were calculated from each sample's gene expression data, using them in DESeq2 might produce non-meaningful output, because the scores are linear combinations of the gene expression values.
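To illustrate the worry, here is a toy base-R simulation. It is not ESTIMATE and not DESeq2: the "score" is simply the mean of 10 hypothetical signature genes on a log scale, and the models are ordinary linear regressions. All numbers are invented.

```r
set.seed(42)
n_per_group <- 200
n_sig <- 10                       # hypothetical signature size
group <- rep(c(0, 1), each = n_per_group)

# Simulated log-expression (rows = samples, columns = signature genes):
# every signature gene is shifted by 2 units in group 1, plus unit noise.
expr <- matrix(rnorm(2 * n_per_group * n_sig), ncol = n_sig) + 2 * group

# Toy score: a linear combination (here, the mean) of the signature genes.
score <- rowMeans(expr)

# Group effect on one signature gene, without and with the score as covariate.
gene1 <- expr[, 1]
unadjusted <- coef(lm(gene1 ~ group))[["group"]]
adjusted   <- coef(lm(gene1 ~ score + group))[["group"]]

print(c(unadjusted = unadjusted, adjusted = adjusted))
```

In this toy setup the score is built from the very genes being tested, so it carries most of the group difference, and the group coefficient for a signature gene gets much smaller once the score is in the model. Whether the same thing happens with the actual ESTIMATE scores in a negative binomial GLM is exactly what I am unsure about.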

Thank you in advance for your feedback.

Best, Abhishek

ESTIMATE deseq2 TidyESTIMATE RNAseq • 281 views
