RNAseq cell line as a covariate
4.7 years ago

Hi all,

I am working on RNA sequencing data from multiple cell types that all carry the same deletion. For example, I have deletion and wildtype samples for iPSCs, neurons, and neural progenitor cells: 18 samples in total (3 cases and 3 controls per cell type).

I was wondering what your thoughts were on using cell line as a covariate in DESeq2 and aggregating all the data together?

I analysed each cell line separately and compared the resulting Z test statistics, and found little to no correlation between cell lines, which makes me think I shouldn't try combining all the data. I would really appreciate any insight or input from the experts here.

Thank you so much!

rnaseq deseq2 cellline transcriptome • 1.1k views
4.7 years ago

As with everything in bioinformatics, if you have time, try both approaches. I am not sure that a simple correlation analysis is sufficient to conclude that cell-lines should be analysed together (or not).

I have worked on cell-lines a lot within the past year. Including 'line' as a covariate can help to adjust for the cross-line differences that may exist. However, if your lines are from disparate tissues, like CNS tissues and skin tissues, then that may be too large a difference for the model to account for.
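To make the covariate idea concrete, here is a minimal sketch of what an additive design does. It is not DESeq2 itself (which fits a negative binomial GLM on counts); it is an ordinary least-squares analogue on made-up, noise-free log2 values for a single gene. The sample layout, the baselines, and the effect size are all assumptions for illustration. In DESeq2 the corresponding design formula would be along the lines of `~ cell_type + condition`, with hypothetical column names.

```python
import numpy as np

# Hypothetical layout matching the question: 3 cell types x 2 conditions x 3 replicates.
cell_types = np.repeat(["iPSC", "NPC", "neuron"], 6)
condition = np.tile(np.repeat([0, 1], 3), 3)  # 0 = wildtype, 1 = deletion

# Toy log2 expression for one gene: large cell-type baselines plus a shared
# deletion effect of 1.0 (noise-free so the fit is easy to check).
baseline = {"iPSC": 5.0, "NPC": 8.0, "neuron": 11.0}
y = np.array([baseline[c] for c in cell_types]) + 1.0 * condition

# Design matrix for "~ cell_type + condition": intercept, two cell-type
# indicator columns (iPSC is the reference level), and the condition column.
X = np.column_stack([
    np.ones(len(y)),
    (cell_types == "NPC").astype(float),
    (cell_types == "neuron").astype(float),
    condition.astype(float),
])

# Least-squares fit; the last coefficient is the deletion effect after
# adjusting for the cell-type baselines.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
deletion_effect = coef[3]
print(round(float(deletion_effect), 3))
```

Note that an additive design like this assumes the deletion has the same effect in every cell type. If the effect differs by cell type, an interaction term would be needed to capture that.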

Also, consider the following: if you normalise the lines separately, then the end results are not quite cross-comparable (certainly not the expression levels), as the lines will not have been normalised together. You could possibly do a meta-analysis at the end, if you choose to normalise them separately, though.
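As a rough illustration of the normalisation point, the sketch below uses a simplified median-of-ratios calculation (the approach DESeq2 takes for size factors, here without any handling of zero counts) on made-up counts. The two "lines", the gene profile, and the sequencing depths are assumptions chosen only to make the effect visible.

```python
import numpy as np

def size_factors(counts):
    """Simplified median-of-ratios size factors; counts is genes x samples, all > 0."""
    log_counts = np.log(counts)
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)  # per-gene reference
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))

# Toy data: the same 4-gene profile in every sample, but the two samples of
# hypothetical "line B" were sequenced twice as deeply as those of "line A".
profile = np.array([10.0, 50.0, 200.0, 1000.0])
depth = np.array([1.0, 1.0, 2.0, 2.0])  # samples A1, A2, B1, B2
counts = profile[:, None] * depth[None, :]

# Normalise all four samples together.
joint = counts / size_factors(counts)

# Normalise each line on its own, then place the results side by side.
separate = np.hstack([
    counts[:, :2] / size_factors(counts[:, :2]),
    counts[:, 2:] / size_factors(counts[:, 2:]),
])

print(joint[0])     # first gene: identical across all four samples
print(separate[0])  # first gene: line B stays at twice the line A value
```

When the samples are normalised together, the depth difference between the lines is removed. When each line is normalised on its own, the size factors are only relative to that line, so the depth difference survives and shows up as an apparent expression difference between lines.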

In conclusion: no right or wrong answer here.

Kevin

Thanks a lot Kevin! Is there any methodology you would suggest for meta-analyzing? I was thinking of using an inverse-variance weighted meta-analysis, but I'm unsure whether I should use the overall variance of the Z statistics or take the logFC SE and compute a variance for each gene.

Also, if I had different mutations (all implicated in the same disease) in different genes, but the same cell line, should I use mutation as a covariate? Sorry for all the questions, I'm fairly new to all this.

Meta-analysis is not quite my area, but the package that comes to mind is RankProd (in R / Bioconductor). I don't know which specific methods it implements, though.
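On the inverse-variance weighting raised in the question, here is a minimal per-gene sketch using the log fold change and its standard error from each separate analysis (in DESeq2 results these are the `log2FoldChange` and `lfcSE` columns). The function name `ivw_meta` and the numbers are made up for illustration. This is a fixed-effect version, which assumes one shared true effect across cell types, and that assumption may not hold if the per-line results barely correlate.

```python
import numpy as np

def ivw_meta(log2fc, se):
    """Fixed-effect inverse-variance weighted meta-analysis, gene by gene.

    log2fc, se: arrays of shape (genes, studies), e.g. one column per cell type.
    Returns the pooled estimate, its standard error, and the Z statistic.
    """
    w = 1.0 / se**2                                   # weight = 1 / variance
    pooled = (w * log2fc).sum(axis=1) / w.sum(axis=1)
    pooled_se = np.sqrt(1.0 / w.sum(axis=1))
    return pooled, pooled_se, pooled / pooled_se

# Made-up values for two genes across three cell types.
log2fc = np.array([[1.0, 1.2, 0.8],
                   [0.5, -0.4, 0.1]])
se = np.array([[0.2, 0.4, 0.2],
               [0.3, 0.3, 0.3]])

pooled, pooled_se, z = ivw_meta(log2fc, se)
print(pooled, pooled_se, z)
```

Each gene gets its own weights from its own standard errors, so a cell type with a noisy estimate for a given gene contributes less to that gene's pooled effect.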

I cannot really comment on the mutation part. It seems like it is only relevant to one cell-line. If you are interested in performing differential expression across the mutation states, then you will have to include it in the design formula anyway.
