Hi, I was wondering if anyone could suggest which statistical tests I can use for differential gene expression analysis with RPKM values.
I am stuck with RPKM, as DESeq and edgeR are unfortunately not an option for me. I have generated RNA-Seq data from a dozen tissue samples collected from patients with 2 disease phenotypes (n=6 per group), and I would like to compare my results to published RNA-Seq data from in vitro systems. Unfortunately, the authors of the published paper only provided a table of RPKM values (and only a single replicate for each of the tested conditions!). They also did not deposit any of the fastq files in any database, so I cannot even redo the alignment.
In this scenario, where I have no raw counts for the published data and uneven group sizes, is there any way I can still reliably compare my data to the existing data and do a differential expression analysis? Any suggestions would be very helpful!
Many thanks!
RPKMs are terrible for statistics. Would it be possible to just analyse your samples and compare the resulting fold-changes/DE genes to those from the published study? That might yield nice results.
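To make the suggestion concrete, here is a minimal sketch of such a comparison. It assumes you already have log2 fold-changes from your own analysis, and it derives rough log2 fold-changes from the published single-replicate RPKM table using a pseudocount of 1 (my assumption, to avoid dividing by zero). The gene names, values, and function names are made up for illustration. With one replicate per condition this only gives you direction and rough agreement, not a statistical test.

```python
import math

def log2_fold_change(treated, control, pseudocount=1.0):
    """Return log2((treated + pc) / (control + pc)) for one gene."""
    return math.log2((treated + pseudocount) / (control + pseudocount))

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical published table: gene -> (control RPKM, treated RPKM)
published_rpkm = {
    "GENE_A": (10.0, 45.0),
    "GENE_B": (30.0, 7.0),
    "GENE_C": (5.0, 5.5),
    "GENE_D": (0.0, 12.0),
}

# Hypothetical log2 fold-changes from your own samples
own_log2fc = {"GENE_A": 1.8, "GENE_B": -2.1, "GENE_C": 0.2, "GENE_D": 2.9}

# Only compare genes present in both data sets
shared = sorted(set(published_rpkm) & set(own_log2fc))
published_log2fc = {
    g: log2_fold_change(published_rpkm[g][1], published_rpkm[g][0])
    for g in shared
}

xs = [own_log2fc[g] for g in shared]
ys = [published_log2fc[g] for g in shared]

# Fraction of shared genes changing in the same direction in both studies
same_direction = sum(1 for x, y in zip(xs, ys) if x * y > 0) / len(shared)
r = pearson(xs, ys)

print(f"genes compared: {len(shared)}")
print(f"same direction: {same_direction:.2f}")
print(f"Pearson r of log2FC: {r:.2f}")
```

A rank-based correlation would probably be more robust than Pearson here, since low-RPKM genes give noisy ratios; I used Pearson only to keep the sketch dependency-free.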
I thought about using fold changes (FC) as well, but I have another dilemma: what would you use as a cut-off? Presumably I would have to use an arbitrary cut-off, e.g. calling a gene highly abundant if its log2FC is > 2. Is there any way to make it more objective?
Also, to do FC, will you:
Thanks again!
I am facing a similar situation, with the data on GEO only being available as log2 FPKM. After you converted to TPM, how did you end up testing for differential expression?
Thanks!
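For anyone landing here with the same log2 FPKM problem, the conversion step itself can be sketched as below. This is only the FPKM-to-TPM rescaling (each gene's FPKM divided by the sample's FPKM total, times one million), not a differential expression test. It assumes the values were stored as log2(FPKM + 1); the offset is an assumption you need to check against the GEO record, and the function name is mine. It also assumes you pass in all genes for one sample, since the rescaling depends on the sample-wide sum.

```python
def log2_fpkm_to_tpm(log2_fpkm, offset=1.0):
    """Convert one sample's log2(FPKM + offset) values to TPM.

    The offset of 1.0 is an assumption; set it to 0.0 if the values
    were stored as plain log2(FPKM).
    """
    # Undo the log transform; clip tiny negatives caused by rounding
    fpkm = [max(2.0 ** value - offset, 0.0) for value in log2_fpkm]
    total = sum(fpkm)
    if total == 0:
        raise ValueError("all FPKM values are zero; cannot rescale")
    # TPM_i = FPKM_i / sum(FPKM) * 1e6, computed per sample
    return [f / total * 1e6 for f in fpkm]

# Hypothetical sample with three genes: FPKM of 1, 3 and 0
example = [1.0, 2.0, 0.0]  # log2(1 + 1), log2(3 + 1), log2(0 + 1)
tpm = log2_fpkm_to_tpm(example)
print(tpm)
```

Note that TPM is still a normalised abundance rather than a count, so the concerns raised above about doing statistics on RPKM apply to it as well.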