My pipeline so far is hisat2->featureCounts->DESeq2. I have generated heatmaps after rlog and log2 transformation of the genes with the most variance, which are somewhat meaningful. What I really want to do is compare everything to the control sample and take the genes with the largest log fold change in either direction. I've read through the DESeq2 vignette and haven't found a good example of that. Maybe I do this under the design parameter when running DESeqDataSetFromMatrix()? So far I've only set the design parameter to ~condition, as I'm a little shaky on how that parameter works.
Maybe this is more of an R problem than a DESeq2 one? Is edgeR the better tool, since it allows you to do some analysis with no biological replicates by setting the dispersion value?
You can do a DESeq2 analysis with no replicates; the stats are just essentially meaningless, as they would be for any other tool or package trying to compare RNA-seq between single samples.
Yes, so I can make heatmaps from the log-normalized counts and do things like PCA (and I have). My question is more about what other analysis I can do and how I can compare everything to the control sample in DESeq2. For example, say I want a list of the most differentially expressed genes vs the control sample, starting from the featureCounts matrix which I've imported. Currently I'm not comparing everything to the control, but to each other. So I can get the list of genes with the most variance with something like:
but that's not as meaningful as the genes that are most different from control.
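To make the distinction between "most variable" and "most different from control" concrete, here is a rough sketch in plain Python/NumPy. It is not DESeq2 and does not use its API: the count matrix, gene names, sample names, and the helper functions (`log2_cpm`, `lfc_vs_control`, `top_genes`) are all hypothetical, and it uses simple counts-per-million scaling with a pseudocount rather than DESeq2's size factors or rlog. With no replicates the resulting numbers are descriptive only and carry no p-values. The same column subtraction could presumably be applied to an rlog-transformed matrix, since that is already on the log2 scale.

```python
import numpy as np

# Hypothetical toy count matrix: rows are genes, columns are samples.
# Column 0 is assumed to be the single control sample.
genes = ["geneA", "geneB", "geneC", "geneD"]
samples = ["control", "treat1", "treat2"]
counts = np.array([
    [100, 400, 100],
    [200, 200, 50],
    [50, 50, 50],
    [10, 10, 160],
], dtype=float)


def log2_cpm(counts, pseudocount=1.0):
    """Scale each sample to counts per million, then take log2 with a pseudocount."""
    lib_sizes = counts.sum(axis=0)
    cpm = counts / lib_sizes * 1e6
    return np.log2(cpm + pseudocount)


def lfc_vs_control(log_expr, control_col=0):
    """Subtract the control column from every other column (log2 fold change)."""
    others = [j for j in range(log_expr.shape[1]) if j != control_col]
    return log_expr[:, others] - log_expr[:, [control_col]]


def top_genes(lfc, names, n=2):
    """Rank genes by their largest absolute log2 fold change in any sample."""
    score = np.abs(lfc).max(axis=1)
    order = np.argsort(score)[::-1][:n]
    return [names[i] for i in order]


log_expr = log2_cpm(counts)
lfc = lfc_vs_control(log_expr)          # one column per non-control sample
ranked = top_genes(lfc, genes, n=2)     # genes furthest from control, either direction

for name, row in zip(genes, lfc):
    print(name, np.round(row, 2))
print("top genes vs control:", ranked)
```

Ranking on the absolute value keeps both up- and down-regulated genes. Low-count genes will dominate a list like this unless they are filtered first or the fold changes are shrunk, which is part of what the dedicated packages do.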
Running results(dds) on the data actually gives an error that DESeq2 no longer supports experiments with only one replicate, so I don't get the nice summary that a well-designed experiment would give.
Please use the search function and read through what you can find on the BioC support page and Google. I understand it is frustrating to analyse underpowered/unpowered experiments, but this question really has been discussed like a hundred times before. Please go through the previous contents and see what you can take away from it. Don't be surprised if this question gets closed by a different moderator for the aforementioned reason.
Do you have a particular thread in mind? I have looked at all those pages pretty extensively and none really cover what I'm looking for.
As ATpoint highlights, there is a lot of material / discussion out there. Just search via your search engine of choice. For one, there is the edgeR manual (see '2.11 What to do if you have no replicates'):
As for ideas other than heatmaps, etc., I am going to put a question back to you: why did you do the experiment in the first place if you did not even know the analysis plan that was going to be carried out? Perhaps I missed this somewhere in your original question (?) Would running a few cDNA microarrays not have been better?
I didn't design the experiment; I inherited the data from a previous researcher and want to make use of it. I did read the edgeR manual and I will try to generate useful figures from that next. I guess I should have restated my original question as "How do I view log fold changes vs a control sample with no replicates using DESeq2"; it seems like people are misinterpreting my original question.
Perhaps we are mis-interpreting it; however, I, personally, want to put a stop to the propagation of 'noise' in research. Poor experimental design is one of the key reasons why so many published works that research the same thing are not reproducible.