Hi,
I want to use the ImpulseDE2 software to identify genes that are differentially expressed over time (I have samples from 12 patients at 4 time points). I have two questions, and unfortunately my attempts to contact the developers failed.
1) Does anybody know whether ImpulseDE2 can deal with missing data? For some patients one time point is missing. There is no error message when I run the software, but I am unsure whether ImpulseDE2 handles this correctly. It is not obvious to me from the published benchmark article (https://www.ncbi.nlm.nih.gov/pubmed/30102402).
2) I do not understand the ImpulseDE2 result table. If I perform the analysis with boolIdentifyTransients = TRUE, I get some genes where is.transient=TRUE and some more genes where is.monotonous=TRUE, but some genes with padj << 0.05 have FALSE in both fields. I thought the algorithm tests three possibilities: monotonous, transient and constant. However, according to the result table, some genes are none of those three. How should I interpret those genes? Are they false positives with small p-values but without a successful fit of a transient or monotonous course?
I appreciate every tip. Thanks in advance!
Yes, it seems that ImpulseDE2 can handle missing data. According to its manual: "matCountData in runImpulseDE2(): includes read count data, unobserved entries are NA."
But I have a question for you: how have you normalized your data? As far as I know, this model does not implement any normalization method.
OK, this is good to know. However, what I meant by missing data was that an entire time point is missing for one or more patients. I think the manual extract refers to missing count values for particular genes.
To your question: as I understand it from the very recent article by Fischer et al. (https://www.ncbi.nlm.nih.gov/pubmed/30102402), ImpulseDE2 does include a normalization:
"Reference methods We used ImpulseDE2, DESeq2 and DESeq2splines on rounded expected count matrices (Supplementary Notes Section S5). We used DESeq2 in the log-likelihood ratio test mode in all cases. We used ImpulseDE, edge and limma on scaled data, where the scaling factor is determined as the DESeq2 size factor (2). Therefore, the same library size normalization was used for ImpulseDE2, DESeq2, ImpulseDE, edge and limma..."
What do you think? If you are unsure about normalization, I would recommend doing it with DESeq2 like this:
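As a rough illustration of what that normalization computes: the DESeq2 size factors mentioned in the quote (obtained in R with `estimateSizeFactors()`) are median-of-ratios estimates. Below is a minimal plain-Python sketch of that idea, not DESeq2's own code. The function name `size_factors` and the toy count matrix are made up for illustration.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors (sketch of the idea behind DESeq2's).

    counts: list of genes, each a list of raw counts, one per sample.
    Returns one size factor per sample.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        # Genes with a zero in any sample have no finite geometric mean; skip them.
        if any(c <= 0 for c in gene):
            continue
        log_geo_mean = sum(math.log(c) for c in gene) / n_samples
        for j, c in enumerate(gene):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no gene has non-zero counts in every sample")
    # Per sample: median over genes of (count / geometric mean across samples).
    return [math.exp(median(r)) for r in log_ratios]


# Hypothetical toy data: 4 genes x 3 samples; sample depths differ 1 : 2 : 4.
toy_counts = [
    [10, 20, 40],
    [100, 200, 400],
    [5, 10, 20],
    [0, 3, 7],  # contains a zero, so it is ignored when estimating factors
]

factors = size_factors(toy_counts)
# Dividing each count by its sample's size factor puts samples on a common scale.
normalized = [[c / f for c, f in zip(gene, factors)] for gene in toy_counts]
print(factors)
```

If I recall the ImpulseDE2 manual correctly, `runImpulseDE2()` also has a `vecSizeFactorsExternal` argument for supplying externally computed size factors; please check the manual before relying on that.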
Following up on this thread: have you found an answer to your second question?