RNA-Seq time series analysis using ImpulseDE2 software
5.8 years ago
stu111538 ▴ 80

Hi,

I want to use ImpulseDE2 to identify genes that are differentially expressed over time (I have samples from 12 patients at 4 time points). I have two questions, and unfortunately my attempts to contact the developers failed.

1) Does somebody know whether ImpulseDE2 can deal with missing data? For some patients one timepoint is missing. There is no error message when I run the software, but I am concerned whether ImpulseDE2 can deal with this. For me it is not obvious from the published benchmark article (https://www.ncbi.nlm.nih.gov/pubmed/30102402).

2) I do not understand the ImpulseDE2 result table. If I run the analysis with boolIdentifyTransients = TRUE, I get some genes where is.transient=TRUE and some more genes where is.monotonous=TRUE, but some genes with padj << 0.05 have FALSE in both fields. I thought the algorithm tests three possibilities: monotonous, transient and constant. However, according to the result table, some genes are none of the three. How should I interpret those genes? Are they false positives with small p-values for which neither a transient nor a monotonous course could be fitted?
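
To make the second question concrete, here is a minimal base-R sketch of how the groups can be tabulated. The table `res` is toy data, and the column names `padj`, `isTransient` and `isMonotonous` are assumptions for illustration; check `colnames()` of the actual ImpulseDE2 result table before reusing this.

```r
# Toy results table. Values and column names are hypothetical and only
# stand in for the real ImpulseDE2 output.
res <- data.frame(
  Gene         = paste0("gene", 1:6),
  padj         = c(0.001, 0.2, 0.0004, 0.03, 0.8, 0.00001),
  isTransient  = c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE),
  isMonotonous = c(FALSE, FALSE, TRUE, FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Significance threshold used in the question
sig <- res$padj < 0.05

# Assign each gene to exactly one group
res$class <- ifelse(!sig, "not significant",
             ifelse(res$isTransient, "transient",
             ifelse(res$isMonotonous, "monotonous",
                    "significant, unclassified")))

print(table(res$class))

# Genes that are significant but flagged neither transient nor monotonous
unclassified <- res$Gene[res$class == "significant, unclassified"]
print(unclassified)
```

The "significant, unclassified" group is the one the question is about: a small padj with FALSE in both flags.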

I appreciate every tip. Thanks in advance!

RNA-Seq ImpulseDE2 missing data transient genes • 3.9k views

Yes, it seems that ImpulseDE2 can handle missing data. According to its manual: "matCountData in runImpulseDE2(): includes read count data, unobserved entries are NA."

But I have a question for you: how have you normalized your data? As far as I know, this model does not implement any normalization method.


OK, this is good to know. However, what I meant by missing data was that a whole time point is missing for one or more patients. I think the manual extract refers to missing count values for particular genes.
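
A minimal base-R sketch of how the missing patient/time-point combinations can be listed from a sample annotation table. The table `annot`, its columns `Patient` and `Time`, and the two dropped samples are made up for illustration and are not taken from the real data set.

```r
# Hypothetical annotation: 12 patients x 4 time points
annot <- expand.grid(Patient = paste0("P", 1:12),
                     Time = c(0, 1, 2, 3),
                     stringsAsFactors = FALSE)

# Simulate two missing samples (assumed for this example only)
drop <- (annot$Patient == "P3" & annot$Time == 2) |
        (annot$Patient == "P7" & annot$Time == 3)
annot <- annot[!drop, ]

# Cross-tabulate patients against time points; a 0 marks a missing sample
design <- table(annot$Patient, annot$Time)
missing <- which(design == 0, arr.ind = TRUE)

missing_df <- data.frame(
  Patient = rownames(design)[missing[, 1]],
  Time    = colnames(design)[missing[, 2]],
  stringsAsFactors = FALSE
)
print(missing_df)

# Number of samples actually available at each time point
samples_per_time <- colSums(design)
print(samples_per_time)
```

This only describes the design. It does not show how ImpulseDE2 itself treats an unbalanced design.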

To your question: As I understood it from the very recent article from Fischer et al. (https://www.ncbi.nlm.nih.gov/pubmed/30102402) ImpulseDE2 includes a normalization:

"Reference methods. We used ImpulseDE2, DESeq2 and DESeq2splines on rounded expected count matrices (Supplementary Notes Section S5). We used DESeq2 in the log-likelihood ratio test mode in all cases. We used ImpulseDE, edge and limma on scaled data, where the scaling factor is determined as the DESeq2 size factor (2). Therefore, the same library size normalization was used for ImpulseDE2, DESeq2, ImpulseDE, edge and limma...."

What do you think? If you are uncertain about the normalization, I would recommend normalizing with DESeq2 like this:

    # In my case I import HTSeq count data
    library(DESeq2)
    ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~ 1)
    # Drop genes with at most one read in total
    ddsHTSeq <- ddsHTSeq[rowSums(counts(ddsHTSeq)) > 1, ]
    # Size-factor normalization, the one that is also meant in the article
    ddsHTSeq <- estimateSizeFactors(ddsHTSeq)
    # Normalized count matrix to pass to ImpulseDE2
    counts.sf_normalized <- counts(ddsHTSeq, normalized = TRUE)
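
For readers who want to see what the size-factor step does, here is a minimal base-R sketch of the median-of-ratios idea behind DESeq2 size factors. The toy matrix `counts` is made up, and this is an illustration of the principle under those assumptions, not a replacement for `estimateSizeFactors()`.

```r
# Toy count matrix (genes x samples); sample s2 has twice the depth of s1,
# and s3 has four times the depth of s1
counts <- matrix(c( 10,  20,  40,
                   100, 200, 400,
                     5,  10,  20,
                    50, 100, 200),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:4), paste0("s", 1:3)))

# Log geometric mean of each gene across samples
log_geo_means <- rowMeans(log(counts))

# Size factor per sample: median ratio of its counts to the geometric means,
# using only genes with a finite geometric mean and a non-zero count
size_factors <- apply(counts, 2, function(cnts) {
  keep <- is.finite(log_geo_means) & cnts > 0
  exp(median(log(cnts[keep]) - log_geo_means[keep]))
})
print(size_factors)

# Divide each sample (column) by its size factor
normalized <- sweep(counts, 2, size_factors, "/")
print(normalized)
```

In this toy case the three samples differ only in depth, so all columns of `normalized` come out identical.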

Following up on this thread: have you found an answer to your second question?

5.4 years ago
ashMC ▴ 10

Just to add to this conversation with regard to 1): the manual (you can find it on Bioconductor) explicitly says the following:

"Missing values are not supported. Genes having missing values for at least one sample will be excluded from the analysis."

This may be why you are not getting any error messages!
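
A minimal base-R sketch for checking how many genes would be affected by such an exclusion. The toy matrix `counts` is made up; with real data the same two lines can be run on the count matrix before it is passed to ImpulseDE2.

```r
# Toy count matrix (genes x samples) with some unobserved entries
counts <- matrix(c(5,  8, NA, 3,
                   0,  2,  4, 6,
                   7, NA, NA, 1,
                   9,  9,  9, 9),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:4), paste0("s", 1:4)))

# Number of missing values per gene
na_per_gene <- rowSums(is.na(counts))

# Genes with at least one missing value
genes_with_na <- names(na_per_gene)[na_per_gene > 0]
print(genes_with_na)

# Matrix restricted to genes observed in every sample
complete_counts <- counts[na_per_gene == 0, , drop = FALSE]
print(dim(complete_counts))
```

If `genes_with_na` is empty for the real matrix, the quoted exclusion rule would not remove anything.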

4.4 years ago
Josephine • 0

Hi, how about your second question, have you figured it out? I encountered the same problem and I don't know how to explain it....
