LRT test with DESeq2 and how to interpret the results
0
2
Entering edit mode
2.5 years ago
Maka ▴ 20

Dear all, I have a question regarding the way I am interpreting my results. I really hope you can give me a hand. I have four groups that I am analyzing: 1) WT - vehicle 2) WT - treatment 3) Mut - vehicle 4) Mut - treatment

I am trying to get a list of DE genes, affected by the treatment. As a first step, I ran DESeq2, following the official vignette:

    dds <- DESeqDataSetFromTximport(txi,                             
                                    colData = coldata,
                                    design = ~ genotype + condition)

    #set levelels
    dds$condition <- relevel(dds$condition, ref = "vehicle")
    dds$genotype <- relevel(dds$genotype, ref = "WT")


 #I run the deseq function
    dds <- DESeq(dds)

    #and then I got the results using "condition Treat vs vehicle", to get the comparison between treated and untreated 
    res <- results(dds, name = "condition_Treat_vs_vehicle")

    # and then I got the results using "genotype_Mut_vs_WT", to get the comparison between treated and untreated
    res2 <- results(dds, name="genotype_Mut_vs_WT")

In order to understand which genes are affected only by treatment without any genotype effect, I then ran the LRT test.

dds_lrt <- DESeq(dds, test="LRT", reduced = ~genotype)

res_LRT <- results(dds_lrt)

Then I selected the significant genes via padj <0.05 for res_LRT , res, and res2

And I overlapped these lists of genes using a Venn Diagram function to see how many were actually affected only from my treatment.

Now, I doubt that this comparison is actually not correct because I should only use the results from my LRT test, without considering the ones obtained via the Wald test.

It would be great to get more insights on this topic. Please let me know if there is any information missing that might help the answer.

Looking forward to your feedback

Thank you in advance :)

significant LRT DESeq2 genes RNA-Seq • 3.7k views
ADD COMMENT
1
Entering edit mode

Your first contrast "condition_Treat_vs_vehicle" is already looking for the effect of treatment while controlling for genotype. Since you only have two factor levels in condition the LRT test should give you similar results to the Wald test used in results, since the Wald test approximates the LRT, but is designed for pairwise comparisons.

ADD REPLY
1
Entering edit mode

The LRT might be a good choice if you have many pairwise comparisons but still want a single results table to look for genes that change over this large set of groups. For small number of groups, like three or so I would always use the Wald test because it tests against a fold change and therefore allows to filter for a minimal effect size. The LRT compares the goodness-of-fit, not fold changes. For stats people that might be intuitive, for me it never was, my brain is centered on fold changes.

ADD REPLY
0
Entering edit mode

thank you for your replies.. Actually, when I try to compare the significant genes obtained from results LRT, res and res2. I do not have the same overlapping as you can see in the picture Venn Diagram with res = Treat, res2= Geno, all genes selected for padj <0.05

ADD REPLY
1
Entering edit mode

Your LRT model excluding Treatment and your Treatment Wald test results share almost all of the same DEGs, which is as expected.

ADD REPLY
0
Entering edit mode

Oh ok, I see, thanks. So, for my question (DE genes upon treatment), I can focus my attention on the 423 that are affected only by a treatment effect, am I correct? And what about the 6 genes only in the Treatment? I am sorry for the many questions, this is the first set of data I am trying to analyze by myself and I want to make sure to get the right interpretation :)

ADD REPLY
2
Entering edit mode

In your case you don't need to run the LRT. Simply use the results from the treatment contrast. Make sure to use the log fold shrinkage method for DESeq2 also, and ideally setting a log2 fold change threshold within the function itself.

ADD REPLY
0
Entering edit mode

if I have a single factor which Subtype and it has 6 levels which is M0 to M5.

I have done the following

 dds <- DESeqDataSetFromMatrix(countData=rpkm_ordered, colData=coldata, design= ~ FAB)# mod18
ddsLRT <- DESeq(dds, test="LRT", reduced= ~ 1, parallel = TRUE)
resultsNames(ddsLRT)
resultsNames(ddsLRT)
[1] "Intercept"    "FAB_M1_vs_M0" "FAB_M2_vs_M0" "FAB_M3_vs_M0" "FAB_M4_vs_M0" "FAB_M5_vs_M0"

Since I did this as I had quite a number of pairwise comparison which in fact I did but at the same I ran this LRT.

How should I interpret this here my reference level is set to M0 sub-type. If I have to find genes that are differential across all sub-type how should I go ahead?

Should I combine all the genes in each of these "FAB_M1_vs_M0" "FAB_M2_vs_M0" "FAB_M3_vs_M0" "FAB_M4_vs_M0" "FAB_M5_vs_M0"

I would like to get a suggestion

ADD REPLY

Login before adding your answer.

Traffic: 1946 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6