Why does DESeq2 give negative log2 fold change in treatment vs control for some genes when normalized counts are higher in treatment? (Already checked my condition factors are correct)
1
0
Entering edit mode
5 months ago
mropri ▴ 160

Hi,

I have ran DESeq2 to get log2 fold changes of hypertrophy samples relative to control. (Added my code below). I think I am setting my contrast correct of Hypertrophy vs. control correctly while getting results. My log2 fold change results are mostly in concordance with what I observe in normalized counts. Meaning, for a gene, if the control samples have a lower mean than the hypertrophy samples, they show a positive log fold change and vice versa However, for some genes the results are conflicting. For example, The gene ATP5D is shown to have a negative log2 fold change of -0.48 but when I look at the normalized counts of the gene, the gene clearly has higher mean in the hypertrophy samples than the control (shown below). Or LSM7 has a negative log2 fold change of -0.5 while the normalized counts of Hypertrophy samples show a higher mean. On the other hand DSC3 has a negative log2 fold change of -2.13 and that can be observed in the normalized counts (shown below). I had asked this in the BioConductor forum but have not gotten any response. Would appreciate any insight into why this might occur from the community here. Very appreciative of any help. Thank you! I have also included a picture of my metadata.

enter image description here

# Create DESeq2 object        
  dds <- DESeqDataSetFromMatrix(counts, 
                                colData = md, 
                                design = ~PCNAscore + batch + group)


  # Run DESeq2 differential expression analysis
  dds <- DESeq(dds)

# Check the coefficients for the comparison
  resultsNames(dds)

  # Generate results object
  #res <- results(dds)
  res <- results(dds, contrast=c("group","HYPER","CTRL"), alpha = 0.05)

[1] "Intercept"              "PCNAscore"              "batch_batch2_vs_batch1" "batch_batch3_vs_batch1"
[5] "group_HYPER_vs_CTRL"

  # Shrink the log2 fold changes to be more appropriate using the apeglm method - should cite [paper]() when using this method
  res <- lfcShrink(dds, 
                   coef = "group_HYPER_vs_CTRL",
                   res=res,
                   type = "apeglm")
  res_tbl <- res %>%
    data.frame() %>%
    rownames_to_column(var = "gene") %>%
    as_tibble() %>%
    arrange(padj)

sessionInfo()

enter image description here

enter image description here

enter image description here

DESeq2 • 863 views
ADD COMMENT
2
Entering edit mode
5 months ago

Counts are not adjusted by the other factors considered in the model (i.e. PCNAscore and batch in your case), which is the most likely explanation.

If you want to generate corrected counts for visualization, you might check out ComBat-seq in the sva package.

Alternatively, your boxplots could be plotting counts for the wrong sets of samples, but we can't really judge that without code. As an aside, plotting counts on a log2 scale tends to be preferable.

ADD COMMENT
0
Entering edit mode

Thank you for your help. Will check the Combat seq corrected counts to see how it adjusts the box plots. I did check my box plots to make sure the correct samples were plotted for each category. Thank you again for your help.

ADD REPLY

Login before adding your answer.

Traffic: 2169 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6