Question

Select and keep only some results from DESEq2

0

Entering edit mode

3.7 years ago

luzglongoria ▴ 50

Hi there,

I am analysing some RNA expression from an experiment with a set up like:

sample     condition       
    R1         A                  
    R2         A                 
    R3         A                            
    C1         B                  
    C2         B                 
    C3         B

I have compared RNA expression between conditions It is mean : (R1,R2,R3) vs (C1,C2,C3) by doing:

library(DESeq2)
library(tidyverse)

#### Load data
library(readxl)

setwd("~/Documents/path/to/txt/file/")
data= read.table("Expression_level.txt", header = T)
View(data)
                          R1   R2     R3       C1     C2       C3
gene-CpipJ_CPIJ008101 484021 412077 445173  154707  148776  169263
gene-CpipJ_CPIJ001132 334997 391789 435968  445623  504466  445865
gene-CpipJ_CPIJ006209 326414 260289 301946  169859  149214  141446
gene-CpipJ_CPIJ002271 320207 282722 326901  203648  170398  134834
gene-CpipJ_CPIJ005941 316818 252593 273103   55266   43730   26304
gene-CpipJ_CPIJ009303 269236 357244 386633  426546  531801  483546
gene-CpipJ_CPIJ010326 233568 226659 254108  362953  278742  325969
gene-CpipJ_CPIJ008915 230936 276916 277624  355937  357974  239651
gene-CpipJ_CPIJ009571 223388 187980 207711  128457  139515   87437

annotation.info <- read.table("~/Documents//path/to/txt/file/",header = T)

  Viewannotation.info)
  sample     condition       
    R1         A                  
    R2         A                 
    R3         A                  
    C1         B                  
    C2         B                  
    C3         B               

## Create Data Set
dds <- DESeqDataSetFromMatrix(countData = data,
                              colData = annotation.info,
                              design = ~ condition)

#do the analyses
dds <- DESeq(dds)
res <- results(dds)
res

## I keep only differentially expressed genes
subset(res,padj<0.05)->subset
summary(subset)

out of 7420 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up)       : 3751, 51%
LFC < 0 (down)     : 3669, 49%
outliers [1]       : 0, 0%
low counts [2]     : 0, 0%
(mean count < 1)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

BUT I want to know which of these up and down regulated genes belong to the condition A and B. I mean, how many genes are up and down regulated for each condition.

Is there any way of get this information?

Thank you so much

RNA-Seq DESEq2 results R • 1.3k views

ADD COMMENT • link 3.7 years ago by luzglongoria ▴ 50

1

Entering edit mode

What you see there is the condition A compared to condition B, so the differentially expressed genes are the ones changing from condition B to A (DESeq2 selects the factors alphanumerically, meaning that your reference here will be A). You can check the comparison you have done by using the function resultsNames()

ADD REPLY • link 3.7 years ago by jordi.planells ▴ 480

0

Entering edit mode

thank you for your answer. If I do:

 resultsNames(dds)
[1] "Intercept"                         
[2] "condition_B_vs_A"

However, this doesn't show the genes that belong to each condition.

ADD REPLY • link 3.7 years ago by luzglongoria ▴ 50

0

Entering edit mode

As I was telling you earlier, you are there comparing the condition B to the condition A. Therefore, the genes you see up/downregulated are the genes that change in B in relation to A. If they are upregulated, that means that those genes are more transcribed in B, whereas having genes downregulated means that they are more abundant in A condition.
Hope it helps,
Jordi

ADD REPLY • link 3.7 years ago by jordi.planells ▴ 480

0

Entering edit mode

Thanks! Yes, it helps :)

ADD REPLY • link 3.7 years ago by luzglongoria ▴ 50

1

Entering edit mode

You can check your subset table and see the signal in the log fold change. Genes with log foldchange > 0 are overexpressed in B. In addition, in the results function, you should add alpha = 0.05, since DESeq2 performs filtering to optimizing the number of genes which will have an adjusted p-value below a given FDR cutoff. Meaning that you have to specify your FDR cutoff prior to the statistical test. Read more about it here:

ADD REPLY • link 3.7 years ago by Rogerio Ribeiro ▴ 110