Entering edit mode
3.8 years ago
luzglongoria
▴
50
Hi there,
I am analysing some RNA expression from an experiment with a set up like:
sample condition
R1 A
R2 A
R3 A
C1 B
C2 B
C3 B
I have compared RNA expression between conditions It is mean : (R1,R2,R3) vs (C1,C2,C3) by doing:
library(DESeq2)
library(tidyverse)
#### Load data
library(readxl)
setwd("~/Documents/path/to/txt/file/")
data= read.table("Expression_level.txt", header = T)
View(data)
R1 R2 R3 C1 C2 C3
gene-CpipJ_CPIJ008101 484021 412077 445173 154707 148776 169263
gene-CpipJ_CPIJ001132 334997 391789 435968 445623 504466 445865
gene-CpipJ_CPIJ006209 326414 260289 301946 169859 149214 141446
gene-CpipJ_CPIJ002271 320207 282722 326901 203648 170398 134834
gene-CpipJ_CPIJ005941 316818 252593 273103 55266 43730 26304
gene-CpipJ_CPIJ009303 269236 357244 386633 426546 531801 483546
gene-CpipJ_CPIJ010326 233568 226659 254108 362953 278742 325969
gene-CpipJ_CPIJ008915 230936 276916 277624 355937 357974 239651
gene-CpipJ_CPIJ009571 223388 187980 207711 128457 139515 87437
annotation.info <- read.table("~/Documents//path/to/txt/file/",header = T)
Viewannotation.info)
sample condition
R1 A
R2 A
R3 A
C1 B
C2 B
C3 B
## Create Data Set
dds <- DESeqDataSetFromMatrix(countData = data,
colData = annotation.info,
design = ~ condition)
#do the analyses
dds <- DESeq(dds)
res <- results(dds)
res
## I keep only differentially expressed genes
subset(res,padj<0.05)->subset
summary(subset)
out of 7420 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up) : 3751, 51%
LFC < 0 (down) : 3669, 49%
outliers [1] : 0, 0%
low counts [2] : 0, 0%
(mean count < 1)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results
BUT I want to know which of these up and down regulated genes belong to the condition A and B. I mean, how many genes are up and down regulated for each condition.
Is there any way of get this information?
Thank you so much
What you see there is the condition A compared to condition B, so the differentially expressed genes are the ones changing from condition B to A (DESeq2 selects the factors alphanumerically, meaning that your reference here will be A). You can check the comparison you have done by using the function
resultsNames()
thank you for your answer. If I do:
However, this doesn't show the genes that belong to each condition.
As I was telling you earlier, you are there comparing the condition B to the condition A. Therefore, the genes you see up/downregulated are the genes that change in B in relation to A. If they are upregulated, that means that those genes are more transcribed in B, whereas having genes downregulated means that they are more abundant in A condition.
Hope it helps,
Jordi
Thanks! Yes, it helps :)
You can check your
subset
table and see the signal in the log fold change. Genes with log foldchange > 0 are overexpressed in B. In addition, in the results function, you should add alpha = 0.05, since DESeq2 performs filtering to optimizing the number of genes which will have an adjusted p-value below a given FDR cutoff. Meaning that you have to specify your FDR cutoff prior to the statistical test. Read more about it here: