one condition

Question

Generating DE genes of a time series

0

Entering edit mode

2.4 years ago

Nal23 ▴ 10

Hi, I'm trying to follow the following tutorial (https://scienceparkstudygroup.github.io/rna-seq-lesson/08-cluster-analysis/index.html) to generate the differentially expressed genes in a developmental timeline and perform gene cluster analysis but I'm struggling with editing the script for my data and was hoping someone might be able to help me. I'm struggling with creating the list of DE genes and the normalized counts for these DE genes with the loop used (section 2.2) and was wondering if this could be done in a different way - more manually by generating the DE gene lists of each of my interested combinations. Any help will be much appreciated.

DESeq RNASeq • 872 views

ADD COMMENT • link 2.4 years ago by Nal23 ▴ 10

0

Entering edit mode

Sorry I wasn't clear about the problem. I'm unclear how to modify the code to loop through the different combinations of different expression analyses using the loop. I've copied the code that the tutorial used below. I don't have any columns to discard so I modified that. I also want to compare my samples by time so that would be my condition (I modified the "conditions" in the script with "Time"). I wasn't sure if I had to modify anything else in the script - I don't have much experience with loops so wanted to see if anything might be obvious to someone else. I've pasted the error I get at the end.

normalize <- function(countdata, xp_design, f, p){ col.names = colnames(xp_design) # extract all column names columns.to.discard = c("sample","fq1","fq2","fq") # column we don't need to extract all conditions colsForConditions = col.names[! col.names %in% columns.to.discard] # only keeping the column of interest

one condition

if (length(colsForConditions) == 1){ condition <- factor(xp_design[,colsForConditions])

# two conditions

} else if (length(colsForConditions) == 2){

# two conditions --> make a third column that is the product of the two
xp_design$conditions = paste(xp_design[,colsForConditions[1]],xp_design[,colsForConditions[2]],sep = ".")
condition <- factor(x = xp_design[,"conditions"],levels = unique(xp_design[,"conditions"]))

} else if (length(colsForConditions) == 3){ xp_design$conditions = paste(xp_design[,colsForConditions[1]],xp_design[,colsForConditions[2]],xp_design[,colsForConditions[3]],sep = ".") condition <- factor(x = xp_design[,"conditions"],levels = unique(xp_design[,"conditions"])) }

Analysis with DESeq2

Create a coldata frame and instantiate the DESeqDataSet. See ?DESeqDataSetFromMatrix

coldata <- data.frame(row.names=colnames(countdata), condition) dds <- DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design=~condition)

Run the DESeq pipeline

dds <- DESeq(dds)

create dataframe containing normalized counts, to which the differential expression values will be added

resdata <- as.data.frame(counts(dds, normalized=TRUE))

iterate through the list of conditions to create differential expression (DE) values for all possible pairs

total_selec <- list() x <- 1 for(i in levels(condition)){ x <- x + 1 if (x <= length(levels(condition))){ for(j in x:length(levels(condition))){ res <- results(dds, contrast=c("condition", i, levels(condition)[j])) # i = first in pair, levels(condition)[j] is the second in pair. d <- paste(i, levels(condition)[j], sep="&") # paste the two conditions in one character, to be used as the pair name res$genenames <- rownames(res) resul <- as.data.frame(res) significantDE <- resul %>% filter(padj<p & (log2FoldChange>f | log2FoldChange<(-1*f)) ) selec <- as.list(significantDE$genenames) total_selec <- append(total_selec, selec) } } } total_selec <- c(unique(total_selec)) total_selec <- t(as.data.frame(total_selec)) selection <- resdata[total_selec[,1],] return(selection) }

DEgenes <- normalize(counts,xp_design,2,0.05)

Error in data.frame(row.names = colnames(countdata), condition) : row names supplied are of the wrong length 3. stop("row names supplied are of the wrong length") 2. data.frame(row.names = colnames(countdata), condition) 1. normalize(counts, xp_design, 2, 0.05)

ADD REPLY • link 2.4 years ago by Nal23 ▴ 10

score 1 · Answer 1 · 2022-09-27

1

Entering edit mode

2.4 years ago

swbarnes2 14k

I'd forget the loop. Lay out explicitly what you are comparing to what, and what you are naming those results. You can get cute with loops later.

ADD COMMENT • link 2.4 years ago by swbarnes2 14k