Hi, I am working on RNA-Seq data for Arabidopsis plant. I have samples from 4 genotypes (WT, KO1, KO2, OE) with 3 replicates for each. I used STAR for alignment with reference genome and annotation file in gff3 format. Then I obtained count.txt files for each sample by using featureCounts. Now I am doing the DGE analysis by DESeq2 as:
countdata <- read.table("counts.txt", header=TRUE, row.names=1)
countdata <- countdata[ ,6:ncol(countdata)]
colnames(countdata) <- gsub("\\.[sb]am$", "", colnames(countdata))
countdata <- as.matrix(countdata)
head(countdata)
(condition <- factor(c(rep("WT", 3), rep("KO1", 3), rep("KO2", 3), rep("OE", 3))))
library(DESeq2)
(coldata <- data.frame(row.names=colnames(countdata), condition))
dds <- DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design=~condition)
dds
dds <- DESeq(dds)
I have following questions: 1. I used annotation file gff3 for alignment and for featureCounts, I converted gff3 to gtf format. Is it the right way or it would make any impact for DGE analysis? 2. I used SAM files in featureCounts. Is it okay or I should use BAM files? 3. In DESeq2 I am getting the results for each replicate individually. How can I merge the replicates for one genotype to compare with the other genotypes. I am not sure if the code I am using is right?
Any guidance from you will be very helpful.
Thank you
Thank you @i.sudbery. It worked. Its showing:
out of 25686 with nonzero total read count adjusted p-value < 0.05 LFC > 1.00 (up) : 627, 2.4% LFC < -1.00 (down) : 2473, 9.6% outliers [1] : 0, 0% low counts [2] : 2944, 11% (mean count < 1) [1] see 'cooksCutoff' argument of ?results [2] see 'independentFiltering' argument of ?results
Can you please tell me how can I extract the upregulated and downregulated genes?
the
deseq_results
object should have two columns: one for adjusted pvalues (I can't remember if its calledpadj
oradjustedPVal
off the top of my head) and one for log2FoldChange. The up genes will be those with adjustedPVal < 0.05 and log2FoldChange > 1, and the down genes will be those with adjustedPVal < 0,05 and log2FoldChange < -1.Thank you @i.sudberry. It was really helpful for me.