Hi, I am looking to perform differential gene expression analysis using R and GEO datasets, but I couldn't find count matrices in the datasets; they contain only raw data that needs preprocessing. Can anyone help me use the DESeq2 library to preprocess a GEO dataset for differential gene expression analysis?
Just to jump on this thread rather than start a new one: I am having trouble using an eset in DESeq2. I downloaded an ExpressionSet and converted it into a SummarizedExperiment object, but creating a DESeqDataSet from it gives me an error about the counts. Here is the code and its output:
> ddse <- DESeqDataSet(summ_exp, design = ~ title)
renaming the first element in assays to 'counts'
Error in DESeqDataSet(summ_exp, design = ~title) :
some values in assay are not integers
Use these and feed them into DESeq2 with DESeqDataSetFromMatrix; follow the manual for this.
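As a rough illustration of that route, here is a minimal sketch of building a DESeqDataSet from a raw count matrix. The count matrix, the sample table, and the column name `condition` are made-up toy data, not taken from any GEO series; with real data you would read in the raw counts supplied by the submitter or produced by your own quantification.

```r
library(DESeq2)

# Toy data (an assumption for illustration): 100 genes x 6 samples of raw counts
set.seed(1)
gene_means <- 2^runif(100, min = 4, max = 10)
counts <- matrix(rnbinom(600, mu = rep(gene_means, 6), size = 2),
                 nrow = 100,
                 dimnames = list(paste0("gene", 1:100), paste0("sample", 1:6)))

# Sample table: row names must match the column names of the count matrix
coldata <- data.frame(condition = factor(rep(c("control", "treated"), each = 3)),
                      row.names = colnames(counts))

# DESeq2 expects non-negative integers; normalised or log-transformed values
# trigger the "some values in assay are not integers" error
stopifnot(all(counts >= 0), all(counts == round(counts)))

dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData = coldata,
                              design = ~ condition)
dds <- DESeq(dds)      # size factors, dispersions, model fit
res <- results(dds)    # log2 fold changes and adjusted p-values
head(res[order(res$padj), ])
```

If the integer check fails on your own data, the values are most likely already normalised, and rounding them is not a substitute for the raw counts.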
You are still trying to use a microarray function (getGEO) for RNA-seq data. RNA-seq counts are not available through this function. Check the content of gse: it has zero rows, so no genes = no counts.
You have to download the FASTQ files (see e.g. "Fast download of FASTQ files from the European Nucleotide Archive (ENA)") and then follow e.g. this Bioconductor RNA-seq workflow: https://www.bioconductor.org/packages/devel/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html
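To see for yourself whether an object returned by getGEO carries any expression values, something along these lines may help. `has_expression_rows` is a hypothetical helper name made up for this sketch, and the accession in the commented usage is a placeholder, not a real series.

```r
library(Biobase)

# Hypothetical helper: TRUE if the ExpressionSet holds at least one feature row
has_expression_rows <- function(eset) {
  nrow(exprs(eset)) > 0
}

# Usage with GEOquery (needs network access; replace the placeholder accession):
# library(GEOquery)
# gse <- getGEO("GSExxxxx", GSEMatrix = TRUE)  # returns a list of ExpressionSets
# sapply(gse, has_expression_rows)             # FALSE means no values to analyse
```

A FALSE here means the series matrix only holds sample metadata, so the counts have to come from elsewhere, for example the supplementary files of the series or your own processing of the FASTQ files.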
Which dataset?
GSE3821_series_matrix