Hello everyone, I am trying to perform a differential expression analysis. I imported the featureCounts data, removed the unwanted columns, and converted the result to a matrix. After running the code to create the DESeqDataSet, I got this message: "Warning message: In DESeqDataSet(se, design = design, ignoreRank) : all genes have equal values for all samples. will not be able to perform differential analysis"
I more or less ignored it because I did not really understand what it meant. I went ahead and estimated the size factors, and got a value of 1 for every sample. Below is what I did:
library(DESeq2)
library(Biobase)
# load the count table written by featureCounts
countData <- read.table("/home/mlsi/RNASeq/countTable/featureCounts.txt", header = TRUE, row.names = 1)
# drop the first five annotation columns, keeping only the per-sample counts
deleteColumnCountdata <- countData[, -(1:5)]
colnames(deleteColumnCountdata)
# remove the path prefix and the .bam or .sam suffix from the column names
colnames(deleteColumnCountdata) <- gsub ("\\X.home.mlsi.RNASeq.mapping.","",colnames(deleteColumnCountdata))
colnames(deleteColumnCountdata) <- gsub ("\\.UHR_[123].bam","",colnames(deleteColumnCountdata))
colnames(deleteColumnCountdata) <- gsub ("\\.HBR_[123].bam","",colnames(deleteColumnCountdata))
colnames(deleteColumnCountdata)
class(deleteColumnCountdata)
head(deleteColumnCountdata)
# convert 'deleteColumnCountdata' to a matrix
newCountsData<-as.matrix(deleteColumnCountdata)
head(newCountsData)
group<- factor(c(rep("UHR",3), rep("HBR",3)))
con<- factor(c(rep("cancer",3), rep("normal",3)))
# construct a data frame of sample information
countDataDataFrame<- data.frame(row.names = colnames(newCountsData), group , con)
head(countDataDataFrame)
#instantiate the DESeq dataset
dds<- DESeqDataSetFromMatrix(countData =newCountsData, colData =countDataDataFrame, design = ~ con)
"Warning message:
In DESeqDataSet(se, design = design, ignoreRank) :
all genes have equal values for all samples. will not be able to perform differential analysis"
dds<-estimateSizeFactors(dds)
sizeFactors(dds)
HBR_1 HBR_2 HBR_3 UHR_1 UHR_2 UHR_3: all the samples have a size factor of 1
Is this correct? I ask because I read that size factors are usually close to 1.
Then I went ahead with pre-filtering: dds <- dds[rowSums(counts(dds)) > 1, ]
I tried rlogTransformation because I want to have a heatmap/clustering of the data:
rld<- rlogTransformation(dds)
I got this error:
Error in estimateDispersionsFit(object, fitType, quiet = TRUE) :
all gene-wise dispersion estimates are within 2 orders of magnitude
from the minimum value, and so the standard curve fitting techniques will not work.
One can instead use the gene-wise estimates as final estimates:
dds <- estimateDispersionsGeneEst(dds)
dispersions(dds) <- mcols(dds)$dispGeneEst
...then continue with testing using nbinomWaldTest or nbinomLRT
I also tried rlog and log, but got the same error. I tried to follow the suggestion outlined in the error message, but I don't think it will get me to my goal.
I even tried DESeq(dds) and got this error:
estimating size factors
estimating dispersions
Error in .local(object, ...) :
all genes have equal values for all samples. will not be able to perform differential analysis
Where am I going wrong?
I would appreciate any solutions and suggestions.
Regards, Anthony
What is the output of head(countData, 10)?
This message suggests that the counts for every gene are identical across all samples; this needs to be investigated before going any further.
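One way to investigate this is to summarise how much variation the matrix actually contains. The following is a minimal base R sketch, not a definitive diagnosis: `toy_counts` is made-up data standing in for the matrix passed to `DESeqDataSetFromMatrix`, and `describe_counts` is a hypothetical helper name. It assumes a numeric matrix; a matrix that was coerced to character would need to be fixed first.

```r
# Toy stand-in for the real count matrix (values are made up for illustration)
toy_counts <- matrix(
  c(0, 0, 0, 0,
    5, 9, 2, 7,
    0, 0, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("S1", "S2", "S3", "S4"))
)

# Hypothetical helper: summarise how much variation a numeric count matrix has
describe_counts <- function(m) {
  list(
    storage_mode   = storage.mode(m),               # expect "integer" or "double"
    n_distinct     = length(unique(as.vector(m))),  # 1 means every entry is identical
    library_sizes  = colSums(m),                    # all zeros means nothing was counted
    # number of genes whose counts differ between at least two samples
    n_varying_rows = sum(apply(m, 1, function(x) length(unique(x)) > 1))
  )
}

toy_summary <- describe_counts(toy_counts)
str(toy_summary)
```

If `n_varying_rows` came out as 0 on the real matrix, that would be consistent with the message above, and the next step would be to look at the featureCounts file itself rather than at DESeq2.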
There is a markdown code option (the 10101 button); please use it from now on to highlight code.
@ATpoint, thanks for the response. That's the output of head(countData)
Looks like featureCounts output. Now you have to find out whether all entries are indeed zero and if so why. Maybe the GTF you used is different from the reference genome you aligned against in terms of chromosome names?
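To illustrate the chromosome-name check, here is a small base R sketch. The two vectors are assumed example values, not taken from this thread: in practice one would fill them from the first column of the GTF and from the sequence names in the BAM header.

```r
# Assumed example values (hypothetical): sequence names from the GTF and the BAM
gtf_chroms <- c("chr1", "chr2", "chrX", "chrM")
bam_chroms <- c("1", "2", "X", "MT")

# Names present in both; an empty result means no read can be assigned to a gene
shared <- intersect(gtf_chroms, bam_chroms)
length(shared)

# Strip a leading "chr" to see whether the prefix is the only difference
normalised <- sub("^chr", "", gtf_chroms)
still_unmatched <- setdiff(normalised, bam_chroms)
still_unmatched
```

In this toy case nothing matches until the "chr" prefix is removed, and the mitochondrial name still differs afterwards ("M" versus "MT").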
@ATpoint, I ran the command below on Ubuntu to see the last 10 lines and, as you can see, not all the entries are zero.
Also, after deleting the unwanted columns and converting to a matrix, this is the output:
If you want to check if there are any genes that don't have a count of 0 across all samples.
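A base R version of that check avoids any dependency on extra packages. This is a minimal sketch, with a made-up toy table (`toy`) standing in for the count columns of the featureCounts output:

```r
# Toy stand-in for the per-sample count columns (values are made up)
toy <- data.frame(
  HBR_1 = c(0, 12, 0),
  HBR_2 = c(0, 15, 0),
  UHR_1 = c(0, 3, 1),
  row.names = c("geneA", "geneB", "geneC")
)

# TRUE for genes with at least one non-zero count in any sample
nonzero <- rowSums(toy) > 0

sum(nonzero)                  # how many genes have any counts at all
toy[nonzero, , drop = FALSE]  # the rows for those genes
```

On the real data, the same two lines applied to the count columns would show whether any gene has counts and, if so, what they look like across samples.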
Hi rpolicastro, thanks for your response. I tried to install the "tidyverse" package, but the installation failed, even after several attempts. I thought an old version of R might be the reason, but I have the new version. However, I was able to read the last 10 lines from the Ubuntu command line, as you can see above.