11.0 years ago
xiaojuhu13
I only have two samples, without replicates, for the DESeq analysis, but the results look abnormal: most FDR values equal 1.
> counts = read.table(file="48_50_1", header=T, row.names=1)
> my.design<-data.frame(row.names=colnames(counts),condition=c("L","H"))
> conds <- factor(my.design$condition)
> cds <- newCountDataSet( counts, conds )
> cds <- estimateSizeFactors( cds )
> sizeFactors( cds )
low high
0.9225312 1.0839742
> cds<-estimateDispersions(cds, method='blind',sharingMode='fit-only')
> cds<-nbinomTest(cds,"L","H")
> head(cds)
     id baseMean baseMeanA baseMeanB foldChange log2FoldChange pval padj
1   23B        0         0         0        NaN            NaN   NA   NA
2 5HT2A        0         0         0        NaN            NaN   NA   NA
3  A1BG        0         0         0        NaN            NaN   NA   NA
4  A1CF        0         0         0        NaN            NaN   NA   NA
5   A2M        0         0         0        NaN            NaN   NA   NA
6 A2ML1        0         0         0        NaN            NaN   NA   NA
After trimming the zero-count rows, 332 gene IDs remain in total, and only 6 of them have a padj that is not equal to 1.
As with your post "Edger Results Without Replicates, Fdr Looks Unnormal": why do you find this unusual? Without replicates, you have almost no power to detect anything.
Yeah, after trimming the rows with pval=NA, only 332 were left. The total is more than 20,000 genes.
That alone seems a bit odd; I've never had a library cover so few genes. You might look at the alignments to see if they're wonky.
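One quick way to check how many genes actually received reads, before looking at the alignments themselves, is to summarise the count table directly. This is only a sketch in base R: the `counts` data frame below is a made-up stand-in for the table read from the "48_50_1" file, and the gene names `GENE_A` to `GENE_C` and all values are hypothetical.

```r
# Toy stand-in for the real count table (values are made up for illustration).
counts <- data.frame(
  L = c(0, 0, 12, 250, 0, 3),
  H = c(0, 0, 15, 310, 0, 0),
  row.names = c("23B", "5HT2A", "GENE_A", "GENE_B", "A1BG", "GENE_C")
)

# Per-sample library size (total reads assigned to genes).
lib_sizes <- colSums(counts)

# Genes with at least one read in any sample.
detected <- rowSums(counts) > 0
n_detected <- sum(detected)

# Fraction of genes detected; a very low value on real data hints at
# alignment or annotation problems.
frac_detected <- mean(detected)

print(lib_sizes)
print(n_detected)
print(frac_detected)
```

On the real table, a detected fraction anywhere near 332 out of 20,000+ would support the suspicion that something went wrong upstream of DESeq.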
For the rows with NA p-values, you'll also see that your fold change values are NaN (Not a Number) and your base means are 0. NaN is what floating-point arithmetic returns for an undefined operation such as 0/0, and that is exactly what the fold change (baseMeanB divided by baseMeanA) becomes when both base means are zero. Given the base means of zero, I would assume those are all genes for which you simply have no read coverage.
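To illustrate how zero base means lead to NaN fold changes, here is a small base R sketch. The vectors `base_mean_a` and `base_mean_b` are made-up values rather than output from the poster's data, and the fold change is taken as B divided by A, which I am assuming matches the `foldChange` column.

```r
# Hypothetical base means for three genes in conditions A and B.
base_mean_a <- c(0, 0, 10)
base_mean_b <- c(0, 5, 20)

# 0/0 is undefined and yields NaN; 5/0 yields Inf; 20/10 is an ordinary ratio.
fold_change <- base_mean_b / base_mean_a
log2_fc <- log2(fold_change)

print(fold_change)
print(log2_fc)

# Flag genes with no coverage in either condition.
no_coverage <- base_mean_a == 0 & base_mean_b == 0
print(no_coverage)
```

Genes flagged this way carry no information for the test, so dropping them before looking at the p-values is reasonable.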
I suspect something wonky is going on with your dataset, as suggested. Also, of course, there will be a power issue because of the lack of replicates, so you may not want to read too much into the p-values; you'll just have lots of potential false positives in your dataset.
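As for why so many adjusted p-values come out at exactly 1, here is a minimal sketch in base R. It assumes `padj` is a Benjamini-Hochberg adjustment of `pval` (which, as far as I know, is what `nbinomTest` applies via `p.adjust`). The raw p-values below are invented for illustration.

```r
# Made-up raw p-values: one small, the rest unremarkable.
pval <- c(0.001, 0.2, 0.5, 0.8, 0.9, 1)

# Benjamini-Hochberg adjustment; adjusted values are capped at 1.
padj <- p.adjust(pval, method = "BH")

# Count how many adjusted values hit the cap.
n_at_one <- sum(padj == 1)

print(padj)
print(n_at_one)
```

When most raw p-values are large, as expected with no replicates and little power, the adjustment pushes them up to the cap, so a column of ones is not a software error in itself.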