7 months ago
doramora
Hello everybody!
I have a problem with the p-values in my experiment. I have 8 RNA-seq samples (with 4 repeats of each sample): samples 1-4 are WT and samples 5-8 have a gene knock-down. I built one big file with all the counts and started analyzing it with DESeq (the R code is below). I did no normalization, because I read that DESeq does it automatically. The number of genes with p-value == 0 or p-value < 1e-309 is crazy. *I've checked these genes in the counts and they really do differ a lot, e.g. 90 vs. 800 reads. What could be wrong? I'm new to data analysis.
Thank you!
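library(DESeq2)
library(tibble)  # provides as_tibble()

# Raw, unnormalized counts go in; DESeq() estimates size factors itself
dds <- DESeqDataSetFromMatrix(countData = cts_1,
                              colData = coldata,
                              design = ~ Cell_type)
dds <- DESeq(dds)
res <- results(dds, tidy = TRUE)
res <- as_tibble(res)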
Can you show the output of summary(res)? Possibly before you convert it to a tibble? We need to know what you mean by "crazy amount". Also, please clarify whether you are talking about p-values or adjusted p-values.
I'm talking about both. I've attached an image to clarify.
I have 7666 genes; for 5393 of them the p-value is < 0.05, and for 2876 of them it is lower than 0.000000001. I calculated -log10(FDR) for my data and it ranges from 0 to 300. It's scary.
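Counts like the ones quoted above can be reproduced with a few lines of base R. This is only a minimal sketch, assuming a results table with `pvalue` and `padj` columns; the data frame `res_toy` and all its values are invented for illustration and are not taken from the thread. It also shows why exact zeros appear: a p-value too small for double precision is stored as 0, so its -log10 is infinite unless it is capped.

```r
# Toy results table; all values are invented for illustration only
res_toy <- data.frame(
  gene   = paste0("gene", 1:5),
  pvalue = c(0, 1e-12, 3e-4, 0.04, 0.6),
  padj   = c(0, 5e-12, 7.5e-4, 0.05, 0.6)
)

# Number of genes below each threshold (NA p-values are ignored)
n_p05  <- sum(res_toy$pvalue < 0.05, na.rm = TRUE)
n_p1e9 <- sum(res_toy$pvalue < 1e-9, na.rm = TRUE)

# -log10(0) is Inf, so cap zeros at the smallest normalized positive double
floor_p <- .Machine$double.xmin
neg_log10_fdr <- -log10(pmax(res_toy$padj, floor_p))

print(c(n_p05 = n_p05, n_p1e9 = n_p1e9))
print(round(neg_log10_fdr, 2))
```

Capping at `.Machine$double.xmin` is just one possible choice for plotting; it puts a ceiling of roughly 308 on -log10(FDR), which fits with values topping out around 300.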
Looks like an in vitro experiment with cell lines, right? These can sometimes show thousands of DEGs because of the many unspecific effects of the knock-down. Also, you say that you have "4 repeatings for each sample". Are these technical replicates? If so, it may be better to add them together or model them, rather than treating them as independent the way you would with biological replicates.
Could you please explain the difference between technical and biological replication? I've been provided with RNA-seq data from 8 samples, each of which was sequenced 4 times to increase accuracy. Does this constitute technical replication? If so, would it be advisable to sum the replicates, take their mean, or use another method for each set? *I apologize for any naive questions, I'm relatively new to this field and learning as I go. Additionally, if you're aware of any online courses or recommended reading materials, I would greatly appreciate it. Thus far, I've struggled to find comprehensive information and have been piecing it together from fragmented sources.
Sometimes the line between biological and technical replicates is a bit fuzzy. In this case it seems you have technical replicates, because it is the same sample, resequenced. See this answer by the DESeq2 author in which he recommends adding them. I think there is a DESeq2 function for this (collapseReplicates) but I've not used it. Other very informative answers on handling replicates in RNA-seq: this one, and, for other workflows which allow more complex modelling (limma), this other one.
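To make "adding them" concrete, here is a minimal base-R sketch of summing technical replicates per sample before building the DESeq2 object. The matrix `cts_toy`, its values, and the `sample_id` vector are hypothetical; in the real data the mapping from sequencing runs to samples would have to come from the sample sheet.

```r
# Toy count matrix: 2 genes x 4 sequencing runs; values invented
cts_toy <- matrix(c(10, 12, 11,  9,
                    90, 95, 800, 790),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("geneA", "geneB"),
                                  c("s1_run1", "s1_run2", "s5_run1", "s5_run2")))

# Hypothetical mapping from each run (column) to its biological sample
sample_id <- c("s1", "s1", "s5", "s5")

# rowsum() sums rows by group: transpose, sum per sample, transpose back
cts_collapsed <- t(rowsum(t(cts_toy), group = sample_id))
print(cts_collapsed)
```

The replicates are summed rather than averaged because DESeq2 expects raw integer counts. On an existing DESeqDataSet, `collapseReplicates(dds, groupby = ...)` performs the same kind of summation, with `groupby` set to whichever factor identifies the sample; after collapsing, the design here would be 4 WT vs. 4 knock-down samples.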