Hi,
I have been given 30 paired-end RNA-seq samples (15 pairs). Each pair consists of one cancer and one normal sample. I am analysing these samples with the DESeq package, and I used htseq-count to count the number of reads for each gene.
Group1 - Sample1 is cancer and Sample2 is normal
Group2 - Sample3 is cancer and Sample4 is normal
Group3 - Sample5 is cancer and Sample6 is normal
... ...
Group15- Sample29 is cancer and Sample30 is normal
The technician asked me to compare each group (1 cancer and 1 normal) individually, so I have to perform 15 pairwise comparisons.
1) I merged the htseq-count output of the cancer and normal samples into a single txt file.
2) I used the following steps in DESeq to plot differential gene expression:
countsTable <- read.csv(file="Group1_reads_count.txt",header=TRUE, row.names=1,sep=",")
mydesign <- data.frame(row.names=colnames(countsTable),condition=c("Can","Nor"))
conds <- factor(mydesign$condition)
cds <- newCountDataSet(countsTable, conds)
cds.norm <- estimateSizeFactors(cds)
sizeFactors(cds.norm)
Sample1 Sample2
1.34715 0.74230
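# nbinomTest needs dispersion estimates; without replicates DESeq can only estimate them "blind"
cds.norm <- estimateDispersions(cds.norm, method="blind", sharingMode="fit-only")
res <- nbinomTest(cds.norm,"Can","Nor")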
colnames(res)
[1] "id" "baseMean" "baseMeanA" "baseMeanB"
[5] "foldChange" "log2FoldChange" "pval" "padj"
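For context on the sizeFactors output above, here is a minimal base R sketch of the median-of-ratios idea behind estimateSizeFactors. The count matrix is made up for illustration, and this is a simplified re-implementation rather than the package's own code.

```r
# Toy count matrix: rows are genes, columns are samples (values are made up).
counts <- matrix(
  c(100,  50,
    200, 100,
     30,  15,
      0,  10),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), c("Sample1", "Sample2"))
)

# Geometric mean of each gene across samples, computed on the log scale.
log_geo_means <- rowMeans(log(counts))

# A gene with a zero count has a log geometric mean of -Inf and is skipped.
usable <- is.finite(log_geo_means)

# Size factor = median over usable genes of (count / geometric mean).
size_factors <- apply(counts, 2, function(col) {
  exp(median(log(col[usable]) - log_geo_means[usable]))
})
print(size_factors)
```

A sample with a size factor above 1 was sequenced more deeply than the reference, so its counts are scaled down before the baseMean columns are computed.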
Questions:
1) In this scenario, a pairwise comparison of 1 cancer and 1 normal sample without replicates, should I consider the log fold change, the p-value, or the adjusted p-value?
2) I have 3 cases in the above image,
- case A looks significant based on log_fold_change and p-value
- case B is not significant based on log_fold_change, p-value and adjusted p-value
- case C looks significant based on p-value, but one of the sample's baseMean is "0".
Do I need to ignore such genes or consider them as significant genes?
If you are using the DESeq R package, I would advise you to use DESeq2 instead (http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html).
Thanks, now I am using DESeq2 package.
There is no such thing as "looks significant". Given a pre-set cutoff on the adjusted p-value, a result either is significant or it isn't. Furthermore, the size of the log fold change doesn't say anything about significance.
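To make the cutoff point concrete, here is a small base R sketch. The p-values and gene names are made up, and the 0.05 cutoff is only an example; the padj column in the DESeq output is a Benjamini-Hochberg adjustment of this kind.

```r
# Made-up raw p-values for five hypothetical genes.
pvals <- c(geneA = 0.001, geneB = 0.20, geneC = 0.012, geneD = 0.04, geneE = 0.60)

# Benjamini-Hochberg adjustment for multiple testing.
padj <- p.adjust(pvals, method = "BH")

# The cutoff is chosen before looking at the results.
alpha <- 0.05
significant <- padj < alpha
print(data.frame(pval = pvals, padj = padj, significant = significant))
```

Note that geneD has a raw p-value below 0.05 but an adjusted p-value above it, so it is not significant at this cutoff.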
Thanks. As I don't have replicates, I am planning to use the log fold change instead of the p-value for my analysis.
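If the log fold change is used for ranking, genes like case C need care, because a baseMean of 0 in one sample makes the plain ratio infinite. The sketch below uses made-up values; the pseudocount of 1 is a common workaround assumed here for illustration and is not something DESeq applies.

```r
# Made-up normalized mean counts for three hypothetical genes.
res <- data.frame(
  id        = c("geneA", "geneB", "geneC"),
  baseMeanA = c(400, 52, 0),
  baseMeanB = c(100, 50, 6)
)

# Plain log2 ratio (B over A): a zero in either sample gives Inf or -Inf.
res$log2FoldChange <- log2(res$baseMeanB / res$baseMeanA)

# Assumed workaround, not part of DESeq: add a pseudocount of 1 to both
# means so the ratio stays finite.
res$log2FoldChangePseudo <- log2((res$baseMeanB + 1) / (res$baseMeanA + 1))
print(res)
```

A large fold change built on a handful of reads, as for geneC, is much less reliable than the same fold change at high counts, so filtering out low-count genes before ranking is worth considering.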