Hi,
Cancer genomics is not my area, but maybe this will give you a road map with which to start.
For any comparison of condition 1 vs condition 2, I use DESeq2. If you go through the workflow first and then the manual, you should be able to do what you need. When going through these, pay particular attention to anything related to design, contrasts, and interactions.
Design:
# from the workflow:
ddsMat <- DESeqDataSetFromMatrix(countData = countdata,  # counts matrix
                                 colData = coldata,      # sample metadata
                                 design = ~ cell + dex)  # "dex" is the condition of interest, "cell" is another variable (e.g. sample batch)
# To begin, I suggest a simple design like this (assuming cancer status is in a column named "condition"):
design = ~ condition + location + grade
# More advanced:
# concatenate age and sex into one "age_sex" column with meaningful groups
# add it to the design with "+ age_sex" so you can compare the groups within the age_sex column
# age and sex both affect gene expression regardless of cancer status, so account for them in the design;
# the condition:age and condition:sex interaction terms additionally let the cancer effect differ by age and by sex
# design = ~ condition + location + grade + age_sex + condition:age + condition:sex
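A rough sketch of the age_sex concatenation in base R, in case it helps. The metadata values, the column names other than "condition", and the age cut-off of 50 are all made up for illustration; swap in your own. It also checks that the resulting design is full rank with model.matrix(), since DESeq2 will refuse a design that is not.

```r
# Hypothetical metadata: every value here is an assumption for illustration.
coldata <- data.frame(
  condition = factor(c("healthy", "healthy", "cancer", "cancer",
                       "healthy", "healthy", "cancer", "cancer")),
  age = c(34, 67, 41, 72, 58, 29, 63, 38),
  sex = factor(c("F", "F", "F", "F", "M", "M", "M", "M")),
  row.names = paste0("sample", 1:8)
)

# Bin age into two groups (the cut-off of 50 is arbitrary), then
# concatenate with sex into a single grouping column.
coldata$age_group <- cut(coldata$age, breaks = c(0, 50, Inf),
                         labels = c("under50", "over50"))
coldata$age_sex <- factor(paste(coldata$age_group, coldata$sex, sep = "_"))

# Make "healthy" the reference level so fold changes read as cancer vs healthy.
coldata$condition <- relevel(coldata$condition, ref = "healthy")

# Build the model matrix for a simple design and check that it is full rank.
mm <- model.matrix(~ age_sex + condition, data = coldata)
full_rank <- qr(mm)$rank == ncol(mm)

# Cross-tabulate to see how many samples fall in each combination.
print(table(coldata$age_sex, coldata$condition))
print(full_rank)
```

If full_rank comes back FALSE, some combination of variables is redundant (for example, a variable that only ever takes one value within the cancer samples), and the design needs simplifying before it goes into DESeq2.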
Contrast:
Use the contrast argument of results() to compare two levels of a single variable (one column of the metadata). For the more complex design formulas above (especially those with interactions), I use this guide to designs and contrasts in DESeq2.
dds <- DESeq(ddsMat)  # ddsMat is the object created with DESeqDataSetFromMatrix() above
res <- results(dds, contrast = c("dex", "trt", "untrt"))  # from the workflow: compares dexamethasone-treated vs untreated
res_basic <- results(dds, contrast = c("condition", "cancer", "healthy"))  # cancer vs healthy
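The contrast vector is read as c(column name, numerator level, denominator level), so the last line reports log2 fold changes of cancer relative to healthy. A small sanity check before calling results() can save some confusion. This is only a sketch: check_contrast() is a helper I made up here, not a DESeq2 function, and the metadata is invented.

```r
# Hypothetical metadata for illustration only.
coldata <- data.frame(
  condition = c("healthy", "healthy", "cancer", "cancer"),
  row.names = paste0("sample", 1:4)
)

# Made-up helper (not part of DESeq2): confirm that the column exists and
# that both requested levels are present in it.
check_contrast <- function(coldata, contrast) {
  stopifnot(is.character(contrast), length(contrast) == 3)
  variable <- contrast[1]
  if (!variable %in% colnames(coldata)) {
    stop("column '", variable, "' not found in the metadata")
  }
  present <- unique(as.character(coldata[[variable]]))
  missing_levels <- setdiff(contrast[2:3], present)
  if (length(missing_levels) > 0) {
    stop("level(s) not found in '", variable, "': ",
         paste(missing_levels, collapse = ", "))
  }
  invisible(TRUE)
}

# Numerator first, denominator second: cancer vs healthy.
check_contrast(coldata, c("condition", "cancer", "healthy"))
```

A misspelled level (say "control" when the column says "healthy") then fails with a readable message instead of an error from deep inside results().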
Hope this helps.
P.S. Once variables such as age_sex are concatenated in the metadata, a heatmap or PCA of the transformed counts (using vst() or rlog() from DESeq2) is also a good way to see overall patterns.
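To give an idea of what the PCA looks like, here is a minimal base R sketch. Note the assumptions: the counts are simulated, the group labels are invented, and log2(count + 1) with prcomp() is used as a simple stand-in for the vst()/rlog() transformation so that it runs without Bioconductor. With real data you would transform the DESeq2 object instead.

```r
set.seed(1)

# Simulated counts: 200 genes x 8 samples (made-up data, not real results).
counts <- matrix(rpois(200 * 8, lambda = 50), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("sample", 1:8)))

# Invented sample annotation, one row per sample.
groups <- data.frame(
  condition = rep(c("healthy", "cancer"), times = 4),
  age_sex = rep(c("under50_F", "over50_F", "under50_M", "over50_M"), each = 2),
  row.names = colnames(counts)
)

# Stand-in for vst()/rlog(): a plain log2 transform with a pseudocount.
log_counts <- log2(counts + 1)

# PCA with samples as observations, hence the transpose.
pca <- prcomp(t(log_counts))
pct_var <- round(100 * pca$sdev^2 / sum(pca$sdev^2), 1)
scores <- data.frame(pca$x[, 1:2], groups)

# Colour by age_sex group, point shape by condition.
plot(scores$PC1, scores$PC2,
     col = as.integer(factor(scores$age_sex)),
     pch = ifelse(scores$condition == "cancer", 17, 16),
     xlab = paste0("PC1 (", pct_var[1], "%)"),
     ylab = paste0("PC2 (", pct_var[2], "%)"))
```

Because the simulated counts are pure noise, the groups will not separate here; with real data, clustering by age_sex or condition along PC1 or PC2 is the pattern to look for.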