Question

Clustering without having any replication

3.6 years ago

zizigolu ★ 4.3k

I have raw read counts for three different samples (sub-types) with no replicates, like this:

> head(oc)
          sample1 sample2 sample3
WASH7P          3      29      48
MIR6859-1       0       6       4
DDX11L17        0       2       2
WASH9P          3      92     101
MTND1P23        8     154     139
MTND2P28     3104    3491    3814
>

How can I normalise this data frame, given that both edgeR and DESeq2 require a design with conditions?

My ultimate goal is to cluster the genes across these samples, but for that I need normalised counts.

Thanks for any help

edger deseq2 • 904 views

updated 3.6 years ago by rodolfo.peacewalker ▴ 390 • written 3.6 years ago by zizigolu ★ 4.3k

score 2 · Accepted Answer · 2021-09-02

Hi there!

You can achieve your goal using edgeR. Here is a brief example of what you need to do:

Convert your raw count matrix into a DGEList object:

library(edgeR)
DGEoc <- DGEList(counts = oc, genes = rownames(oc), group = as.factor(colnames(oc)))

Calculate normalization factors:

DGEoc <- calcNormFactors(DGEoc, method = "TMM")

Retrieve the TMM-normalized counts as log2 counts per million (CPM):

norm_counts <- cpm(DGEoc, log = TRUE)

Then, you will be able to cluster the genes.
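
For the clustering step itself, here is a minimal sketch using base R hierarchical clustering. To keep it runnable without edgeR, it rebuilds the six genes from the question as a toy matrix and computes a plain log2-CPM (no TMM) as a stand-in for norm_counts. The distance metric, linkage method, and k = 2 are arbitrary assumptions for illustration, not recommendations.

```r
# Toy input: the six genes shown in the question (a real matrix has many more rows).
oc <- matrix(
  c(3, 29, 48,
    0, 6, 4,
    0, 2, 2,
    3, 92, 101,
    8, 154, 139,
    3104, 3491, 3814),
  ncol = 3, byrow = TRUE,
  dimnames = list(
    c("WASH7P", "MIR6859-1", "DDX11L17", "WASH9P", "MTND1P23", "MTND2P28"),
    c("sample1", "sample2", "sample3")
  )
)

# Stand-in for the edgeR output: simple log2(CPM + 1) without TMM scaling.
# In a real analysis, use the norm_counts matrix returned by cpm(DGEoc, log = TRUE).
lib_size <- colSums(oc)
norm_counts <- log2(t(t(oc) / lib_size) * 1e6 + 1)

# Euclidean distance between genes (rows), then average-linkage clustering.
gene_dist <- dist(norm_counts, method = "euclidean")
gene_tree <- hclust(gene_dist, method = "average")

# Cut the tree into k groups; k = 2 is an assumption chosen for this toy example.
gene_clusters <- cutree(gene_tree, k = 2)
print(gene_clusters)
```

With a full count matrix you would probably want to drop genes that are near zero in every sample before computing distances, and plot(gene_tree) gives a quick look at the dendrogram.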

Best regards