Hello everyone, one quick question about edgeR. I have the count matrix file, a samples data.frame file, and the annotation file.
The manual says "The data.frame samples contains a column lib.size for the library size or sequencing depth for each sample. If not specified by the user, the library sizes will be computed from the column sums of the counts. For classic edgeR the data.frame samples must also contain a column group, identifying the group membership of each sample."
I tried adding a column for the library size, like this:
          group  lib.size
sample X  1      8094363
sample X  1      5005492
sample Y  2      7094693
sample Y  2      6094693
etc
Then I run this:
x <- read.delim("counts.txt", stringsAsFactors=FALSE)
group <- (c(1,1,2,2))
genes <- read.delim("genes.txt")
y <- DGEList(counts=x, group=group, genes=genes)
y <- calcNormFactors(y)
y$samples
but then edgeR recalculates the library size, reporting a different number, and introduces the normalization factor.
How and where do I specify this library size correctly, or avoid the replacement?
Thanks for your help
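One possible approach, as a minimal sketch: the code above never hands the library sizes to DGEList, so they are computed from the column sums, as the manual describes. DGEList accepts a lib.size argument, and passing the values there should keep them. The toy count matrix below is made up and only stands in for counts.txt (an assumption so the example runs on its own); the library sizes are the four values from the table in the question.

```r
library(edgeR)  # Bioconductor package, assumed to be installed

# Hypothetical toy counts standing in for counts.txt:
# 50 genes in rows, 4 samples in columns.
set.seed(1)
counts <- matrix(rpois(200, lambda = 50), nrow = 50,
                 dimnames = list(paste0("gene", 1:50),
                                 c("X1", "X2", "Y1", "Y2")))

group <- c(1, 1, 2, 2)

# Library sizes from the table in the question, one per sample,
# in the same order as the columns of the count matrix.
lib.size <- c(8094363, 5005492, 7094693, 6094693)

# Supplying lib.size explicitly replaces the default column sums.
y <- DGEList(counts = counts, group = group, lib.size = lib.size)

# calcNormFactors fills in the norm.factors column of y$samples;
# the lib.size column keeps the values supplied above.
y <- calcNormFactors(y)
y$samples
```

The norm.factors column is expected here: it comes from calcNormFactors and is stored next to lib.size rather than overwriting it. edgeR uses the product lib.size * norm.factors as the effective library size.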
thanks! it works perfectly now