Question

In-sample and across samples normalized expression

1

5.2 years ago

Rajinder Gupta ▴ 10

I want expression data that is normalized within each sample (like FPKM) and also normalized across samples (as obtained with DESeq2 or a similar tool).

Currently, I first normalize the data across samples (using DESeq) and then calculate FPKM from the resulting expression values. Does this make sense, or am I missing something here?
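
For readers unfamiliar with the across-sample step, here is a minimal sketch of a median-of-ratios size factor calculation in the style of DESeq. It is an illustration under stated assumptions, not the DESeq2 implementation: the function name and the toy counts are hypothetical, and genes with a zero count in any sample are simply skipped.

```python
import numpy as np

def median_of_ratios_size_factors(counts):
    """counts: genes x samples array of raw counts; returns one size factor per sample."""
    counts = np.asarray(counts, dtype=float)
    # Use only genes with nonzero counts in every sample so the logs are finite.
    expressed = np.all(counts > 0, axis=1)
    log_counts = np.log(counts[expressed])
    # The per-gene log geometric mean across samples acts as a pseudo-reference sample.
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)
    # Size factor = median over genes of each sample's ratio to the reference.
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))

# Hypothetical toy data: 4 genes x 2 samples, sample 2 sequenced about twice as deep.
raw_counts = np.array([[10, 20], [100, 200], [50, 100], [0, 3]])
size_factors = median_of_ratios_size_factors(raw_counts)
normalized_counts = raw_counts / size_factors  # broadcast over the sample columns
```

In this toy case the two size factors differ by a factor of two, so the normalized counts of the three fully expressed genes agree across samples.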

RNA-Seq data normalization fpkm reads • 3.3k views

ADD COMMENT • link 5.2 years ago by Rajinder Gupta ▴ 10

1

The data have already been scaled for depth after the DESeq2 normalization, so FPKM is not suitable. Why do you want to do that?
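
One way to see the problem is the small sketch below. It assumes FPKM is computed the usual way, with the per-million factor taken from each sample's own column total. Under that assumption, any per-sample size factor cancels out, so FPKM from size-factor-normalized counts equals plain FPKM from raw counts. The counts, lengths and size factors are hypothetical.

```python
import numpy as np

def fpkm(counts, lengths_bp):
    """FPKM from a genes x samples matrix; depth is each column's own total."""
    counts = np.asarray(counts, dtype=float)
    lengths_kb = np.asarray(lengths_bp, dtype=float)[:, None] / 1e3
    per_million = counts.sum(axis=0) / 1e6
    return counts / lengths_kb / per_million

# Hypothetical toy data: 3 genes x 2 samples.
raw_counts = np.array([[10, 30], [100, 150], [50, 120]])
lengths_bp = np.array([500, 2000, 1000])
size_factors = np.array([0.8, 1.3])  # hypothetical per-sample size factors

fpkm_from_raw = fpkm(raw_counts, lengths_bp)
fpkm_from_normalized = fpkm(raw_counts / size_factors, lengths_bp)

# The size factor divides both the counts and the column total, so it cancels.
same = np.allclose(fpkm_from_raw, fpkm_from_normalized)
```

Here `same` is `True`: the second depth correction overwrites the first one rather than adding to it.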

ADD REPLY • link 5.2 years ago by ATpoint 85k

0

With DESeq the data are normalized across samples, but I also want to compare different transcripts within the same sample, which is not possible with DESeq normalization because the transcripts differ in length. With FPKM this can be achieved, but then how do I address the differences in sequencing depth across samples?

ADD REPLY • link 5.2 years ago by Rajinder Gupta ▴ 10

0

I think these are two different analysis goals: use TPM to compare transcripts within a sample and use DESeq2 to compare samples. The two are not easily interchangeable. Is there an argument from your side not to do both separately?
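
For the within-sample part, a minimal TPM sketch could look like the following. The function name and toy data are hypothetical, and it uses plain transcript lengths rather than the effective lengths that quantification tools typically report.

```python
import numpy as np

def tpm(counts, lengths_bp):
    """TPM from a transcripts x samples count matrix and transcript lengths in bp."""
    counts = np.asarray(counts, dtype=float)
    lengths_kb = np.asarray(lengths_bp, dtype=float)[:, None] / 1e3
    # Length-correct first (reads per kilobase) ...
    rate = counts / lengths_kb
    # ... then rescale each sample so its values sum to one million.
    return rate / rate.sum(axis=0) * 1e6

# Hypothetical toy data: 3 transcripts x 2 samples.
raw_counts = np.array([[10, 30], [100, 150], [50, 120]])
lengths_bp = np.array([500, 2000, 1000])
tpm_values = tpm(raw_counts, lengths_bp)
```

Every column sums to one million, which makes values comparable between transcripts within a sample but says nothing about composition differences between samples.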

ADD REPLY • link 5.2 years ago by ATpoint 85k

0

Unfortunately, yes. I am developing an analysis pipeline in which I have to use the results of the within-sample comparison for the across-sample comparison.

ADD REPLY • link 5.2 years ago by Rajinder Gupta ▴ 10

0

That is not a good idea. Please use Google and the search function to find out why FPKM/TPM perform poorly for inter-sample comparison. There is a lot of literature on this, and the authors of the established differential analysis tools explicitly recommend against doing that. If you browse the Bioconductor support page a bit, you'll see why.

ADD REPLY • link 5.2 years ago by ATpoint 85k

0

I understand the limitations of FPKM for inter-sample comparison, which is why I am thinking of calculating FPKM from the normalized read counts. What I am proposing is that the samples are first normalized using DESeq, edgeR or a similar tool, and FPKM is then calculated from these normalized counts. So this is not the direct FPKM but an inter-sample normalized FPKM.
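
A rough sketch of what such an "inter-sample normalized FPKM" could look like is given below. The key assumption is that the per-million factor is one constant shared by all samples (here the geometric mean of the raw column totals, which is just one possible choice), so that the only per-sample scaling comes from the size factors. The function name, the toy data and the size factors are hypothetical.

```python
import numpy as np

def size_factor_fpkm(counts, lengths_bp, size_factors):
    """Length-scaled counts where per-sample depth is size factor x a shared constant."""
    counts = np.asarray(counts, dtype=float)
    lengths_kb = np.asarray(lengths_bp, dtype=float)[:, None] / 1e3
    # Shared depth constant: geometric mean of the raw column totals (an assumption).
    common_depth = np.exp(np.mean(np.log(counts.sum(axis=0))))
    # Per-sample depth comes only from the size factors, not from the column totals.
    effective_depth = np.asarray(size_factors, dtype=float) * common_depth
    return counts / lengths_kb / (effective_depth / 1e6)

# Hypothetical toy data: sample 2 is exactly twice as deep as sample 1.
raw_counts = np.array([[10, 20], [100, 200], [50, 100]])
lengths_bp = np.array([500, 2000, 1000])
size_factors = np.array([2 ** -0.5, 2 ** 0.5])  # hypothetical, e.g. from median of ratios

values = size_factor_fpkm(raw_counts, lengths_bp, size_factors)
```

With these inputs both samples end up with identical values, because the only difference between them was depth.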

ADD REPLY • link 5.2 years ago by Rajinder Gupta ▴ 10

0

Hi, very late reply, but I only just came across this as well. NOISeq allows you to do TMM normalization (edgeR) and also account for gene lengths, so I believe this will give you both between-sample and within-sample normalization.

Please correct me if wrong, as I am not an expert.

ADD REPLY • link 3.9 years ago by Morgan S. ▴ 90

0

You could simply divide the counts by gene length; I don't think that is the general issue. But doing so decreases the counts, so you lose power. I don't see the advantage.

ADD REPLY • link 3.9 years ago by ATpoint 85k

0

Ok, so bottom line, keep the within- and between-sample normalization separate. Thanks!

ADD REPLY • link 3.9 years ago by Morgan S. ▴ 90

0

You can also just normalize TPMs using DESeq size factors. I don't see anything wrong with doing so.
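
As a minimal sketch of that idea, assume the TPM matrix and the size factors are already available; both are hypothetical values here, and how the size factors were estimated is left open.

```python
import numpy as np

# Hypothetical TPM matrix: 3 transcripts x 2 samples, each column sums to one million.
tpm_values = np.array([[200000.0, 150000.0],
                       [500000.0, 450000.0],
                       [300000.0, 400000.0]])
size_factors = np.array([0.8, 1.25])  # hypothetical per-sample size factors

# Divide each sample's column by its size factor (broadcast across rows).
scaled_tpm = tpm_values / size_factors
column_totals = scaled_tpm.sum(axis=0)
```

After scaling, the columns no longer sum to one million, so the result is strictly speaking no longer TPM, although the ratios between transcripts within each sample are unchanged.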

ADD REPLY • link 3.9 years ago by dsull ★ 6.9k