Question

Cuffnorm: which output file should I use?

7.3 years ago

ha.hassanzadeh • 0

Hello Biostars,

I'm trying to use cuffnorm to normalize the cuffquant results. I'm getting these output files:

cds.count_tracking
cds.read_group_tracking
genes.fpkm_tracking
isoforms.count_tracking
isoforms.read_group_tracking
run.info
tss_groups.fpkm_tracking
cds.fpkm_tracking
genes.count_tracking
genes.read_group_tracking
isoforms.fpkm_tracking
read_groups.info
tss_groups.count_tracking
tss_groups.read_group_tracking

I'm mostly interested in genes, so which file should I use? Is it genes.count_tracking? Does that file contain the normalized gene counts?

Thanks

cufflinks cuffnorm next-gen RNA-Seq • 3.8k views

ADD COMMENT • link updated 7.3 years ago by aka001 ▴ 190 • written 7.3 years ago by ha.hassanzadeh • 0

What is your question, i.e. what will you do in the next step? Is it differential expression of genes, or something else?

ADD REPLY • link 7.3 years ago by aka001 ▴ 190

Not DE analysis. My goal is to predict a clinical end-point, such as survival, using the normalized expression levels.

ADD REPLY • link 7.3 years ago by ha.hassanzadeh • 0

score 0 · Answer 1 · 2017-08-22

7.3 years ago

Prakash ★ 2.2k

genes.fpkm_tracking is the gene-level file; it contains the normalized FPKM values (fragments per kilobase of transcript per million mapped fragments).
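
For reference, the textbook FPKM definition can be sketched in a few lines. This is only an illustration of the formula: the function name and the numbers are made up, and it is not how Cufflinks itself estimates abundances (its estimates come from a statistical model, not a plain ratio).

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # (fragments / (length_bp / 1e3)) / (total_mapped_fragments / 1e6)
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Hypothetical gene: 500 fragments, 2 kb long, in a library of 20 million fragments
example = fpkm(fragments=500, length_bp=2000, total_mapped_fragments=20_000_000)
print(example)
```

Dividing by length and library size makes values comparable within one sample; it does not by itself correct for composition differences between samples.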

ADD COMMENT • link 7.3 years ago by Prakash ★ 2.2k

You mean they are not the normalized values by cuffnorm? I'm confused.

ADD REPLY • link 7.3 years ago by ha.hassanzadeh • 0

score 0 · Answer 2 · 2017-08-22

7.3 years ago

aka001 ▴ 190

Although I presume you ran cuffnorm because you wanted to normalise your data, it is still important to know what your biological question is. "Mostly interested in genes" can mean many things. If you intend to do differential expression of genes as the next step (with mainstream packages like DESeq or edgeR), then you would actually want the genes.count_tracking file, as it is not normalised and hence is suitable for DE.

ADD COMMENT • link 7.3 years ago by aka001 ▴ 190

I'm not doing DE analysis. My goal is to predict a clinical end-point, such as survival, using the normalized expression levels. If genes.count_tracking does not hold the normalized values, then which file has the normalized expression levels according to cuffnorm?

ADD REPLY • link 7.3 years ago by ha.hassanzadeh • 0

The normalised expression should be genes.fpkm_tracking.
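
If the goal is a samples-by-genes matrix for a predictive model, a minimal sketch for reading such a table could look like the one below. It assumes the file is tab-delimited, has a `tracking_id` column, and stores one `<sample>_FPKM` column per sample; check the header of your own genes.fpkm_tracking first. The function name and the demo table are made up for illustration.

```python
import csv
import io


def read_fpkm_matrix(handle):
    """Return (gene_ids, sample_names, rows) from a tab-delimited FPKM table.

    Assumption: every column whose name ends in "_FPKM" is one sample.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    fpkm_cols = [c for c in reader.fieldnames if c.endswith("_FPKM")]
    samples = [c[: -len("_FPKM")] for c in fpkm_cols]
    gene_ids, rows = [], []
    for rec in reader:
        gene_ids.append(rec["tracking_id"])
        rows.append([float(rec[c]) for c in fpkm_cols])
    return gene_ids, samples, rows


# Made-up two-gene, two-sample table standing in for a real file
demo = (
    "tracking_id\tgene_short_name\tq1_0_FPKM\tq2_0_FPKM\n"
    "GENE_A\tA\t12.5\t10.0\n"
    "GENE_B\tB\t0.0\t3.2\n"
)
gene_ids, samples, rows = read_fpkm_matrix(io.StringIO(demo))

# Transpose to one feature vector per sample for downstream modelling
by_sample = {s: [row[j] for row in rows] for j, s in enumerate(samples)}
print(by_sample)
```

For a real run, replace the `io.StringIO(demo)` handle with `open("genes.fpkm_tracking", newline="")`.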

ADD REPLY • link 7.3 years ago by aka001 ▴ 190

Thanks, but my understanding was that FPKM is very simple to compute, because it takes just a few multiplications and divisions, which should not take cuffnorm hours to complete. What am I missing? Thanks again.

ADD REPLY • link 7.3 years ago by ha.hassanzadeh • 0

Well, going back to what you wanted in the first place, which is normalisation: it depends on what kind of normalisation you want. By definition, FPKM is already a normalised value, but it is normalised by library size (i.e. within sample). For the next step, you would probably want to normalise across samples (with methods like RLE or TMM; I am not sure whether cuffnorm does that), starting from the FPKM values (or TPM if you prefer) in the cuffnorm output files. As for the speed, my guess is that cuffnorm calculates much more than just FPKM (as reflected by the number of files it generates).
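
To make "normalise across samples" concrete, here is a minimal sketch of RLE-style (median-of-ratios) size factors in plain Python. It is an illustration of the general idea only, not the code of DESeq, edgeR or cuffnorm; the function name and the toy matrix are made up, and the method is usually formulated on counts.

```python
import math
from statistics import median


def rle_size_factors(values):
    """values: one row per gene, each row a list of per-sample values.

    Returns one size factor per sample (median-of-ratios).
    """
    n_samples = len(values[0])
    # Keep genes that are positive in every sample so the geometric mean is defined
    usable = [row for row in values if all(v > 0 for v in row)]
    if not usable:
        raise ValueError("no gene is positive in all samples")
    log_geo_means = [sum(math.log(v) for v in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Toy data: sample 2 is sequenced about twice as deeply as sample 1
counts = [[100, 200], [50, 100], [30, 60], [0, 5]]
factors = rle_size_factors(counts)
normalised = [[v / f for v, f in zip(row, factors)] for row in counts]
print(factors)
```

Dividing each sample by its size factor puts the samples on a common scale, which is what the within-sample FPKM division does not guarantee.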

ADD REPLY • link 7.3 years ago by aka001 ▴ 190

Thank you so much, this was helpful. My understanding from what you said is that genes.count_tracking is not normalized (with respect to library size), but when I compare the raw gene counts with genes.count_tracking, the latter has been scaled down, which means some normalization is being done as well. Am I missing something?
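
One way to check this is to look at the per-gene ratio between the reported and the raw counts within a single sample. A minimal sketch, with made-up numbers and a hypothetical helper name: if the ratio is roughly constant across genes, that would be consistent with a single per-sample scaling factor having been applied.

```python
from statistics import median


def per_sample_ratio(raw, reported):
    """Median, min and max of reported/raw across genes with a non-zero raw count."""
    ratios = [r / c for c, r in zip(raw, reported) if c > 0]
    if not ratios:
        raise ValueError("no gene has a non-zero raw count")
    return median(ratios), min(ratios), max(ratios)


# Made-up values for one sample: raw counts vs. values from the tracking file
raw_counts = [120, 45, 800, 0, 15]
reported = [96.0, 36.0, 640.0, 0.0, 12.0]
med, lo, hi = per_sample_ratio(raw_counts, reported)
print(med, lo, hi)
```

A tight min-to-max range points to a uniform scaling; a wide range would suggest the two tables differ in more than a library-size factor.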

ADD REPLY • link 7.2 years ago by ha.hassanzadeh • 0