Question

Using NGS, What Is The Best miRNA Expression Normalization Method?

5

14.4 years ago

Doctoroots ▴ 810

Hi all. I need to check miRNA differential expression between two Illumina GAII run lanes. For this, I summed all the reads that aligned to each miRNA, and I now have two tables that list each miRNA and the number of reads that aligned to it.

Obviously, normalization needs to be done when comparing these two tables. My question is: normalization to what? Should I use the total number of reads (mapped and unmapped) produced from each lane, the number of mapped reads, or the number of uniquely mapped reads?

I'm leaning towards the first option (normalizing the hit counts by the total number of reads produced in each lane).
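
To make the three options concrete, here is a minimal Python sketch of what each one would compute. The miRNA names, counts, and lane totals are made-up placeholders, and `reads_per_million` is just an illustrative helper, not part of any existing tool.

```python
def reads_per_million(counts, library_size):
    """Scale raw per-miRNA read counts to reads per million (RPM)."""
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return {mirna: n * 1_000_000 / library_size for mirna, n in counts.items()}


# Hypothetical counts for one lane (illustrative values only).
lane_counts = {"miR-21": 5000, "miR-155": 1200, "let-7a": 300}

# Hypothetical lane totals for the three candidate denominators.
denominators = {
    "total_reads": 10_000_000,
    "mapped_reads": 8_000_000,
    "uniquely_mapped_reads": 5_000_000,
}

# One normalized table per choice of denominator.
rpm_by_option = {
    name: reads_per_million(lane_counts, size)
    for name, size in denominators.items()
}
```

The only thing that changes between the options is the denominator, so the choice matters mostly when the mapped or uniquely mapped fraction differs between the two lanes.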

mirna gene data next-gen sequencing • 7.5k views

updated 14.1 years ago by Brad Chapman 9.7k • written 14.4 years ago by Doctoroots ▴ 810

Ram · Answer 1 · 2010-11-25

3

14.4 years ago

Michael 55k

There was a related answer here: What Metrics Are Best To Describe The "Coverage" Of Rna-Seq Data?

The paper I mentioned there compares some normalization methods for RNA-seq. Whether miRNA requires any special treatment compared with mRNA-seq, I cannot say; I think that is controversial at best at the moment. Amplification and depletion steps in the protocol might also affect the validity of a quantitative interpretation of your results, but again, that is debatable.

You could start by comparing RPKM and upper-quartile normalization. However, it is also not clear how to assess whether a method performs satisfactorily on new data, which is another problem. Try to find some candidate regions that you already know should not be differentially expressed, and then compare the normalized scores on these regions.
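
A rough Python sketch of that check, under several assumptions: the counts and the list of "known non-differential" miRNAs are invented, upper-quartile scaling is taken as dividing by the 75th percentile of the non-zero counts in a lane, and the RPM variant here uses the summed miRNA counts as its denominator. It illustrates the idea and does not reimplement the methods compared in the paper.

```python
import math
import statistics


def upper_quartile_normalize(counts):
    """Divide each count by the lane's 75th percentile of non-zero counts."""
    nonzero = sorted(c for c in counts.values() if c > 0)
    uq = statistics.quantiles(nonzero, n=4, method="inclusive")[2]
    return {name: c / uq for name, c in counts.items()}


def rpm_normalize(counts):
    """Divide each count by the lane's summed counts, scaled to one million."""
    total = sum(counts.values())
    return {name: c * 1_000_000 / total for name, c in counts.items()}


def control_log2_ratios(norm_a, norm_b, controls):
    """log2(lane A / lane B) for features assumed to be non-differential."""
    return {name: math.log2(norm_a[name] / norm_b[name]) for name in controls}


# Made-up counts: the lanes differ only in one highly expressed miRNA.
lane_a = {"miR-A": 100, "miR-B": 200, "miR-C": 400, "miR-D": 800, "miR-E": 10000}
lane_b = {"miR-A": 100, "miR-B": 200, "miR-C": 400, "miR-D": 800, "miR-E": 50000}

# Hypothetical controls that we "know" should not change between lanes.
controls = ["miR-A", "miR-B", "miR-C"]

uq_ratios = control_log2_ratios(
    upper_quartile_normalize(lane_a), upper_quartile_normalize(lane_b), controls
)
rpm_ratios = control_log2_ratios(
    rpm_normalize(lane_a), rpm_normalize(lane_b), controls
)
```

A method that leaves the control log ratios close to zero is doing better on these regions. In this toy data the single dominant miRNA shifts the RPM ratios of the controls away from zero, while the upper-quartile ratios stay at zero; real data will be less clear-cut.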

updated 5.6 years ago by Ram 45k • written 14.4 years ago by Michael 55k

0

This article I found also supports using RPM, but it mentions that RNA extraction methods alter the normalization results and may lower their validity.

updated 5.6 years ago by Ram 45k • written 14.4 years ago by Doctoroots ▴ 810

0

Exactly. That's what others told me about bias. The bias issue always seems to come up, but I haven't really seen a good suggestion for how to deal with it.

I agree that the definition of RPM is more appropriate for short miRNAs than RPKM, which originally refers to 'kilobase of exon model'. Gene-length normalization/scaling is possibly not required for these short sequences.
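
To illustrate the point about length scaling, here is a small Python sketch of the two definitions side by side. The read count, library size, and the 22 nt length are assumed example values, not data from this thread.

```python
def rpm(count, total_reads):
    """Reads per million: scales by library size only."""
    return count * 1_000_000 / total_reads


def rpkm(count, total_reads, length_nt):
    """Reads per kilobase per million: also divides by feature length."""
    return count * 1_000_000_000 / (total_reads * length_nt)


# Assumed example values for a single miRNA.
count, total_reads, length_nt = 500, 1_000_000, 22

mirna_rpm = rpm(count, total_reads)
mirna_rpkm = rpkm(count, total_reads, length_nt)
```

Since RPKM equals RPM multiplied by 1000 / length, and mature miRNAs are all of roughly similar length, the length term acts as a near-constant factor and adds little when comparing miRNAs with each other.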

14.4 years ago by Michael 55k

score 2 · Answer 2 · 2010-11-25

There are a couple of differential expression analysis toolkits available in Bioconductor:

edgeR
DESeq

The edgeR vignette documentation contains a detailed discussion of normalization issues.

A recent thread at SeqAnswers on miRNA analysis has additional links to normalization papers, as well as other resources which might be useful for your analysis.