Question

RNA Normalization from DESeq to RPKM


9.5 years ago

alpha09 ▴ 10

I have RNA-seq data that was normalized with DESeq, and I want to convert it to RPKM.

Could you please tell me how to do this?

Is there an R package or similar that can do it?

Thank you.

rna-seq R next-gen • 4.5k views



Total gene read counts were normalized for library size using the DESeq method (size factors).

Feature_ID           M1_1              M1_2              M2_1              M2_2              M3_1              M3_2              M4_1              M4_2              M5_1              M5_2              M6_1              M6_2              M7_1              M7_2              M8_1              M8_2              M9_1              M9_2              M10_1             M10_2
AT1G01010            82.76             155.63            82.04             120.97            96.56             89.69             148.62            88.95             56.90             112.33            122.04            127.13            119.41            107.63            125.20            105.24            98.49             94.63             55.92             41.66
AT1G01020            287.06            233.44            232.45            326.93            261.60            194.49            478.16            424.96            108.23            241.24            296.39            327.82            405.37            459.73            333.87            300.69            318.63            356.01            539.58            682.54
AT1G01030            20.69             1.35              28.49             26.03             17.96             19.15             10.77             28.33             1.12              7.37              33.62             27.32             10.47             8.21              1.39              8.10              39.59             34.92             50.03             77.97
AT1G01040            1100.40           521.02            961.69            1202.83           898.21            1179.01           1620.81           1694.58           706.82            511.95            1414.69           1671.66           1864.49           1848.93           1295.15           1350.78           1352.74           1394.73           1789.44           1843.60
AT1G01046            0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00
AT1G01050            1705.56           2801.34           2171.78           1633.89           1871.64           1609.30           1285.88           1171.45           2309.58           2438.19           1463.26           1272.39           1210.87           1132.89           1499.65           1391.25           1760.20           1696.66           1372.49           1527.43

Devon, this is the file I am using. It is A. thaliana data. So if I simply divide each count by the transcript length in kb, will it be converted to RPKM?

updated 5.4 years ago by Ram 45k • written 9.5 years ago by alpha09 ▴ 10


Divide by a million too; that'll be the M part in RPKM.

updated 5.4 years ago by Ram 45k • written 9.5 years ago by Devon Ryan 105k


Thank you, Devon.

9.5 years ago by alpha09 ▴ 10


Thank you so much.

It helped me a lot.

9.5 years ago by alpha09 ▴ 10

score 1 · Answer 1 · 2015-10-10


9.5 years ago

Devon Ryan 105k

Take the counts, divide them by the gene length in kb (you can probably download this, but if not, just google how to generate it from a GTF file), and then divide by the number of mapped reads in millions.
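As a rough sketch of that arithmetic in Python, assuming raw (not yet normalized) counts for a single sample: the function name `rpkm`, the gene IDs, the lengths, and the read total below are all made up for illustration and are not taken from the data in this thread.

```python
def rpkm(counts, lengths_bp, mapped_reads):
    """Reads per kilobase per million mapped reads, for one sample.

    counts       -- dict of gene ID -> raw read count
    lengths_bp   -- dict of gene ID -> gene length in base pairs
    mapped_reads -- total number of mapped reads in the sample
    """
    millions = mapped_reads / 1e6
    result = {}
    for gene, count in counts.items():
        length_kb = lengths_bp[gene] / 1e3
        # Divide by length in kb, then by mapped reads in millions.
        result[gene] = count / length_kb / millions
    return result


# Hypothetical values, for illustration only.
counts = {"geneA": 500, "geneB": 1200}
lengths_bp = {"geneA": 2000, "geneB": 4000}
values = rpkm(counts, lengths_bp, mapped_reads=20_000_000)
print(values)  # {'geneA': 12.5, 'geneB': 15.0}
```

Note that geneB has more reads than geneA but is also twice as long, so the two end up closer together once length is taken into account.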

For what it's worth, edgeR provides an rpkm() function, though once again you'll need to supply the gene lengths.
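If no length table is available, one possible way to build it from a GTF file is sketched below in Python. This is only one definition of "gene length" (the number of bases covered by at least one exon of the gene), and it assumes the attribute column carries a `gene_id "..."` entry and that all exons of a gene sit on the same chromosome. The function name `gene_lengths_from_gtf` and the example records are invented for illustration.

```python
import re
from collections import defaultdict

GENE_ID = re.compile(r'gene_id "([^"]+)"')


def gene_lengths_from_gtf(lines):
    """Return gene ID -> length in bp, counting each exonic base once."""
    exons = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = GENE_ID.search(fields[8])
        if match:
            # GTF coordinates are 1-based and inclusive.
            exons[match.group(1)].append((int(fields[3]), int(fields[4])))

    lengths = {}
    for gene, intervals in exons.items():
        intervals.sort()
        total = 0
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end + 1:
                # Overlapping or adjacent exon: extend the current block.
                cur_end = max(cur_end, end)
            else:
                total += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        total += cur_end - cur_start + 1
        lengths[gene] = total
    return lengths


# Tiny made-up GTF fragment: two overlapping exons and one separate exon.
example = [
    'chr1\tdemo\texon\t100\t199\t.\t+\t.\tgene_id "geneA"; transcript_id "tA.1";',
    'chr1\tdemo\texon\t150\t249\t.\t+\t.\tgene_id "geneA"; transcript_id "tA.2";',
    'chr1\tdemo\texon\t400\t499\t.\t+\t.\tgene_id "geneA"; transcript_id "tA.1";',
]
print(gene_lengths_from_gtf(example))  # {'geneA': 250}
```

For a real annotation, the same function can be fed an open file handle instead of the list, since it only iterates over lines.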



Devon, in that case, would the normalized read count be the base mean value generated by DESeq for each experimental/control comparison?

9.5 years ago by tiago211287 ★ 1.5k


The normalized counts are per-sample.
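To make the distinction concrete, here is a small Python sketch. It assumes, as far as I understand DESeq's output, that a base mean is an average of normalized counts over samples, so it collapses a gene to one number, while the conversion discussed here needs one value per sample. The numbers are the first four AT1G01010 values from the table posted in the question.

```python
from statistics import mean

# Normalized counts for AT1G01010 in four samples, copied from the posted table.
normalized = {"M1_1": 82.76, "M1_2": 155.63, "M2_1": 82.04, "M2_2": 120.97}

# One value per gene: an average over samples (the kind of summary a base mean gives).
base_mean = mean(normalized.values())

# One value per sample: these are the inputs to a per-sample RPKM-style conversion.
for sample, value in normalized.items():
    print(sample, value)
print("mean over these four samples:", round(base_mean, 2))
```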

9.5 years ago by Devon Ryan 105k


BTW, I should note that if you input normalized counts then you can just divide by a million rather than number of mapped reads in millions. It's best to not adjust for library-size differences twice...
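A toy Python sketch of why the double adjustment matters, under the assumption that the normalized counts are already comparable across samples. The helper name `per_kb_per_million`, the sample names, and all numbers are hypothetical, and the shared constant of 1.0 is just a placeholder: the point is only that it is the same for every sample.

```python
def per_kb_per_million(norm_counts, length_kb, depth_millions):
    """Scale each sample's count by gene length and a per-sample depth (in millions)."""
    return {
        sample: count / length_kb / depth_millions[sample]
        for sample, count in norm_counts.items()
    }


# Hypothetical gene: equal normalized counts in two samples sequenced to different depths.
norm_counts = {"s1": 100.0, "s2": 100.0}
length_kb = 2.0

# Same constant for every sample: the equality seen in the normalized counts survives.
common = per_kb_per_million(norm_counts, length_kb, {"s1": 1.0, "s2": 1.0})

# Per-sample mapped reads (in millions) applied on top of normalized counts:
# library size is corrected a second time and the two samples no longer agree.
twice = per_kb_per_million(norm_counts, length_kb, {"s1": 10.0, "s2": 20.0})

print(common)  # {'s1': 50.0, 's2': 50.0}
print(twice)   # {'s1': 5.0, 's2': 2.5}
```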

9.5 years ago by Devon Ryan 105k


If you input DESeq-normalized counts, then the result is not RPKM (Reads Per Kilobase of transcript per Million mapped reads) but something like "Reads Per Kilobase of transcript per Million mapped reads on exons". I'm not saying it is a bad metric, but don't call it RPKM, to avoid confusion!

9.5 years ago by Carlo Yague 9.0k


I hate to break it to you but it's quite likely that most published RPKM values are calculated in this manner. I agree that a different term should probably be used, but that ship has already sailed.

9.5 years ago by Devon Ryan 105k


While this might be true, I don't think we should encourage the use of inaccurate or imprecise terms. This just adds to the general confusion around FPKM, RPKM, TPM, and so on.

9.5 years ago by Carlo Yague 9.0k