Which data type can I use to build a machine learning model?

11 months ago

Qiang ▴ 10

Hi all,

I am new to the field of gene expression. I am currently working on testing different drugs on human liver cells.

After conducting RNA-seq analysis, I obtained three types of data: read counts, FPKM, and log2 fold change (treated FPKM / untreated FPKM). I aim to build a machine learning classifier to establish relationships between gene names and the various drugs.
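For reference, here is a minimal sketch of how the three quantities relate for a single gene. The numbers are made up, the function names are only illustrative, and the optional pseudocount is an assumption (it is not part of the definition above, just a common way to avoid dividing by zero).

```python
import math

def fpkm(count, gene_length_bp, total_mapped_reads):
    """Fragments per kilobase of transcript per million mapped reads."""
    return count * 1e9 / (gene_length_bp * total_mapped_reads)

def log2_fold_change(treated, untreated, pseudocount=0.0):
    """log2(treated / untreated); pseudocount is an optional assumption."""
    return math.log2((treated + pseudocount) / (untreated + pseudocount))

# Hypothetical values for one gene in one treated and one untreated sample.
treated_fpkm = fpkm(count=500, gene_length_bp=2000, total_mapped_reads=20_000_000)
untreated_fpkm = fpkm(count=250, gene_length_bp=2000, total_mapped_reads=20_000_000)

lfc = log2_fold_change(treated_fpkm, untreated_fpkm)
print(treated_fpkm, untreated_fpkm, lfc)
```

Note that log2 fold change is computed relative to the untreated control, so it describes a treated/untreated pair rather than a single sample.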

My question is, which of these data types—raw counts, FPKM, or log2 fold change—would be most suitable for building a machine learning model?

Thank you.

Machine-learning FPKM log2foldchange • 549 views

ADD COMMENT • link 10 months ago by Qiang ▴ 10

I wouldn't use raw counts since they haven't been normalized for sequencing depth or gene length. You could potentially use either FPKM or log2 fold change. Why not build a separate model for each one and see which does the best job of predicting the classes?
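A rough sketch of that comparison, assuming one row per sample and one column per gene. Everything here is hypothetical: the toy matrices, the drug labels, and the choice of a nearest-centroid classifier with leave-one-out cross-validation, which is just one simple way to score each representation on a small data set.

```python
import math

def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class mean is closest to x (Euclidean)."""
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, math.inf
    for label, acc in sums.items():
        centroid = [value / counts[label] for value in acc]
        dist = math.dist(centroid, x)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_accuracy(X, y):
    """Leave-one-out accuracy: hold out each sample once and predict it."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)

# Hypothetical data: 4 samples x 3 genes, two drugs.
labels = ["drug_A", "drug_A", "drug_B", "drug_B"]
feature_sets = {
    "fpkm": [[12.5, 3.0, 40.0], [13.0, 2.5, 42.0],
             [6.0, 9.0, 41.0], [5.5, 10.0, 39.0]],
    "log2fc": [[1.0, -0.8, 0.1], [1.1, -0.9, 0.0],
               [-0.9, 1.2, 0.05], [-1.0, 1.1, -0.1]],
}

# Score the same classifier on each representation and compare.
scores = {name: loo_accuracy(X, labels) for name, X in feature_sets.items()}
for name, score in scores.items():
    print(f"{name}: leave-one-out accuracy = {score:.2f}")
```

With real data you would want many more samples per drug and the same cross-validation splits for both representations, so that the accuracies are directly comparable.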

ADD REPLY • link 11 months ago by Jeremy ▴ 930

Thank you, Jeremy! Yes, we will compare the results of ML models trained on FPKM and on log2FC.

ADD REPLY • link 10 months ago by Qiang ▴ 10
