RNA-seq data for deep learning classification
6 months ago
yahn • 0

The objective of my task is binary classification of HPV status in head and neck cancer patients using multi-modal data, including genomics, transcriptomics and histopathology data.

For the transcriptomics data, I downloaded mRNA-seq files with raw counts, as well as counts normalised with the TPM, FPKM and FPKM-UQ methods. However, it seems that these normalisation methods are not the preferred choice, and raw counts are often used directly as input for DESeq2 or edgeR normalisation.

I did some further reading on the DESeq2 and edgeR normalisation methods, but they use the label information, which I would not want, as this would require an independent test set, etc.

Prior to feature selection, I would like to normalise the raw counts, but after days of reading I am still not sure which format of RNA-seq data to use. Could anyone advise on how I can proceed?

Thank you very much.

6 months ago
dsull ★ 6.9k

Honestly speaking, if it's deep learning, it probably doesn't matter that much if you use something like TPMs (probably not raw counts, unless one of your features is sequencing depth). I'm sure a deep learning model will be able to learn the things that cause between-sample differences and account for them naturally as it's making predictions.

Machine learning is practically constructing a complicated mathematical function over your features. Normalization is itself a mathematical function.
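To make that concrete, here is TPM written out as a function of one sample's raw counts. The symbols are my own shorthand, not anything from the thread: $c_i$ is the raw count for gene $i$ and $\ell_i$ is that gene's (ideally effective) length.

```latex
\mathrm{TPM}_i \;=\; 10^{6} \cdot \frac{c_i / \ell_i}{\sum_{j} c_j / \ell_j}
```

Nothing in it depends on class labels or on any other sample, only on the counts and gene lengths of the sample at hand.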

A patient walks into your clinic and you want to tell that n=1 patient their HPV status. You can get TPMs out from their sample fairly easily, and that's what you want to plug into your model.
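As a rough sketch of what "getting TPMs out" of a single sample could look like, assuming you have raw counts and gene lengths in base pairs: the function names `counts_to_tpm` and `to_model_input` are made up for illustration, and the `log1p` step is a common convention for compressing the dynamic range before feeding a network, not something prescribed in this thread.

```python
import numpy as np

def counts_to_tpm(counts, lengths_bp):
    """Convert one sample's raw counts to TPM (hypothetical helper).

    counts     : raw read counts per gene for a single sample
    lengths_bp : gene lengths in base pairs, in the same gene order
    """
    counts = np.asarray(counts, dtype=float)
    lengths_kb = np.asarray(lengths_bp, dtype=float) / 1_000.0
    rate = counts / lengths_kb        # reads per kilobase of gene
    return rate / rate.sum() * 1e6    # rescale so the sample sums to one million

def to_model_input(counts, lengths_bp):
    """log(1 + TPM); the log transform is an assumption, not from the thread."""
    return np.log1p(counts_to_tpm(counts, lengths_bp))

# Toy single-sample example with four genes (made-up numbers).
counts = [100, 200, 0, 700]
lengths = [1000, 2000, 1500, 3500]
tpm = counts_to_tpm(counts, lengths)
features = to_model_input(counts, lengths)
```

Because each sample is rescaled on its own, the same function applies unchanged to training samples and to a new n=1 patient, and multiplying all counts by a constant (deeper sequencing) leaves the TPM values the same.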


Thank you very much for sharing your advice. Yes, it definitely makes sense that a deep learning model would learn the normalisation itself from the data. This reply saved me a lot of headaches and time! Thank you :)

