5.2 years ago
NJE
Hi,
I am trying to apply a machine-learning algorithm to alternative splicing data. My matrix has 500 samples and 50,000 splicing isoforms. Some rows contain "-1" values. From my understanding, a PSI value of zero means the exon is never spliced in, and a PSI value of -1 means no information was generated about splicing.
I would appreciate it if anybody could suggest how I should handle the "-1" values. Should I remove the whole row, or use only the samples that do not contain "-1"?
Please let me know!
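For readers weighing the two options in the question, here is a minimal Python sketch of both. It assumes, as the question does, that -1 is a "no information" sentinel; the toy matrix, the event names, and the helper functions are hypothetical and not taken from any splicing tool.

```python
# Hypothetical toy matrix: rows are splicing events, columns are samples.
# Assumption: -1 marks "no information", as described in the question.
MISSING = -1

psi_matrix = {
    "event_a": [0.10, 0.25, 0.00],
    "event_b": [0.80, -1, 0.75],
    "event_c": [-1, -1, 0.40],
}

def drop_incomplete_rows(matrix):
    """Option 1: keep only events that have a value in every sample."""
    return {name: row for name, row in matrix.items() if MISSING not in row}

def mask_missing(matrix):
    """Option 2: keep every event, replacing the sentinel with None."""
    return {
        name: [None if value == MISSING else value for value in row]
        for name, row in matrix.items()
    }

def missing_fraction(matrix):
    """Fraction of samples without a value, per event."""
    return {name: row.count(MISSING) / len(row) for name, row in matrix.items()}

complete = drop_incomplete_rows(psi_matrix)
masked = mask_missing(psi_matrix)
fractions = missing_fraction(psi_matrix)
```

The per-event missing fraction may help decide between the two: events missing in most samples could be dropped, while the rest are kept with masked entries. Note that a genuine PSI of 0 is left untouched in both options.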
Hi,
What software are you using to compute the PSI values in your samples? By definition, PSI values cannot be negative, since PSI = IR/(ER + IR). However, if you are comparing two conditions (for example, X to Y), a deltaPSI value of -1 would mean that the event in question is present in 100% of isoforms in condition Y and in none of the isoforms of that gene in condition X.
Some acronyms:
IR: reads supporting the inclusion isoform
ER: reads supporting the exclusion isoform
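As a rough illustration of the formula above, here is a small Python sketch. The function names are hypothetical. Returning None when there are no reads at all is an assumption made for this example, and deltaPSI is taken as PSI(X) - PSI(Y), which is the sign convention that matches the -1 case described above; a given tool may define either differently.

```python
def psi(inclusion_reads, exclusion_reads):
    """PSI = IR / (ER + IR). Returns None when no reads cover the event."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None  # undefined: no reads support either isoform
    return inclusion_reads / total

def delta_psi(psi_x, psi_y):
    """deltaPSI, assuming the convention PSI(X) - PSI(Y)."""
    if psi_x is None or psi_y is None:
        return None
    return psi_x - psi_y

# Event absent from all isoforms in X, present in all isoforms in Y.
psi_x = psi(inclusion_reads=0, exclusion_reads=20)
psi_y = psi(inclusion_reads=20, exclusion_reads=0)
d = delta_psi(psi_x, psi_y)
```

With read counts that cannot be negative, a single-sample PSI stays between 0 and 1, so a -1 in a per-sample matrix has to come from something other than this ratio.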
Fjodor