Hi there,
I'm fairly new to this area, so I will try to explain as clearly as possible. I have two RNA-Seq data sets: one corresponds to a series of cancer cell lines, the other to cell lines we are using as a model of 'normal' epithelium. One data set reports expression in FPKM, the other in RPKM. I think I superficially understand the difference between these units: RPKM counts reads and is used for single-end sequencing, while FPKM counts fragments and is used for paired-end sequencing.
My question is: are these units directly comparable? The analysis I wish to carry out is fairly straightforward. I have a predefined list of around 500 genes and simply want to compare expression between the non-cancer and cancer backgrounds, both in terms of which of these genes are expressed at all and in terms of their relative expression levels. To decide which transcripts are expressed, I had intended to treat any value over 0 (FPKM or RPKM) as denoting expression, but I am unsure whether I can compare relative abundance.
I should add that I only have access to the raw data for one dataset; the other is a results table sent by collaborators.
For reference, see Wagner et al. 2012, as well as this blog post from Harold Pimentel. Both are very useful clarifications and are of specific interest for comparing between datasets.
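As a rough sketch of the rescaling those references discuss (my reading of them, not a tested pipeline): within a single sample, FPKM or RPKM values can be converted to TPM by dividing each value by the sample's total and multiplying by one million. The function name and the example numbers below are made up for illustration. The sum has to be taken over every gene in the sample, not just the 500-gene subset, and this does not by itself remove differences between labs or protocols.

```python
def fpkm_to_tpm(values):
    """Rescale one sample's FPKM (or RPKM) values so they sum to one million."""
    total = sum(values)
    if total <= 0:
        raise ValueError("sample has no expression signal to rescale")
    return [v / total * 1_000_000 for v in values]


# Hypothetical FPKM values for every gene in one sample (illustrative only)
sample_fpkm = [10.0, 30.0, 0.0, 60.0]
sample_tpm = fpkm_to_tpm(sample_fpkm)

# Call a gene "expressed" if its value is above 0, as proposed in the question
expressed = [v > 0 for v in sample_tpm]
```

Because the rescaling is a per-sample multiplication by a positive constant, a gene that is above 0 in FPKM or RPKM stays above 0 in TPM, so the expressed/not-expressed call does not change.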