Add RPKM values together for multiple different genes?
6.1 years ago
tpham2654 • 0

Is it OK to add RPKM values for multiple different genes together? The genes I want to add together all encode subunits of the same protein complex (NF-κB), and I have a separate gene-level (NOT transcript-level) RPKM value for each component I am interested in (REL, RELA, RELB, NFKB1, and NFKB2).

I'm aware that you can use GSEA, but it seems kind of silly to do so for a gene set of 5 genes total.

RPKM RNA-Seq GSEA • 2.0k views

You can't simply sum RPKMs. Before writing any more about that, though, please describe further what you actually want to do with such a value. Please note that RPKMs have very limited utility.
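For what it's worth, one way to see what a sum of RPKMs does and does not represent is to work through the arithmetic. The sketch below uses the standard RPKM definition (reads × 10^9 / (gene length in bp × total mapped reads)); the read counts, gene lengths, and library size are made-up numbers chosen for illustration, not values from this thread.

```python
def rpkm(reads, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    return reads * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical library size and genes (illustrative values only).
total_mapped = 20_000_000
gene_a = {"reads": 500, "length_bp": 1_000}   # short gene
gene_b = {"reads": 500, "length_bp": 4_000}   # long gene, same read count

rpkm_a = rpkm(gene_a["reads"], gene_a["length_bp"], total_mapped)
rpkm_b = rpkm(gene_b["reads"], gene_b["length_bp"], total_mapped)

# Naive sum of the two per-gene RPKMs.
summed = rpkm_a + rpkm_b

# RPKM computed as if both genes were one feature (pooled reads, pooled length).
pooled = rpkm(
    gene_a["reads"] + gene_b["reads"],
    gene_a["length_bp"] + gene_b["length_bp"],
    total_mapped,
)

print(f"RPKM A = {rpkm_a}, RPKM B = {rpkm_b}")
print(f"sum of RPKMs = {summed}, pooled RPKM = {pooled}")
```

With equal read counts, the shorter gene contributes most of the sum, so the total is not the RPKM you would get by pooling reads and lengths.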


I am comparing two groups of mouse samples (3 per group; one group has a gene knocked out, the other does not) to see how the expression of NF-κB changes due to the knockout.

I figured I could not just add RPKMs, since I saw that in other questions online where people were talking about different transcripts. So what should I do?


Why don't you check if the genes that encode the subunits are differentially expressed with an appropriate framework like DESeq2?


I have differential expression results for all the genes in my samples. The thing is, I was asked to summarize the expression by somehow condensing all the NF-κB data into 1 bar per group on a bar graph.


Your job isn't to give people what they ask for; it's to give them what they actually need, whether they know it or not.


To add an explanation to this: To condense gene expression profiles of duplicated genes into a single value and present only this is inadequate, because duplication can lead to neofunctionalization and functional diversification and can also lead to diversification of gene regulation (e.g. Kleinjan et al. 2008).

NF-κB is a protein complex. Expression and regulation of the components of a protein complex can vary widely, so it is questionable whether a mean or median expression over all the genes is meaningful. I would rather look at patterns of co-expression, e.g. by correlation and subsequent network analysis.
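A minimal sketch of the correlation step, assuming you have one log-transformed expression value per subunit per sample. The numbers are placeholders invented for illustration (3 control samples followed by 3 knockout samples), and `pearson` is a helper written here, not a library function.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Placeholder log-scale expression values (NOT real data):
# first three entries = control samples, last three = knockout samples.
expr = {
    "REL":   [5.1, 5.3, 5.0, 4.2, 4.0, 4.4],
    "RELA":  [7.2, 7.0, 7.3, 7.1, 7.4, 7.2],
    "RELB":  [3.9, 4.1, 4.0, 3.1, 3.0, 3.3],
    "NFKB1": [6.5, 6.4, 6.6, 6.0, 5.9, 6.1],
    "NFKB2": [4.8, 4.6, 4.9, 5.5, 5.7, 5.4],
}

# Correlation for every pair of subunits across the six samples.
pairwise = {
    (g1, g2): pearson(expr[g1], expr[g2])
    for g1, g2 in combinations(expr, 2)
}

for (g1, g2), r in pairwise.items():
    print(f"{g1:>5} vs {g2:<5} r = {r:+.2f}")
```

With only six samples, any single coefficient is very noisy, so this is better read as a way to spot subunits that move in opposite directions than as a formal test.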


Using bar plots to summarize data is a big no-go, since they can obscure the data. Especially when you only have 3 replicates, showing the raw data is essential for transparency.

You could do a point plot instead (same layout as a bar plot, but with the 3 points above one another instead of a bar).


Then it would be better to make two boxplots, or a beeswarm plot with labelled data points for the subunits.


I suggest you use GSEA, even for 5 genes, since that won't mask a lack of change, or discordant changes, among the components.
