RPKM for multi-copy genes
4.4 years ago
poltora4enko ▴ 10

Hello all,

I am trying to calculate RPKM for multi-copy genes and am looking for a formula for this. My genes of interest, the rDNA genes, are multi-copy, and I need to normalize my RPKM values by the number of rDNA copies. Would you be so kind as to help me? Thanks in advance, Valentin

RPKM Multi-copy genes • 1.2k views

before we really dig into this:

Am I right that it's rDNA you're investigating?

Two points of attention to start. First, RPKM values are not commonly used anymore (I won't go into details; you can search around for the reasons why), so it is better to use CPM- or TPM-normalised values. Second, as far as I know, unless you specifically ask for it, rRNA (the transcripts of the rDNA) is typically removed before sequencing, as it tends to take up far too many reads from the sequencing pool.
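For reference, here is a minimal sketch of how the three normalisations differ, using the standard definitions of CPM, RPKM and TPM. The function names and the toy gene names are hypothetical, and the sketch assumes you already have one raw read count and one length (in bp) per gene; it is not meant to replace a proper quantification tool.

```python
# Sketch only: standard CPM / RPKM / TPM definitions on plain dicts.
# `counts` maps gene -> raw read count, `lengths_bp` maps gene -> length in bp.
# All names here are illustrative, not taken from any particular package.

def cpm(counts):
    """Counts per million: scales by library size only."""
    total = sum(counts.values())
    return {g: c * 1e6 / total for g, c in counts.items()}


def rpkm(counts, lengths_bp):
    """Reads per kilobase per million mapped reads."""
    total = sum(counts.values())
    return {g: c * 1e9 / (total * lengths_bp[g]) for g, c in counts.items()}


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalise first, then rescale to 1e6."""
    rate = {g: c / lengths_bp[g] for g, c in counts.items()}
    denom = sum(rate.values())
    return {g: r * 1e6 / denom for g, r in rate.items()}


if __name__ == "__main__":
    counts = {"geneA": 100, "geneB": 300}
    lengths_bp = {"geneA": 1000, "geneB": 3000}
    print(cpm(counts))
    print(rpkm(counts, lengths_bp))
    print(tpm(counts, lengths_bp))
```

One practical difference: TPM values always sum to one million within a sample, whereas the sum of RPKM values varies from sample to sample.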


Yes, the subject of the study is exactly the Drosophila melanogaster rDNA cluster. As far as I know, in our case the rRNA was not removed. OK, I'll keep in mind to use TPM. I apologize for my bad English if I do not explain my thoughts correctly!


OK, thanks for confirming all this. (and no need to apologise :)

On topic: analysing multicopy genes (especially very similar ones, such as the rDNA copies) is very tricky. An additional thing you're likely going to have to deal with is that the rRNA will be heavily over-represented in your sample. I'm no expert on that specifically, but I can imagine it might play a role.


Thank you so much for your reply!


Why do you think you need to normalize to rDNA copies?


I need to compare the RPKM of the rDNA with the RPKM of protein-coding genes. The protein-coding genes I am comparing against are single-copy, in contrast to the rDNA genes, so I need to normalize by copy number.
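To make the idea concrete, one possible way to write a per-copy RPKM is sketched below. This is an assumption about what such a normalisation could look like, not an established formula: it presumes that reads from all rDNA copies are collapsed onto a single reference unit, and that the copy number is known from elsewhere (for example, estimated from genomic DNA coverage). The function name and parameters are hypothetical.

```python
# Sketch only: ordinary RPKM divided by an externally supplied copy number.
# Assumes all copies were collapsed onto one reference unit of `length_bp`.

def per_copy_rpkm(read_count, length_bp, total_mapped_reads, copy_number=1):
    """Average RPKM per gene copy; copy_number=1 gives ordinary RPKM."""
    if copy_number <= 0:
        raise ValueError("copy_number must be positive")
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length_bp and total_mapped_reads must be positive")
    return read_count * 1e9 / (total_mapped_reads * length_bp * copy_number)


if __name__ == "__main__":
    # Toy numbers, purely illustrative.
    print(per_copy_rpkm(2000, 8000, 1_000_000, copy_number=200))
    print(per_copy_rpkm(2000, 8000, 1_000_000))
```

Note that the result is an average over all copies, whether or not each individual copy is transcribed.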


Why do you need to compare rDNA and protein-coding gene expression levels? We're asking about the biological background because what you're asking to do is incredibly unusual. Did your reads include UMIs? What sort of read count do you have for rRNA and, importantly in this application, what platform was the sequencing performed on (presuming an Illumina machine, we need the exact type)? There are a large number of factors that could cause problems with this, and I strongly suspect there will be a better way to approach things.


Illumina (NextSeq 550). Data: PRO-seq, paired-end. Source: transcriptomic.

I have Drosophila rDNA genes, some of which contain transposon insertions and some of which are inactivated by RNA interference. My task is to see how the average level of transcription of the rDNA repeat, and of R1/R2 (these are transposons), correlates with the levels of transcription of protein-coding genes.
