Correlation in RNAseq
3.5 years ago
feng920308 • 0

How to analyze the Pearson correlation coefficient of mRNA abundance between two biological replicates? Can I use deeptools?

How to analyze the Pearson correlation coefficient of modification sites of nucleotide in mRNA? I have already got the modification sites, but I do not know how to analyze the correlation between samples.

RNA-Seq correlation • 3.1k views

How to analyze the Pearson correlation coefficient of modification sites of nucleotide in mRNA?

What exactly do you wish to correlate? The number of modifications? The location? ...?


Sorry for the confusion. I want to correlate the locations of nucleotide modification sites in mRNA between different samples.

3.5 years ago

Here are a few thoughts:

  • "the Pearson correlation coefficient of mRNA abundance". I don't think this is a good idea, because mRNA abundance (that is, I guess, read counts per gene) follows a long-tailed distribution, so a few very highly expressed genes would dominate the correlation. Instead, I would use either Spearman correlation on the read counts per gene, or Pearson correlation on (r)log-transformed read counts per gene.

  • "Can I use deeptools?" I think it is possible to use deepTools for that purpose with the bamCorrelate (multiBamSummary in deepTools 2.0 and later) and plotCorrelation functions. Another approach, perhaps more common, is to count the reads per gene, feed the counts into DESeq2/edgeR, normalize and transform the data (with the rlog transformation, for instance), then calculate the correlation.

  • "How to analyze the Pearson correlation coefficient of modification sites of nucleotide in mRNA?" Well, in that case I would not use a correlation, because you do not have a continuous variable (it is just presence/absence, from what I understand). Instead, I would draw a Venn diagram and run Fisher's exact test to see whether the overlap between the modified sites in sample A and sample B is significant.
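To make the overlap test in the last point concrete, here is a minimal sketch in plain Python. It assumes each site can be reduced to a hashable identifier and that you can define a "universe" of all testable sites (for example, positions with enough coverage in both samples). The function name `overlap_fisher_p` and the toy site sets are hypothetical, and the choice of universe is an assumption that strongly affects the p-value.

```python
from math import comb


def overlap_fisher_p(sites_a, sites_b, universe):
    """One-sided Fisher's exact test for enrichment of overlap.

    Returns (observed_overlap, p_value), where p_value is the probability
    of seeing at least the observed overlap if the sites of sample B were
    drawn at random from the universe (hypergeometric null).
    """
    universe = set(universe)
    a = set(sites_a) & universe  # ignore sites outside the universe
    b = set(sites_b) & universe
    n_total, n_a, n_b = len(universe), len(a), len(b)
    k_obs = len(a & b)

    # Sum hypergeometric probabilities for overlaps >= the observed one.
    numerator = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(k_obs, min(n_a, n_b) + 1)
    )
    return k_obs, numerator / comb(n_total, n_b)


# Toy example (made-up data): 20 testable sites, 8 modified in each sample.
universe = range(1, 21)
sample_a = {1, 2, 3, 4, 5, 6, 7, 8}
sample_b = {5, 6, 7, 8, 9, 10, 11, 12}
k, p = overlap_fisher_p(sample_a, sample_b, universe)
print(f"overlap = {k}, one-sided p = {p:.4f}")
```

With real data the universe is usually far larger than either site list, so even a moderate overlap can be highly significant. That is why the Venn diagram is worth reporting alongside the p-value.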


Thanks a lot for your suggestions. I agree that it is better to use the read counts per gene rather than mRNA abundance to calculate the Pearson correlation. However, I am trying to reproduce a paper (https://doi.org/10.1016/j.molp.2018.01.008), in which they analyzed the Pearson correlation of mRNA abundance in Fig. 1(B).

Now I understand that this is acceptable, but that using the read counts per gene would be much better.

Besides, I have got continuous variables for my experiment. Can I calculate the Spearman correlation using cor() in R?


From the figure, I can tell that they calculated the correlation on the log of mRNA abundance, which makes more sense. I haven't read the paper in detail, but it is unclear to me what they call "mRNA abundance". From what I see, it could very well be read counts per gene.
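To illustrate why the log matters, here is a small Python sketch comparing Pearson correlation on raw counts and on log-transformed counts. The counts are made up, `log2(count + 1)` is only a simple stand-in (it is not the rlog transformation, and no library-size normalization is applied), and the helper names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def log2_plus_one(counts):
    """log2(count + 1); the pseudocount keeps zero counts finite."""
    return [math.log2(c + 1) for c in counts]


# Made-up read counts per gene for two replicates; the last gene is an
# extremely highly expressed outlier, typical of a long-tailed distribution.
rep1 = [0, 3, 12, 25, 40, 80, 150, 50000]
rep2 = [1, 10, 5, 60, 20, 200, 90, 48000]

r_raw = pearson(rep1, rep2)
r_log = pearson(log2_plus_one(rep1), log2_plus_one(rep2))
print(f"Pearson on raw counts:   {r_raw:.4f}")
print(f"Pearson on log2(x + 1):  {r_log:.4f}")
```

On the raw counts the single outlier gene drives the coefficient to nearly 1 regardless of how the other genes agree. After the log transform, every gene contributes on a comparable scale.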

Besides, I have got continuous variables for my experiment. Can I calculate the Spearman correlation using cor() in R?

Then yes, in principle you can: cor(x, y, method = "spearman") does that.
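For anyone curious what a Spearman coefficient actually computes, here is a minimal Python sketch: rank both vectors (ties get the average rank), then take the Pearson correlation of the ranks. The function names are hypothetical and this is an illustration of the idea, not a replacement for R's cor().

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Any monotonic relationship gives a perfect Spearman coefficient,
# even when the relationship is far from linear.
print(spearman([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
```

Because only the ranks matter, the result is unaffected by a log transformation or by a few extremely large values.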


Thanks a lot for your great help!


Please acknowledge the response that helped you solve the issue by upvoting and possibly bookmarking it.

3.5 years ago

How to analyze the Pearson correlation coefficient of mRNA abundance between two biological replicates? Can I use deeptools?

Yes


Thanks a lot for your great help!
