Hello everyone
I have a question: I did a transcriptome analysis on a cassava genotype harvested at 24 hr and 72 hr, compared with non-infested healthy plants. I had 3 biological replicates for each time point, analysed with TopHat/Cufflinks/Cuffdiff. I used the FPKM values from Cufflinks for the 3 biological replicates to test the correlation between replicates in R and in Excel. However, I get a negative correlation, which I think is because the FPKM values differ greatly between replicates. Can you advise how I can calculate the correlation between the replicates?
Or is there another method I could use?
Molemi.
If I understand correctly, you've calculated Pearson's correlation coefficient between replicates and obtained negative values. Although one generally expects replicates to have a high positive correlation, this is not always the case. Low correlation between replicates can be used as a quality control criterion, e.g. to remove bad or failed replicates.
Have you looked at the data, i.e. plotted the replicates against each other? This would make it easier to spot whether the problem is caused by outliers. You may want to use raw counts for this.
Did you use rank correlation?
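
To show why rank correlation may be worth trying, here is a minimal sketch in plain Python. The expression values are made up for illustration (they are not from the cassava data), and the function names are just ones chosen for this example. It compares Pearson's coefficient with Spearman's rank correlation, computed here as Pearson's coefficient on the ranks, for two hypothetical replicates that agree on every gene except one extreme outlier.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical FPKM values for 8 genes in two replicates (illustrative only).
# The last gene is an extreme outlier: very high in one replicate, zero in the other.
rep1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 1000.0]
rep2 = [1.2, 1.9, 3.1, 4.2, 4.8, 6.1, 7.3, 0.0]

r_pearson = pearson(rep1, rep2)
r_spearman = spearman(rep1, rep2)

print("Pearson: ", round(r_pearson, 3))
print("Spearman:", round(r_spearman, 3))
```

In this toy example the single outlier is enough to make Pearson's coefficient negative, while the rank-based coefficient stays positive because the outlier can only shift ranks, not dominate the sums. Whether that is what is happening in your data is something only a plot of the replicates can confirm.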