Hi, sorry if this is a really basic question, but I'm new to R and bioinformatics. After spending A LOT of time generating my RNA-seq data with the Tuxedo suite, I need to compare 2 datasets generated by different methods. I am not sure what the best method is, but in papers I often see correlation scatter plots used. Are there software packages or methods that can do this? My computer is basic (a laptop) and can crash when it opens too many large files.
Thank you for your advice and help
Hi Sysbiocoder,
Thank you for the useful link. Of the correlation methods mentioned (Spearman vs. Pearson), which would be the method of choice for comparing 2 datasets of different sequencing depths (i.e. I also have MiSeq vs. HiSeq data from the same types of cells)?
I have read online resources, but they generally say "there is no correct method". Which would make better sense for this sort of comparison?
Thanks
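Spearman rank correlation is a non-parametric measure of monotonic association between two variables; it works on ranks, so it suits ordinal data as well as skewed continuous data. Pearson's r measures the linear relationship between two variables, and its significance test assumes both are roughly normally distributed. A large sample size does not make the data themselves normal (the central limit theorem concerns sample means), and expression values are usually strongly right-skewed, so a log transform such as log2(x + 1) is commonly applied first. As far as I know you can use Pearson's r on the log-transformed values, with Spearman as a check (experts, please correct me if I am wrong).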
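To make the difference concrete, here is a minimal sketch in plain Python (standard library only, so it stays light on a laptop). The function names and the six expression values are made up for illustration and are not from any real dataset. In R the equivalent is the built-in `cor(x, y, method = "pearson")` or `method = "spearman"`.

```python
import math

def pearson(x, y):
    """Pearson's r: co-variation of x and y divided by the product of their spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical expression values for six genes in two samples (illustration only).
sample_a = [0.0, 3.0, 12.0, 40.0, 150.0, 9000.0]
sample_b = [1.0, 2.0, 20.0, 35.0, 400.0, 5000.0]

# log2(x + 1) keeps zeros defined and compresses the highly expressed genes.
log_a = [math.log2(v + 1) for v in sample_a]
log_b = [math.log2(v + 1) for v in sample_b]

print("Pearson, raw: ", round(pearson(sample_a, sample_b), 3))
print("Pearson, log2:", round(pearson(log_a, log_b), 3))
print("Spearman:     ", round(spearman(sample_a, sample_b), 3))
```

Spearman gives the same value before and after the log transform because the transform does not change the ranks, whereas Pearson on raw values is driven largely by the most highly expressed genes.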
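But before running the correlation analysis, you have to normalize your data to remove bias due to sequencing depth, for example by library-size scaling such as counts per million, or by using the FPKM values that the Tuxedo suite reports. ComBat (in the sva R package) is designed to adjust for batch effects rather than depth, so it is more relevant to the MiSeq vs. HiSeq platform difference.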
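As a rough illustration of depth normalization, here is a minimal counts-per-million sketch in plain Python. The counts and variable names are hypothetical, and this is only the simplest library-size scaling, not what ComBat does.

```python
def counts_per_million(counts):
    """Scale one sample's raw read counts so that they sum to one million."""
    total = sum(counts)
    if total == 0:
        raise ValueError("library has no reads")
    return [c * 1_000_000 / total for c in counts]

# Hypothetical counts for the same three genes; the second library
# is sequenced ten times deeper than the first (illustration only).
miseq_counts = [50, 150, 800]
hiseq_counts = [500, 1500, 8000]

miseq_cpm = counts_per_million(miseq_counts)
hiseq_cpm = counts_per_million(hiseq_counts)

print(miseq_cpm)
print(hiseq_cpm)
```

Multiplying a whole sample by a constant leaves both Pearson and Spearman unchanged, so this step matters mostly when a log2(x + 1) transform is applied afterwards, or when the points of the scatter plot should sit near the diagonal.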