Comparing 2 different sets of RNAseq data - correlation
8.0 years ago
Genosa ▴ 160

Hi, sorry if this is a really basic question, but I'm new to R and bioinformatics. After spending a lot of time generating my RNA-seq data with the Tuxedo suite, I need to compare two datasets generated by different methods. I am not sure which method is best, but in papers I often see correlation scatter plots used. Are there software packages or methods that can do this? My computer is basic (a laptop) and can crash when it opens too many large files.

Thank you for your advice and help

RNA-Seq • 9.4k views
8.0 years ago
sysbiocoder ▴ 180

First, make a table of FPKM values. There are several packages available in R; I personally use "corrplot". The link below explains the basics of creating a correlation plot: http://www.sthda.com/english/wiki/correlation-matrix-a-quick-start-guide-to-analyze-format-and-visualize-a-correlation-matrix-using-r-software
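As a rough illustration of the "table of FPKM values" step, here is a minimal base-R sketch. The data frames `hiseq` and `miseq`, the column names `gene_id` and `FPKM`, and all the numbers are made up for the example; real Tuxedo output would need to be read in and its columns matched to this layout first.

```r
# Toy stand-ins for two per-gene FPKM tables (hypothetical names and values).
hiseq <- data.frame(gene_id = c("g1", "g2", "g3", "g4", "g5"),
                    FPKM    = c(0.0, 12.5, 250.0, 3.2, 48.0))
miseq <- data.frame(gene_id = c("g1", "g2", "g3", "g5", "g6"),
                    FPKM    = c(0.1, 10.0, 300.0, 40.0, 7.0))

# Keep only genes present in both datasets; one FPKM column per dataset.
fpkm <- merge(hiseq, miseq, by = "gene_id", suffixes = c("_hiseq", "_miseq"))

# Log-transform with a pseudocount so that zeros are defined and a few
# highly expressed genes do not dominate the correlation.
log_fpkm <- log2(as.matrix(fpkm[, c("FPKM_hiseq", "FPKM_miseq")]) + 1)
rownames(log_fpkm) <- fpkm$gene_id

# Pairwise correlation matrix between the two datasets.
cor_mat <- cor(log_fpkm, method = "pearson")
print(round(cor_mat, 3))
```

The resulting `cor_mat` is the kind of matrix that a plotting package such as corrplot takes as input.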


Hi Sysbiocoder,

Thank you for the useful link. Of the correlation methods mentioned (Spearman vs. Pearson), which would be the method of choice for comparing two datasets of different sequencing depths? (I also have MiSeq vs. HiSeq data from the same types of cells.)

I have read online resources, but they generally say "there is no correct method". Which would make better sense for this sort of comparison?

Thanks


Spearman rank correlation is a non-parametric measure of the monotonic association between two variables; it works on ranks, so it only needs the values to be orderable. Pearson's r measures the linear relationship between two variables, and significance tests on it assume that both variables are approximately normally distributed. When the sample size is large, this assumption is commonly relaxed by appeal to the central limit theorem. As far as I know, you can use Pearson's r correlation (experts, please correct me if I am wrong).

But before running the correlation analysis, you have to normalize your data to remove bias due to sequencing depth (check the ComBat method, available in an R package, which adjusts for batch effects between datasets).
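To make the Pearson vs. Spearman difference concrete, here is a small simulated sketch in base R. Everything in it is an assumption for illustration: the data are random, and a constant factor of 0.1 stands in for a lower-depth run, which is a simplification of real depth effects.

```r
set.seed(1)

# Simulated "true" expression for 1000 genes (log-normal, illustrative only).
truth <- rlnorm(1000, meanlog = 2, sdlog = 1.5)

# Two noisy measurements of the same genes; the second is on a 10x lower scale.
deep    <- truth * exp(rnorm(1000, sd = 0.2))
shallow <- 0.1 * truth * exp(rnorm(1000, sd = 0.2))

# Pearson on raw values is driven by the few most highly expressed genes;
# on log2(x + 1) values it reflects the bulk of the genes more evenly.
pearson_raw <- cor(deep, shallow, method = "pearson")
pearson_log <- cor(log2(deep + 1), log2(shallow + 1), method = "pearson")

# Spearman uses ranks only, so a monotonic transform such as log2(x + 1)
# leaves it unchanged.
spearman_raw <- cor(deep, shallow, method = "spearman")
spearman_log <- cor(log2(deep + 1), log2(shallow + 1), method = "spearman")

print(c(pearson_raw = pearson_raw, pearson_log = pearson_log,
        spearman_raw = spearman_raw, spearman_log = spearman_log))
```

Note that Pearson's r is itself unaffected by a purely constant scaling of one variable, so a uniform depth factor alone does not change it; the choice of transformation matters more in this toy setup.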

8.0 years ago
sysbiocoder ▴ 180

You can use an R package for that.

What form is your RNA-seq data in? Is it expression data?

8.0 years ago
Genosa ▴ 160

Hi Sysbiocoder,

Yes, my data is available as log2 fold change or FPKM. May I know which R package I can use? Sorry, I am new to R, so I'll need some guidance on where to start.

8.0 years ago
sysbiocoder ▴ 180

Check this package for normalization

http://bioconductor.org/packages/release/bioc/html/sva.html

For correlation analysis there are many packages available; you can use corrplot: https://cran.r-project.org/web/packages/corrplot/vignettes/corrplot-intro.html
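A minimal sketch of plotting a correlation matrix with corrplot might look like the following. The matrix `expr`, its column names, and the simulated values are all hypothetical; the code falls back to a base-graphics scatter-plot matrix if corrplot is not installed, and writes to a PDF in a temporary directory rather than an on-screen window.

```r
set.seed(42)

# Hypothetical log2(FPKM + 1) matrix: rows are genes, columns are datasets.
base_expr <- rnorm(200, mean = 5, sd = 2)
expr <- cbind(hiseq_A = base_expr + rnorm(200, sd = 0.5),
              hiseq_B = base_expr + rnorm(200, sd = 0.5),
              miseq_A = base_expr + rnorm(200, sd = 1.0))

# Correlation matrix between datasets (columns).
cor_mat <- cor(expr, method = "spearman")

# Write the figure to a file instead of opening a plot window.
out_file <- file.path(tempdir(), "correlation.pdf")
pdf(out_file, width = 6, height = 6)
if (requireNamespace("corrplot", quietly = TRUE)) {
  # Upper triangle only, circles scaled and coloured by correlation.
  corrplot::corrplot(cor_mat, method = "circle", type = "upper")
} else {
  # Fallback: pairwise scatter plots of the datasets with base graphics.
  pairs(expr, pch = ".")
}
invisible(dev.off())
```

The `pairs()` fallback also gives the pairwise scatter plots often shown in papers, one panel per pair of datasets.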
