6.6 years ago
saamar.rajput
I have read counts for RNAseq data for human and bacteria with three replicates each. I have transformed the absolute read counts to Z-scores and now I want to do correlation analysis between the Z-scores of the human genes and the bacterial genes.
My data looks like this
head(Human)
R1 R2 R3
ENSG00000000003.12 -0.005805397 -0.004682964 -0.004462534
ENSG00000000005.5 -0.010862624 -0.008231991 -0.009366348
head(Bacteria)
R1 R2 R3
BP0001 0.4561311 0.317043968 0.345781788
BP0002 -0.1443091 -0.158057523 -0.073588398
BP0003 0.2459770 -0.004348217 -0.013678371
I want to correlate every human gene with every bacterial gene. I used the following command
ct = cor.test(Human, Bacteria, method = "pearson")
I have two questions. First, I am not sure whether this would do what I want. Second, before I even get that far, I get an error because the two data frames do not have the same number of rows.
Error in cor.test.default(Human, Bacteria, method = "pearson") :
'x' and 'y' must have the same length
Can somebody tell me how to use the cor.test function with data frames of unequal size?
Curious: what kind of experiment is this?
I am trying to correlate every single human gene with every single bacterial gene by Pearson correlation, using the read counts of the genes.
When you correlate, you have to have the same number of observations in the two vectors that you are correlating. You cannot just correlate two arbitrary data matrices.
The only way that you can do this is by correlating each row of Human with each row of Bacteria, but then you only have 3 observations going into each correlation test and the statistics will be virtually meaningless and not credible.
Please try to come up with a better way of comparing human and bacterial genes by doing a literature search on the topic. Your analysis would be better if you had more observations.
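For reference, a minimal sketch of the row-by-row correlation described above, assuming the toy values from the question (the names `cross_cor` and `ct` are only illustrative). `cor()` correlates columns, so both tables are transposed to put genes in columns; `cor.test()` is then run on a single pair of equal-length vectors.

```r
# Toy data copied from the question (Z-scores, 3 replicates per gene)
Human <- data.frame(
  R1 = c(-0.005805397, -0.010862624),
  R2 = c(-0.004682964, -0.008231991),
  R3 = c(-0.004462534, -0.009366348),
  row.names = c("ENSG00000000003.12", "ENSG00000000005.5")
)
Bacteria <- data.frame(
  R1 = c(0.4561311, -0.1443091, 0.2459770),
  R2 = c(0.317043968, -0.158057523, -0.004348217),
  R3 = c(0.345781788, -0.073588398, -0.013678371),
  row.names = c("BP0001", "BP0002", "BP0003")
)

# cor() works column-wise, so transpose: replicates in rows, genes in columns.
# The result has one row per human gene and one column per bacterial gene.
cross_cor <- cor(t(as.matrix(Human)), t(as.matrix(Bacteria)), method = "pearson")
print(dim(cross_cor))

# cor.test() needs two vectors of the same length, i.e. one gene pair at a time.
ct <- cor.test(as.numeric(Human[1, ]), as.numeric(Bacteria[1, ]), method = "pearson")
print(ct$estimate)   # same value as cross_cor[1, 1]
print(ct$parameter)  # degrees of freedom: n - 2 = 1 with three replicates
```

With only three replicates each test has a single degree of freedom, which is why the p-values from such a comparison carry very little weight.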
Can you confirm biological logic behind this comparison? Something that is happening in bacteria correlates to human gene expression (or vice-versa)?
To identify bacterial and human genes with similar expression kinetics
You can combine the two data frames using
human.bacteria <- rbind(Human, Bacteria)
and then do the correlation analysis on the combined table.
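One possible reading of this suggestion, as a sketch rather than a definitive recipe: since `cor()` correlates columns, the combined data frame has to be transposed first. The resulting matrix also contains human-human and bacteria-bacteria pairs, so the cross-species block is subset out afterwards. The toy values are from the question; `all_cor` and `cross` are illustrative names.

```r
# Toy data copied from the question
Human <- data.frame(
  R1 = c(-0.005805397, -0.010862624),
  R2 = c(-0.004682964, -0.008231991),
  R3 = c(-0.004462534, -0.009366348),
  row.names = c("ENSG00000000003.12", "ENSG00000000005.5")
)
Bacteria <- data.frame(
  R1 = c(0.4561311, -0.1443091, 0.2459770),
  R2 = c(0.317043968, -0.158057523, -0.004348217),
  R3 = c(0.345781788, -0.073588398, -0.013678371),
  row.names = c("BP0001", "BP0002", "BP0003")
)

# Stack the two tables; this works because both have columns R1, R2, R3
human.bacteria <- rbind(Human, Bacteria)

# Transpose so that genes are columns, then correlate all genes with all genes
all_cor <- cor(t(as.matrix(human.bacteria)), method = "pearson")

# Keep only the human-vs-bacteria block
cross <- all_cor[rownames(Human), rownames(Bacteria), drop = FALSE]
print(dim(all_cor))
print(dim(cross))
```

For real data with tens of thousands of human genes, the full all-by-all matrix can get very large, so computing only the cross block directly with `cor(t(Human), t(Bacteria))` may be the more practical choice.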