Hi, I have two different normalised DEG data sets, and I am trying to determine the Pearson correlation between these lncRNA and mRNA data sets. I tried the cor.test function in R, but it reads the gene list in the first column and I get an "x value should be numeric" error.
I read the files as follows:
mRNA <- read.delim("mRNA.txt", header=TRUE, sep="", row.names=1)
lncRNA <- read.delim("lncRNA.txt", header=TRUE, sep="", row.names=1)
When I check the data, the first column is not included:
cor.test(mRNA,lncRNA, method = c("pearson"))
Please help me, it is an urgent issue and I am lost among all the posts.
A B
gene1 5.7 8
gene2 2 4
Convert your datasets to numeric before running cor.test. Also, how are you reading "delimited" datasets with no delimiter? Are these single-column files?
Thank you for your reply. The two files are txt files with 36 columns each, but their row numbers are different, and it gives a "length of x and y must be equal" error.
I converted them as follows:
Is there any way to tell R to check the gene names and perform the Pearson correlation between common genes? Or should I find the common ones first and then use the cor.test function?
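A minimal sketch of the "common genes" idea asked about here, using hypothetical toy tables in place of the real files. It assumes both tables use row names as gene IDs and share the same sample columns. Note that lncRNA and mRNA tables often have no gene IDs in common at all, in which case the intersection is empty and this approach does not apply.

```r
# Toy stand-ins for the two tables (hypothetical values); row names are gene IDs.
mRNA <- data.frame(A = c(5.7, 2, 3.1), B = c(8, 4, 6.5),
                   C = c(1.2, 9, 4.4), D = c(3.3, 6, 2.8),
                   row.names = c("gene1", "gene2", "gene3"))
lncRNA <- data.frame(A = c(2.5, 7.1), B = c(3, 1.9),
                     C = c(8.8, 0.4), D = c(5.1, 2.2),
                     row.names = c("gene2", "gene4"))

# Gene IDs present in both tables.
common <- intersect(rownames(mRNA), rownames(lncRNA))

# Subset both tables to the shared genes, in the same order, as numeric matrices.
m <- as.matrix(mRNA[common, , drop = FALSE])
l <- as.matrix(lncRNA[common, , drop = FALSE])
stopifnot(is.numeric(m), is.numeric(l))

# cor.test() takes two numeric vectors of equal length, so test one gene at a
# time: here, the values of "gene2" across the four samples in each table.
res <- cor.test(m["gene2", ], l["gene2", ], method = "pearson")
res$estimate
```

Passing whole data frames to cor.test is what triggers the numeric error; each call needs a single pair of numeric vectors.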
If there are 36 columns, they must be separated by something. Why is your sep an empty string?
Because when I checked, the txt files did not appear to be separated by anything, so I left it empty. Actually, when I check the input data in R it looks okay.
How did you check this? Did you just eyeball it or did you use a program to verify this? How can there be multiple columns without a separator between columns?
As in, it shows multiple columns?
I just opened the tables in R and they show multiple columns; all columns are separated. I also checked with sep="\t" and it is the same.
I think my problem here is the number of genes (the length) in each.
The lncRNAs have 2000 obs. and 36 variables; the mRNAs have 8000 obs. and 36 variables.
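A small sketch of how the column structure could be checked programmatically rather than by eye. The file written here is a hypothetical two-sample example in the same layout as the table shown earlier. For context, in read.table (and therefore read.delim when sep is overridden), sep="" means "any run of whitespace", which is why both sep="" and sep="\t" can parse a tab-separated file.

```r
# Write a small whitespace-separated file to a temporary path (hypothetical data).
path <- tempfile(fileext = ".txt")
writeLines(c("A B", "gene1 5.7 8", "gene2 2 4"), path)

# Number of fields on each line. The header has one field fewer than the data
# rows because the gene-name column has no header.
n_fields <- count.fields(path, sep = "")
print(n_fields)

# Read the file the same way as in the question, then inspect the result.
dat <- read.delim(path, header = TRUE, sep = "", row.names = 1)
print(dim(dat))                      # rows and columns actually parsed
is_num <- sapply(dat, is.numeric)    # TRUE for each column read as numbers
print(is_num)
```

If every data line reports the same field count and every column is numeric, the files were parsed as intended and the remaining problem is the differing row counts.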
You're going to have to pick a subset of the larger vector for the correlation, which is not a great idea. Are you looking for differential expression of one set of RNA vs another?
"You're going to have to pick a subset of the larger vector for the correlation, which is not a great idea." I think it is not a good idea.
I have DE lncRNAs and DEGs (mRNAs) for the same patients, and I want to perform a correlation analysis (Pearson correlation coefficient) between these mRNAs and lncRNAs.
Why do you want to correlate them? You cannot correlate vectors of different lengths.
Which values are taken for the Pearson correlation?
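One possible reading, offered as a sketch rather than a definitive answer: if the 36 columns are the same patients in both tables, then for each lncRNA-mRNA pair the values being correlated are the per-patient expression values, so both vectors have one entry per patient regardless of how many genes each table holds. The toy matrices below are hypothetical (6 patients instead of 36, random values), and the names samples, r, and pair are illustrative only.

```r
set.seed(1)                    # reproducible toy data (hypothetical values)
samples <- paste0("P", 1:6)    # stand-ins for the shared patients

# Rows are genes and columns are patients, as in the tables described above.
lncRNA <- matrix(rnorm(3 * 6), nrow = 3,
                 dimnames = list(paste0("lnc", 1:3), samples))
mRNA <- matrix(rnorm(5 * 6), nrow = 5,
               dimnames = list(paste0("gene", 1:5), samples))

# Put the columns in the same patient order before pairing values.
mRNA <- mRNA[, colnames(lncRNA), drop = FALSE]

# cor() correlates columns, so transpose to make each gene a column.
# The result has one row per lncRNA and one column per mRNA.
r <- cor(t(lncRNA), t(mRNA), method = "pearson")
print(dim(r))

# A p-value for a single pair comes from cor.test() on two equal-length vectors.
pair <- cor.test(lncRNA["lnc1", ], mRNA["gene1", ], method = "pearson")
pair$estimate
pair$p.value
```

With the real data this would yield a 2000 x 8000 matrix of coefficients. Testing that many pairs would call for multiple-testing correction, which is outside this sketch.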