5.1 years ago
srdjanmasirevic2
Hello, I am trying to calculate the correlation coefficient between two columns of my data, but the script I wrote gives me a Python syntax error that I cannot figure out how to fix.
My code looks like this:
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (20.0, 10.0)
#READING data
data = pd.read_csv ('benchmarking.csv')
print (data.shape)
data.head()
#Collecting X and Y
X = data['logAUC'].values
Y = data['RMSD'].values
#Mean X and Y
mean_x = np.mean(X)
mean_y = np.mean(Y)
print (mean_x, mean_y)
#Total number of values
n = len(X)
# Using the formula to calculate b1 and b2
numer = 0
denom = 0
for i in range(m):
    numer += (X[i] - mean_x * (Y[i] - mean_y)
    denom += (X[i] - mean_x) ** 2
b1 = numer/denom
b0 = mean_y - (b1 * mean_x)
print (b1, b0)
This is the error I get:
denom += (X[i] - mean_x) ** 2
^
SyntaxError: invalid syntax
My input data looks like this:
   Protein name     logAUC  RMSD
0  Metaloellastase   47.96  0.61
1  FGF1              23.44  0.72
2  FKBP1A            38.98  1.16
3  UDP               15.45  0.58
4  MDM2              18.91  1.42
Your line starting
numer += ....
is missing a closing bracket after mean_x. The error message is misleading: Python continued onto the next line in search of the closing bracket, so the error appears to be on the denom... line.
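For reference, here is a minimal sketch of the corrected loop. It assumes the intended formula is the usual least-squares slope, sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2), and it uses the five sample rows from the question in place of benchmarking.csv. Two further changes are assumptions on my part: range(m) becomes range(n), because m is never defined in the script and would likely raise a NameError once the syntax error is fixed, and the np.corrcoef line is added because b1 is the regression slope, not the correlation coefficient itself.

```python
import numpy as np

# The five sample rows shown in the question (logAUC, RMSD),
# used here instead of reading benchmarking.csv.
X = np.array([47.96, 23.44, 38.98, 15.45, 18.91])
Y = np.array([0.61, 0.72, 1.16, 0.58, 1.42])

mean_x = np.mean(X)
mean_y = np.mean(Y)
n = len(X)

numer = 0.0
denom = 0.0
for i in range(n):  # n, not m: m is never defined in the original script
    # Closing bracket restored after mean_x
    numer += (X[i] - mean_x) * (Y[i] - mean_y)
    denom += (X[i] - mean_x) ** 2

b1 = numer / denom           # regression slope
b0 = mean_y - b1 * mean_x    # regression intercept
print(b1, b0)

# Pearson correlation coefficient (an addition, not in the original script)
r = np.corrcoef(X, Y)[0, 1]
print(r)
```

With the real data, the two np.array lines would go back to data['logAUC'].values and data['RMSD'].values as in the original script.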