
I am trying to implement a simple multivariate linear regression model without using any built-in machine learning libraries. So far, I have been able to get a training root mean squared error (RMSE) of about $2.93$, while the model from the normal (closed-form) equation produces a training RMSE of about $2.3$. I am looking for ways to improve my implementation of the gradient descent algorithm. Below is my implementation:

My gradient descent update looks like this: $\theta = \theta - \frac{\alpha}{2N} X (X\theta - Y)$, where $\theta$ is the parameter vector, $N$ is the number of training examples, $X$ is the input matrix, $Y$ is the target vector, and $\alpha$ is the step size.

def gradientDescent(self):
    for i in range(self.iters):
        # T = T - (alpha/2N) * X*(XT - Y)
        self.theta = self.theta - (self.alpha/len(self.X)) * np.sum(self.X * (self.X @ self.theta.T - self.Y), axis=0)
    return errors

I set $\alpha$ to $0.1$ and the number of iterations to 1000. The gradient descent reaches convergence at around 700-800 iterations (checked).

My error function looks like this:

def error_function(self):
    # Error function: (1/2N) * (XT - Y)^2 where T is theta
    error_values = np.power(((self.X @ self.theta.T) - self.Y), 2)
    return np.sum(error_values)/(2 * len(self.X))
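For completeness, an RMSE of the kind I quote above can be computed directly from the predictions; a minimal sketch, assuming the same self.X, self.theta, and self.Y layout as in the code above:

def rmse(self):
    # RMSE: sqrt( (1/N) * sum( (X @ theta^T - Y)^2 ) )
    residuals = (self.X @ self.theta.T) - self.Y
    return np.sqrt(np.sum(np.power(residuals, 2)) / len(self.X))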

I was expecting the training errors from gradient descent and the normal equation to turn out similar, but the difference is quite large. So, I wanted to know whether I am doing anything wrong.

PS: I have not normalized the data yet. Normalizing leads to a much lower RMSE ($\sim 0.22$).
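A minimal sketch of the kind of normalization meant here, plain z-score standardization, assuming X is a NumPy array of raw feature columns (variable names are illustrative):

import numpy as np

# scale each feature column to zero mean and unit variance
mu = X.mean(axis=0)
sigma = X.std(axis=0)
X_norm = (X - mu) / sigma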

  • Shouldn't that be $X$ transposed? $X^{T}(X\theta - Y)$.
    – Elmex80s
    Commented Nov 18, 2020 at 21:40
  • Yes, it would be $X^T(X\theta - Y)$. I have this habit of typing all vector equations in one go :P
    – MaJoR21
    Commented Nov 19, 2020 at 12:09
  • self.theta = self.theta - (self.alpha/len(self.X)) * np.sum(self.X * (self.X @ self.theta.T - self.Y), axis=0). There are a lot of errors in this line of code.
    – Elmex80s
    Commented Nov 19, 2020 at 16:14

1 Answer


That could be due to many different reasons. One possibility is that your gradient descent has not fully converged to the minimum of the cost function; note that the squared-error cost for linear regression is convex, so it cannot get stuck in a local minimum. You can try a different learning rate, more iterations, or a different initialization of the coefficients.

There might also be a problem in your code for updating the weights or calculating the gradient; as pointed out in the comments, the gradient involves $X^T$, not $X$.
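For reference, a minimal sketch of a vectorized update that uses the transpose, assuming X has a leading column of ones, theta is a 1-D parameter vector, and y is the target vector (these names are illustrative, not your exact attributes):

import numpy as np

def gradient_descent(X, y, alpha=0.1, iters=1000):
    # X is (N, d) with a bias column of ones; theta starts at zero
    theta = np.zeros(X.shape[1])
    n = len(X)
    for _ in range(iters):
        residual = X @ theta - y            # (N,) prediction errors
        gradient = X.T @ residual / n       # gradient of (1/2N) * ||X theta - y||^2
        theta = theta - alpha * gradient    # standard gradient descent step
    return theta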

However, I used both methods on a simple linear regression problem and got the same results, as follows:

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import make_regression

# generate regression dataset
X, y = make_regression(n_samples=100, n_features=1, noise=30)

def cost_MSE(y_true, y_pred):
    ''' Cost function '''
    # Shape of the dataset
    n = y_true.shape[0]
    # Error
    error = y_true - y_pred
    # Cost
    mse = np.dot(error, error) / n
    return mse

def cost_derivative(X, y_true, y_pred):
    ''' Compute the derivative of the loss function '''
    # Shape of the dataset
    n = y_true.shape[0]
    # Error
    error = y_true - y_pred
    # Derivative
    der = -2 / n * np.dot(X, error)
    return der

# Let's run an example
X_new = np.concatenate((np.ones(X.shape), X), axis=1)
learning_rate = 0.1
X_new_T = X_new.T
n_iters = 100
mse = []

# initialize the weight vector
alpha = np.array([0, np.random.rand()])

for _ in range(n_iters):
    # Compute the predicted y
    y_pred = np.dot(X_new, alpha)
    # Compute the MSE
    mse.append(cost_MSE(y, y_pred))
    # Compute the derivative
    der = cost_derivative(X_new_T, y, y_pred)
    # Update the weight
    alpha -= learning_rate * der

alpha

For gradient descent, the coefficients were:

array([-3.36575322, 28.06370831]) 

Here is the code for the closed-form solution:

np.dot(np.linalg.inv(np.dot(X_new_T,X_new)), np.dot(X_new_T, y)) 

And the coefficients for the closed-form solution:

array([-3.36575322, 28.06370831]) 

As the coefficients are equal, the RMSE, MSE, and $R^2$ are equal as well.
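As a side note on the closed-form step: explicitly inverting $X^T X$ can be numerically unstable when the design matrix is ill-conditioned; a least-squares solver returns the same coefficients more robustly. A minimal sketch using the same X_new and y as above:

import numpy as np

# solve the least-squares problem without forming an explicit inverse
coef, residuals, rank, sv = np.linalg.lstsq(X_new, y, rcond=None)
print(coef)  # should match the coefficients obtained above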

