Implementing logistic regression with L2 regularization from scratch to classify two circular datasets.
Since circular datasets are not linearly separable, the features must first be mapped into a higher-dimensional space; for instance, a polynomial feature map can lift the 2 original features into 32 dimensions.
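As a sketch of what such a map looks like, the polynomial feature map implied by the `transform` method below sends each point to every monomial of total degree 1 through $d$:

$$\phi(x_1, x_2) = \left(x_1,\ x_2,\ x_1^2,\ x_1 x_2,\ x_2^2,\ \ldots,\ x_1 x_2^{d-1},\ x_2^d\right)$$

For degree $d$ this produces $\sum_{i=1}^{d}(i+1) = d(d+3)/2$ features.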
Here is the implementation of logistic regression with L2 regularization built from scratch.
```python
import numpy as np


class LogisticRegression:
    def __init__(self, degree, learning_rate, iterations, Lambda):
        self.degree = degree                # maximum degree of the polynomial feature map
        self.learning_rate = learning_rate
        self.iterations = iterations
        self.Lambda = Lambda                # L2 regularization strength

    def transform(self, X):
        # Map (x1, x2) to every monomial x1^(i-j) * x2^j of total degree i = 1..degree.
        X_transformed = []
        x1 = X[:, 0].reshape(X.shape[0], 1)
        x2 = X[:, 1].reshape(X.shape[0], 1)
        for i in range(1, self.degree + 1):
            for j in range(i + 1):
                X_transformed.append((x1 ** (i - j)) * (x2 ** j))
        return np.hstack(X_transformed)

    def sigmoid(self, z):
        return 1 / (1 + np.exp(-z))

    def h_theta(self, X, theta):
        # Hypothesis: sigmoid of the linear combination X . theta.
        return self.sigmoid(X.dot(theta))

    def scale_features(self, X, mode='train'):
        # Standardize features; the mean and standard deviation are computed on
        # the training set and reused unchanged at test time.
        if mode == 'train':
            self.mean = np.mean(X, axis=0)
            self.sd = np.std(X, axis=0)
        return (X - self.mean) / self.sd

    def batch_gradient_descent(self):
        m = len(self.X_train)
        theta = np.zeros((self.X_train.shape[1], 1))
        for _ in range(self.iterations):
            # Regularized gradient: (1/m) * (X^T (h - y) + lambda * theta).
            # Note that this penalizes the bias term as well; many formulations exclude it.
            gradients = (self.X_train.T.dot(self.h_theta(self.X_train, theta) - self.y_train)
                         + self.Lambda * theta) / m
            theta -= self.learning_rate * gradients
        return theta

    def fit(self, X_train, y_train):
        X_transformed = self.transform(X_train)
        X_scaled = self.scale_features(X_transformed)
        # Prepend a column of ones for the bias term.
        self.X_train = np.hstack((np.ones((X_scaled.shape[0], 1)), X_scaled))
        self.y_train = np.asarray(y_train).reshape(-1, 1)
        self.theta = self.batch_gradient_descent()

    def predict(self, X_test):
        X_transformed = self.transform(X_test)
        X_scaled = self.scale_features(X_transformed, mode='test')
        X_bias = np.hstack((np.ones((X_scaled.shape[0], 1)), X_scaled))
        return np.where(self.h_theta(X_bias, self.theta) > 0.5, 1.0, 0.0)
```
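A minimal usage sketch follows; the toy data and hyperparameter values here are illustrative assumptions, not taken from the project.

```python
import numpy as np

# Toy, non-linearly-separable task: label points inside a circle of radius 2 as class 1.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 4).astype(float)

# Hyperparameters are assumptions chosen for illustration.
model = LogisticRegression(degree=3, learning_rate=0.1, iterations=5000, Lambda=0.01)
model.fit(X, y)
accuracy = (model.predict(X) == y.reshape(-1, 1)).mean()
print(f"training accuracy: {accuracy:.1%}")
```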
The first dataset consists of two clusters of circular datapoints, both centered at [1.5, 0]: the radii of the first cluster range from 4 to 9 and those of the second from 0 to 6. Below is a scatter plot of the first dataset.
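The exact generation code is not shown in the write-up; below is a hedged sketch that matches the description (the centers and radius ranges come from the text, while the uniform sampling scheme, sample counts, and label assignment are assumptions).

```python
import numpy as np

rng = np.random.default_rng(42)

def ring(center, r_min, r_max, n):
    """Sample n points at a uniform angle, with radius drawn from [r_min, r_max]."""
    angle = rng.uniform(0, 2 * np.pi, n)
    radius = rng.uniform(r_min, r_max, n)
    return np.column_stack((center[0] + radius * np.cos(angle),
                            center[1] + radius * np.sin(angle)))

cluster1 = ring([1.5, 0], 4, 9, 200)  # first cluster: radii 4 to 9
cluster2 = ring([1.5, 0], 0, 6, 200)  # second cluster: radii 0 to 6
X = np.vstack((cluster1, cluster2))
y = np.concatenate((np.zeros(200), np.ones(200)))
```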
Here are the decision boundaries fitted for polynomial degrees 1 to 9 (a plotting sketch follows below).
(Decision-boundary plots for degrees 1 through 9, arranged in a 3×3 grid of panels.)
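Each boundary panel can be produced along these lines (plotting details and hyperparameters are assumptions; `X` and `y` come from the dataset sketch above):

```python
import matplotlib.pyplot as plt

model = LogisticRegression(degree=2, learning_rate=0.1, iterations=5000, Lambda=0.01)
model.fit(X, y)

# Evaluate the classifier on a dense grid and shade the predicted regions.
xx, yy = np.meshgrid(np.linspace(-9, 12, 300), np.linspace(-10, 10, 300))
grid = np.column_stack((xx.ravel(), yy.ravel()))
zz = model.predict(grid).reshape(xx.shape)

plt.contourf(xx, yy, zz, alpha=0.3)
plt.scatter(X[:, 0], X[:, 1], c=y, s=10)
plt.title("Decision boundary (degree 2)")
plt.show()
```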
Here are the accuracy scores for the different polynomial degrees of the feature map.
Degree | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|
Accuracy score | 63.3% | 77.5% | 76.6% | 76.6% | 76.6% | 76.6% | 76.6% | 77.5% | 77.5% |
The second dataset consists of two clusters: the datapoints in the first cluster are drawn from a normal distribution with mean [1, 0] and standard deviation 1, while the second cluster contains circular datapoints centered at [1.5, 0] with radii ranging from 2 to 6. Below is a scatter plot of the second dataset.
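A matching sketch for the second dataset, reusing the `ring` helper and `rng` from the first sketch (the distribution parameters are from the text; sample counts and label assignment are assumptions):

```python
cluster1 = rng.normal(loc=[1, 0], scale=1.0, size=(200, 2))  # Gaussian cluster, mean [1, 0], std 1
cluster2 = ring([1.5, 0], 2, 6, 200)                         # ring centered at [1.5, 0], radii 2 to 6
X = np.vstack((cluster1, cluster2))
y = np.concatenate((np.zeros(200), np.ones(200)))
```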
Here are the decision boundaries fitted for polynomial degrees 1 to 9.
(Decision-boundary plots for degrees 1 through 9, arranged in a 3×3 grid of panels.)
Here are the accuracy scores for the different polynomial degrees of the feature map.
Degree | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|
Accuracy score | 63.3% | 77.5% | 76.6% | 76.6% | 76.6% | 76.6% | 76.6% | 77.5% | 77.5% |
- Course: Machine Learning [ECE 501]
- Semester: Spring 2023
- Institution: School of Electrical & Computer Engineering, College of Engineering, University of Tehran
- Instructors: Dr. A. Dehaqani, Dr. Tavassolipour