Logistic Regression Model

Ram Rathi edited this page Dec 23, 2018 · 2 revisions

So far, we've seen the theory behind logistic regression and why we need it. Next, we should look at what kind of model we need to build to implement it. As usual, we're going to use only NumPy and no other external ML libraries.

Prerequisites

Make sure you've done the last 2 weeks of material. This doc depends heavily on linear regression, so if you haven't done that yet, there's no point in reading this. Go back, do that first, and then come here.

Steps Involved

  1. Get the data
  2. Multiply the data by the weights and add a bias (let's call this Z)
  3. Pass Z through our activation function to get A
  4. Calculate the loss function and adjust our weights
  5. Repeat

You may notice that this is very similar to last week's task, linear regression. The only added step is the activation function.

Thus, I'm going to assume that you already know how to split your dataset into training and testing sets, and have the data ready in two matrices, X and Y.

Go ahead and set a learning rate, initialize your weights and bias, and basically set up the first few steps of linear regression.
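As a concrete sketch of that setup (the shapes here are my assumption: X is (m, n) with m examples and n features, and Y is (m, 1)), steps 1 and 2 might look like this on toy data:

```python
import numpy as np

# Toy data standing in for your real dataset -- the shapes are an
# assumption: X is (m, n), m examples with n features; Y is (m, 1).
rng = np.random.default_rng(0)
m, n = 100, 3
X = rng.standard_normal((m, n))
Y = (X[:, :1] > 0).astype(float)   # dummy binary labels

learning_rate = 0.01               # a common starting point; tune as needed
W = np.zeros((n, 1))               # one weight per feature
b = 0.0                            # scalar bias

Z = np.matmul(X, W) + b            # step 2: data times weights, plus bias
print(Z.shape)                     # (100, 1) -- one Z per example
```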

Choosing our activation function

We'll be using the sigmoid activation function today. It's easy to implement and is also referred to as the logistic activation function. The main reason we use the sigmoid function is that its output lies between 0 and 1. That makes it especially useful for models where we have to predict a probability as the output: since probabilities only exist in the range 0 to 1, sigmoid is the right choice.

The function is differentiable. That means we can find the slope of the sigmoid curve at any point.
The function is monotonic, but its derivative is not.
The logistic sigmoid function can cause a neural network to get stuck during training, because its gradients become very small for large positive or negative inputs.

Its formula can be given as:

a = 1/(1 + e^(-Z))
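A minimal NumPy implementation of this function:

```python
import numpy as np

def sigmoid(Z):
    """Logistic (sigmoid) activation: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-Z))

# Sanity checks: 0 maps to 0.5, large positive inputs approach 1,
# and large negative inputs approach 0.
print(sigmoid(0.0))                           # 0.5
print(sigmoid(np.array([-10.0, 0.0, 10.0])))  # roughly [0.0000454, 0.5, 0.99995]
```

Since Z is a NumPy array in our model, `np.exp` applies the function element-wise and we get one activation per example for free.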

The chain rule

Before we begin with the optimization functions, let's rewind to the part in linear regression where we had to take derivatives. If you didn't get the maths behind it, make sure to check out this link:
https://www.khanacademy.org/math/ap-calculus-ab/ab-differentiation-2-new/ab-3-1a/v/chain-rule-introduction
It'll be crucial for understanding how we get our optimization functions (and also crucial for passing the semester).
Alright, moving on.

Optimization functions

So we know how to get dW and db: we differentiate J with respect to W to get dW, and similarly with respect to b to get db.
Note: J is the loss function.

We will do the same thing here. But when you differentiate, you will realize that dZ is no longer what it was in linear regression; it now depends on A through the sigmoid.
It is highly recommended that you work through the derivative yourself, so that you get a better understanding of what I'm talking about.
Thus, after solving for dZ, we get:
dZ = A - Y

Note: this equation for dZ only holds if we are using the sigmoid activation function. For any other activation function we will have to differentiate again and derive a new equation.
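If you want to check your own working, here is a sketch of that derivation (assuming the binary cross-entropy loss J = -(Y log(A) + (1 - Y) log(1 - A)), the standard loss for logistic regression):

dJ/dA = -Y/A + (1 - Y)/(1 - A)
dA/dZ = A(1 - A)   (the derivative of the sigmoid)
dZ = dJ/dA * dA/dZ = -Y(1 - A) + (1 - Y)A = A - Y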

Now that we know the value of dZ, we can simply plug it into our usual linear regression formulas:

dW = 1/m * np.matmul(X.T, dZ)
db = 1/m * np.sum(dZ)

That's it! Go ahead and train this model like you would a linear regression. This is literally all there is to logistic regression: just one simple added step. But the results you get differ greatly, and this is a much better model for real-world applications.
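Putting all the steps together, here's a minimal end-to-end sketch. The toy data is made up (and deliberately linearly separable); swap in your own X of shape (m, n) and Y of shape (m, 1):

```python
import numpy as np

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

# Toy, linearly separable data -- replace with your real X and Y.
rng = np.random.default_rng(1)
m, n = 200, 2
X = rng.standard_normal((m, n))
Y = (X[:, :1] + X[:, 1:] > 0).astype(float)

W = np.zeros((n, 1))
b = 0.0
learning_rate = 0.1

for _ in range(1000):
    Z = np.matmul(X, W) + b      # linear step, as in linear regression
    A = sigmoid(Z)               # the one extra step: activation
    dZ = A - Y                   # only valid for the sigmoid activation
    dW = np.matmul(X.T, dZ) / m  # gradient of the loss w.r.t. W
    db = np.sum(dZ) / m          # gradient of the loss w.r.t. b
    W -= learning_rate * dW
    b -= learning_rate * db

# Predict by thresholding the output probability at 0.5
predictions = (sigmoid(np.matmul(X, W) + b) > 0.5).astype(float)
print(np.mean(predictions == Y))  # training accuracy
```

On this easy toy data the accuracy should end up close to 1.0; on real data you'd measure it on the held-out test set instead.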

Neurons

In a lot of literature you may find things called nodes or neurons. These usually just refer to our logistic regression model. So one model, with one set of weights and one activation function, is called a node/neuron.
When you combine a bunch of them together, you get something called a neural network. This is way beyond the scope of this winter project, but it's one of the most fascinating things to come up in recent years, and it gives amazingly accurate prediction results.
If you finish this winter project well on time, you may have a look at neural networks too.

What are you expected to do?

Go ahead and actually try implementing this on your own. I know it looks kind of hard, especially if you didn't implement linear regression yet (in which case don't even bother reading this doc; go back and do that first).
We'll provide a demo to help you guys understand.

Make sure to watch:

https://www.youtube.com/watch?v=D8alok2P468
https://www.youtube.com/watch?v=NmjT1_nClzg [The whole series. If you manage to finish this series you'll be pro at regression]