Logistic Regression (Classification) — Mathematical intuition

Rohan Paris
5 min read · Mar 28, 2022


image credit: sefiks.com

Table of contents:
1. Need for Logistic Regression and math behind it
2. Logistic Regression for multiclass classification
3. Python library to implement Logistic Regression
4. References

Classification is a predictive modeling problem that involves assigning a class label to an example. The classification problem with two classes is known as Binary Classification and the problem with more than two classes is known as Multi-class Classification.

1. Need for Logistic Regression and math behind it

Suppose we plot training data points x with their binary class labels y. For such values of x and y, we can easily fit a best fit line using Linear Regression and classify the data points to the left side of the line as Class1 and the data points to the right side as Class2. This best fit line is also known as the Decision Boundary, and it is a property of the parameters θ, not of the training set.

Please refer to the following link for Linear Regression and its mathematical intuition:- https://parisrohan.medium.com/simple-linear-regression-mathematical-intuition-and-python-implementation-f8ca5c17a4c5

The problem arises when we have an outlier: the best fit line shifts to accommodate it.

With the outlier included, some of the data points that were previously classified as Class2 are now classified as Class1. Wrongly classifying data can be fatal in situations such as identifying whether a tumor is malignant or benign, so we need an algorithm that classifies the data more reliably.

Logistic Regression is a Machine Learning classification algorithm that is used to predict the probability of a categorical dependent variable. In logistic regression, the dependent variable is a binary variable that contains data coded as 1 (yes, success, etc.) or 0 (no, failure, etc.).

For Linear Regression, the hypothesis function is defined as

hθ(x) = θᵀx = θ₀ + θ₁x₁ + … + θₙxₙ

For Logistic Regression, this hypothesis function is modified by passing θᵀx through a squashing function g:

hθ(x) = g(θᵀx), where g(z) = 1 / (1 + e⁻ᶻ)

The function g is known as the Sigmoid function or Logistic function. Its output always lies between 0 and 1, which lets us read hθ(x) as a probability. It is graphically represented as follows.

sigmoid function ‘g(z)’

From the sigmoid function, it can be observed that:

i) We predict y=1 when hθ(x) = g(z) ≥ 0.5, which happens exactly when z = θᵀx ≥ 0.

ii) We predict y=0 when hθ(x) = g(z) < 0.5, which happens exactly when z = θᵀx < 0.
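
To make the decision rule concrete, here is a minimal NumPy sketch (my own illustration; the variable names are not from the original post):

import numpy as np

def sigmoid(z):
    # Logistic function g(z) = 1 / (1 + e^(-z)); output always lies in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# g(0) = 0.5, so the threshold g(z) >= 0.5 is equivalent to z >= 0
z = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
probs = sigmoid(z)                  # approx. [0.007, 0.269, 0.5, 0.731, 0.993]
preds = (probs >= 0.5).astype(int)  # [0, 0, 1, 1, 1]
print(probs, preds)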

1.1 Cost function for Logistic Regression

By substituting the new hypothesis function into the squared-error cost function of Linear Regression, we get a non-convex function of the parameter θ. The problem with this is that there are multiple local minima, and there is no guarantee that Gradient Descent will converge to the global minimum.

non-convex function when i=1

So we need a convex function of the parameter θ, which might look as follows.

convex function

In order to obtain a convex function, the cost function of Logistic Regression is modified as follows:

Cost(hθ(x), y) = −log(hθ(x)) if y = 1
Cost(hθ(x), y) = −log(1 − hθ(x)) if y = 0

This penalizes confident wrong predictions heavily: if y = 1 but hθ(x) → 0, the cost grows without bound, and likewise for y = 0 with hθ(x) → 1.

Since y can only take the values 0 or 1, the two cases can be combined into a single expression:

Cost(hθ(x), y) = −y·log(hθ(x)) − (1 − y)·log(1 − hθ(x))

and the overall cost over m training examples is

J(θ) = (1/m) Σᵢ₌₁ᵐ Cost(hθ(x⁽ⁱ⁾), y⁽ⁱ⁾)
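
As an illustration, the combined cost (log loss) can be computed in NumPy as below; the clipping step is an implementation detail I have added to avoid log(0), not part of the derivation above:

import numpy as np

def log_loss(h, y, eps=1e-12):
    # Mean of -y*log(h) - (1-y)*log(1-h) over the m examples
    h = np.clip(h, eps, 1 - eps)  # guard against log(0)
    return np.mean(-y * np.log(h) - (1 - y) * np.log(1 - h))

# Confident correct predictions give low cost; confident wrong ones blow up
y = np.array([1, 0, 1])
h = np.array([0.9, 0.1, 0.8])
print(log_loss(h, y))  # approx. 0.14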

To get θ we must minimize J(θ) using Gradient Descent, repeatedly applying the update θⱼ := θⱼ − α·∂J(θ)/∂θⱼ to every parameter simultaneously, where α is the learning rate.
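
Below is a self-contained sketch of batch gradient descent for logistic regression on synthetic data (the learning rate, iteration count, and data are illustrative choices of mine):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.1, iters=1000):
    # X is (m, n) with a leading column of 1s for the intercept term.
    # The gradient of J(θ) works out to (1/m)·Xᵀ(h − y), with h = g(Xθ).
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        h = sigmoid(X @ theta)
        theta -= alpha * (X.T @ (h - y)) / m
    return theta

# Tiny synthetic example: the label is 1 exactly when the feature is positive
rng = np.random.default_rng(0)
x = rng.normal(size=100)
X = np.column_stack([np.ones_like(x), x])  # add intercept column
y = (x > 0).astype(float)
theta = gradient_descent(X, y)
preds = (sigmoid(X @ theta) >= 0.5).astype(float)
print("training accuracy:", (preds == y).mean())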

1.2 Interpretation of hypothesis function output

The output of the hypothesis is interpreted as a probability: hθ(x) = P(y=1 | x; θ), the estimated probability that y=1 for input x, parameterized by θ. For example, hθ(x) = 0.7 means the model estimates a 70% chance that the example belongs to the positive class; correspondingly, P(y=0 | x; θ) = 1 − hθ(x) = 0.3.

2. Logistic Regression for multiclass classification

Binary classification models like logistic regression do not support multi-class classification natively and require meta-strategies. The One-vs-Rest strategy splits a multi-class classification into one binary classification problem per class. A binary classifier is then trained on each binary classification problem and predictions are made using the model that is the most confident.

Suppose we have a classification problem with three classes: Class1, Class2, and Class3.

We will train a separate logistic regression classifier hθ⁽ⁱ⁾(x) for each class i to predict the probability that y = i:

hθ⁽ⁱ⁾(x) = P(y = i | x; θ), for i = 1, 2, 3

To make a prediction on a new data point x, pick the class i that maximizes hθ⁽ⁱ⁾(x), as sketched below.
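
Here is a sketch of the One-vs-Rest idea built by hand from binary scikit-learn classifiers; the iris dataset is my illustrative choice (note that sklearn's LogisticRegression can also handle multiclass problems directly, and sklearn.multiclass.OneVsRestClassifier wraps this same pattern):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)  # three classes: 0, 1, 2

# One binary classifier per class: class i vs. everything else
classifiers = []
for i in np.unique(y):
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X, (y == i).astype(int))
    classifiers.append(clf)

# Predict by picking the class whose classifier is most confident
probs = np.column_stack([clf.predict_proba(X)[:, 1] for clf in classifiers])
preds = probs.argmax(axis=1)
print("training accuracy:", (preds == y).mean())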

3. Python library to implement Logistic Regression

from sklearn.linear_model import LogisticRegression

classifier = LogisticRegression()    # defaults: L2 regularization, lbfgs solver
classifier.fit(X_train, y_train)     # learn the parameters θ from training data
y_pred = classifier.predict(X_test)  # class labels via the 0.5 threshold
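
For completeness, here is a minimal end-to-end sketch on a built-in scikit-learn dataset; the breast cancer dataset is my choice for illustration, echoing the malignant/benign example from earlier:

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Binary classification: malignant vs. benign tumors
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

classifier = LogisticRegression(max_iter=10000)  # extra iterations so lbfgs converges
classifier.fit(X_train, y_train)

y_pred = classifier.predict(X_test)              # hard 0/1 labels
y_prob = classifier.predict_proba(X_test)[:, 1]  # P(y=1 | x), i.e. hθ(x)
print("test accuracy:", accuracy_score(y_test, y_pred))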

4. References

  1. https://www.youtube.com/watch?v=vaQxdBEcBzU&list=PLZoTAELRMXVPjaAzURB77Kz0YXxj65tYz&index=3
  2. https://machinelearningmastery.com/one-vs-rest-and-one-vs-one-for-multi-class-classification/
