Using Logistic Regression for Image Classification


Prerequisites:

  • Basic Python programming skills
  • Basic knowledge of logistic regression
  • The NumPy Python library
  • The Pandas Python library
  • The Matplotlib Python library
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt



df_x = pd.read_excel('dataset.xlsx', 'X', header=None)
df_y = pd.read_excel('dataset.xlsx', 'Y', header=None)

# Binarize the labels: keep 1 as 1, map every other digit to 0
y = df_y[0]
for i in range(len(y)):
    if y[i] != 1:
        y[i] = 0
y = pd.DataFrame(y)

# Split into training and test sets, transposing so each column is one sample
x_train = np.array(df_x.iloc[0:4000].T)
x_test = np.array(df_x.iloc[4000:].T)
y_train = np.array(y.iloc[0:4000].T)
y_test = np.array(y.iloc[4000:].T)
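Since dataset.xlsx is not included here, the slicing and transposing logic can be sanity-checked on synthetic data. The shapes below (5,000 samples of 400 pixels each) are an assumption for illustration, not the verified layout of the file:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the two Excel sheets: 5,000 samples, 400 features
rng = np.random.default_rng(0)
df_x = pd.DataFrame(rng.random((5000, 400)))
df_y = pd.DataFrame(rng.integers(0, 10, size=(5000, 1)))

# Binarize the labels: 1 stays 1, every other digit becomes 0
# (a vectorized equivalent of the loop above)
y = (df_y[0] == 1).astype(int).to_frame()

# Same split and transpose: features end up as (n_features, n_samples)
x_train = np.array(df_x.iloc[0:4000].T)   # (400, 4000)
x_test = np.array(df_x.iloc[4000:].T)     # (400, 1000)
y_train = np.array(y.iloc[0:4000].T)      # (1, 4000)
y_test = np.array(y.iloc[4000:].T)        # (1, 1000)

print(x_train.shape, x_test.shape, y_train.shape, y_test.shape)
```

Each column being one sample is what lets the later functions use `X.shape[1]` as the number of training examples.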


Formula 1: z = w^T x + b
Formula 2: sigmoid(z) = 1 / (1 + e^(-z))
  • z is the output variable
  • x is the input variable
  • w and b are initialized as zeros and are updated while training the model.
Formula 3: a = sigmoid(w^T x + b), the predicted probability that the label is 1
def sigmoid(z):
    s = 1 / (1 + np.exp(-z))
    return s

def initialize_with_zeros(dim):
    w = np.zeros(shape=(dim, 1))
    b = 0
    return w, b
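A quick check of the two helpers above (restated so the snippet is self-contained): sigmoid(0) should be exactly 0.5, the midpoint of the curve, and initialize_with_zeros should return a column vector of the requested size:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def initialize_with_zeros(dim):
    return np.zeros(shape=(dim, 1)), 0

w, b = initialize_with_zeros(3)
print(w.shape, b)            # (3, 1) 0
print(sigmoid(0))            # 0.5, the decision boundary
print(sigmoid(np.array([-10.0, 10.0])))  # values squashed toward 0 and 1
```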
Formula 4: J = -(1/m) * sum(y * log(a) + (1 - y) * log(1 - a))
Formulas 5 and 6: dw = (1/m) * X (A - Y)^T and db = (1/m) * sum(A - Y)


def propagate(w, b, X, Y):
    # Find the number of training examples
    m = X.shape[1]
    # Calculate the predicted output
    A = sigmoid(np.dot(w.T, X) + b)
    # Calculate the cost function
    cost = -1/m * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))
    # Calculate the gradients
    dw = 1/m * np.dot(X, (A - Y).T)
    db = 1/m * np.sum(A - Y)

    grads = {"dw": dw, "db": db}
    return grads, cost
  • A: Predicted output
  • cost: Cost function
  • dw and db: Gradients
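One useful sanity check on propagate (restated below so the snippet runs on its own): when w and b are all zeros, every prediction is 0.5, so the cost must equal ln 2 ≈ 0.6931 regardless of the data, and dw must have the same shape as w. The toy data here is hypothetical:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def propagate(w, b, X, Y):
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    cost = -1/m * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))
    dw = 1/m * np.dot(X, (A - Y).T)
    db = 1/m * np.sum(A - Y)
    return {"dw": dw, "db": db}, cost

# Two features, three toy samples
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
Y = np.array([[1, 0, 1]])
w = np.zeros((2, 1))
b = 0

grads, cost = propagate(w, b, X, Y)
print(round(cost, 4))      # 0.6931, i.e. ln 2: every prediction starts at 0.5
print(grads["dw"].shape)   # (2, 1), same shape as w
```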
def optimize(w, b, X, Y, num_iterations, learning_rate):
    costs = []
    # The propagate function will run for a number of iterations
    for i in range(num_iterations):
        grads, cost = propagate(w, b, X, Y)
        dw = grads["dw"]
        db = grads["db"]

        # Update w and b by subtracting dw and db
        # times the learning rate from the previous w and b
        w = w - learning_rate * dw
        b = b - learning_rate * db
        # Record the cost function value every 100 iterations
        if i % 100 == 0:
            costs.append(cost)

    # The final updated parameters
    params = {"w": w, "b": b}
    # The final updated gradients
    grads = {"dw": dw, "db": db}

    return params, grads, costs
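To see the loop actually converge, here is a minimal run on hypothetical, linearly separable toy data (one feature, four samples). The recorded costs should fall from ln 2 toward zero as w and b are refined:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def propagate(w, b, X, Y):
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    cost = -1/m * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))
    return {"dw": 1/m * np.dot(X, (A - Y).T), "db": 1/m * np.sum(A - Y)}, cost

def optimize(w, b, X, Y, num_iterations, learning_rate):
    costs = []
    for i in range(num_iterations):
        grads, cost = propagate(w, b, X, Y)
        w = w - learning_rate * grads["dw"]
        b = b - learning_rate * grads["db"]
        if i % 100 == 0:
            costs.append(cost)
    return {"w": w, "b": b}, grads, costs

# Separable toy data: the label is 1 exactly when the feature is positive
X = np.array([[-2.0, -1.0, 1.0, 2.0]])
Y = np.array([[0, 0, 1, 1]])
params, grads, costs = optimize(np.zeros((1, 1)), 0, X, Y, 1000, 0.5)

print(len(costs))            # 10 snapshots (one every 100 iterations)
print(costs[0] > costs[-1])  # True: the cost falls as training progresses
```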
def predict(w, b, X):
    m = X.shape[1]
    w = w.reshape(X.shape[0], 1)
    # Initialize an array of zeros with the same number of columns as the input
    # These zeros will be replaced by the predicted output
    Y_prediction = np.zeros((1, m))

    # Calculate the predicted output using the sigmoid function
    # This will return values between 0 and 1
    A = sigmoid(np.dot(w.T, X) + b)
    # Iterate through A and predict 1 if the value of A
    # is greater than 0.5 and 0 otherwise
    for i in range(A.shape[1]):
        Y_prediction[:, i] = (A[:, i] > 0.5) * 1
    return Y_prediction
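The thresholding loop can also be written as a single vectorized comparison, which behaves identically. Below is a sketch with hand-picked w, b, and inputs so the expected labels are easy to verify by eye:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def predict(w, b, X):
    # Vectorized version of the loop: compare the whole row of
    # probabilities against 0.5 at once
    A = sigmoid(np.dot(w.T, X) + b)
    return (A > 0.5).astype(float)

w = np.array([[1.0]])
b = -1.0
X = np.array([[0.0, 1.0, 3.0]])  # z = x - 1, so probabilities are below/at/above 0.5
print(predict(w, b, X))          # [[0. 0. 1.]]
```

Note that a probability of exactly 0.5 maps to 0 in both versions, since the comparison is strict.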
def model(X_train, Y_train, X_test, Y_test, num_iterations, learning_rate):
    # Initialize w and b as zeros
    w, b = initialize_with_zeros(X_train.shape[0])
    # Fit the training data
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate)
    w = parameters["w"]
    b = parameters["b"]
    # Predict the output for both the test and training sets
    Y_prediction_test = predict(w, b, X_test)
    Y_prediction_train = predict(w, b, X_train)
    # Calculate the training and test set accuracy by comparing
    # the predicted output with the original output
    print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))
    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test,
         "Y_prediction_train": Y_prediction_train,
         "w": w,
         "b": b,
         "learning_rate": learning_rate,
         "num_iterations": num_iterations}
    return d
ni = 500 # num iterations
lr = 0.005 # learning rate
d = model(x_train, y_train, x_test, y_test, ni, lr)
train accuracy: XX%
test accuracy: XX%
The returned dictionary contains:

  • Costs
  • Final parameters
  • Predicted outputs
  • Learning rate
  • Number of iterations used
# Plot how the cost function changed with each update of w and b
plt.scatter(x=range(len(d['costs'])), y=d['costs'], color='black')
plt.title('Scatter Plot of Cost Functions', fontsize=18)
plt.ylabel('Costs', fontsize=12)
plt.show()
Scatter plots of the cost function with different learning rates and iteration counts
Test and train accuracy values for different learning rates and iteration counts


  • With each iteration the cost function decreased, as it should: the parameters w and b kept moving toward a better fit.
  • It is worth noting that increasing the learning rate gives a better training accuracy but a lower test accuracy. For this particular case I recommend 0.015, which kept both percentages more even.
  • To make the model recognize a digit other than 1, go to the line that converts every non-1 label to zero and use the digit of your preference instead of 1.
  • Reducing the number of rows used for training will underfit the model, and increasing it can overfit it. You can experiment and check how it behaves.
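The digit-selection step from the points above can be written as a one-line vectorized comparison instead of a loop. The label column below is a hypothetical stand-in for the 'Y' sheet of dataset.xlsx:

```python
import pandas as pd

# Hypothetical label column like the one read from the 'Y' sheet
df_y = pd.DataFrame({0: [1, 7, 3, 7, 1, 7]})

target_digit = 7  # change this to train a detector for any other digit
y = (df_y[0] == target_digit).astype(int)
print(y.tolist())  # [0, 1, 0, 1, 0, 1]
```

The rest of the pipeline is unchanged: the model simply learns "is this digit the target or not".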





Juan Arturo Cruz Cardona
