💢Number Plate Recognition using OpenCV Python

📍Steps involved in License Plate Recognition

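1. License Plate Detection: The first step is to detect the license plate in the image of the car. Here this is done with a Haar cascade classifier in OpenCV.

2. Character Segmentation: Once we have detected the license plate, we crop it out and save it as a new image, then split that image into individual characters. Again, this can be done easily using OpenCV.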

3. Character Recognition: The new image obtained in the previous step is sure to have some characters (digits/letters) written on it, so we can perform OCR (Optical Character Recognition) on it to read the number.


  • OpenCV: OpenCV is a library of programming functions mainly aimed at real-time computer vision. It is also open-source, fun to work with, and my personal favorite. I have used version 4.1.0 for this project.
  • Python: a.k.a. the Swiss Army knife of coding. I have used version 3.6.7 here.
  • IDE: I’ll be using Jupyter here.
  • Haar cascade: A machine learning object detection algorithm used to identify objects in an image or video, based on the concept of features proposed by Paul Viola and Michael Jones in their 2001 paper “Rapid Object Detection using a Boosted Cascade of Simple Features”.
  • Keras: Easy to use and widely supported, Keras makes deep learning about as simple as deep learning can be.
  • Scikit-Learn: It is a free software machine learning library for the Python programming language.

Step 1

# Installing OpenCV (PyPI releases carry a fourth version component, hence the wildcard)
>pip install "opencv-python==4.1.0.*"
# Installing Keras
>pip install keras
# Installing Jupyter
>pip install jupyter
# Installing Scikit-Learn
>pip install scikit-learn

Step 2

We’ll start by launching Jupyter Notebook and then importing the necessary libraries, in our case OpenCV, Keras and scikit-learn.

Let’s import the libraries:

# importing OpenCV
>import cv2
# importing numpy
>import numpy as np
# importing pandas to read the CSV file containing our data
>import pandas as pd
# importing keras and sub-libraries
>from keras.models import Sequential
>from keras.layers import Dense
>from keras.layers import Dropout
>from keras.layers import Flatten, MaxPool2D
# Conv2D and MaxPooling2D are importable from keras.layers directly
>from keras.layers import Conv2D
>from keras.layers import MaxPooling2D
>from keras import backend as K
>from keras.utils import np_utils
>from sklearn.model_selection import train_test_split

Step 3

Let’s start simple by importing a sample image of a car with a license plate and defining some functions:

def extract_plate(img):  # detects the number plate and crops it out of the image
    plate_img = img.copy()
    plate = None  # stays None if no plate is detected

    # Loads the data required for detecting the license plates from the cascade classifier.
    plate_cascade = cv2.CascadeClassifier('./indian_license_plate.xml')
    # Detects number plates and returns the coordinates and dimensions of each detected plate's bounding rectangle.
    plate_rect = plate_cascade.detectMultiScale(plate_img, scaleFactor=1.3, minNeighbors=7)
    for (x, y, w, h) in plate_rect:
        # parameter tuning: margins trimmed from the detected rectangle
        a, b = (int(0.02 * img.shape[0]), int(0.025 * img.shape[1]))
        plate = plate_img[y+a:y+h-a, x+b:x+w-b, :]
        # finally, mark the detected plate by drawing a rectangle around it
        cv2.rectangle(plate_img, (x, y), (x+w, y+h), (51, 51, 255), 3)

    return plate_img, plate  # the annotated image and the cropped plate

The above function takes an image as input and applies a Haar cascade that is pre-trained to detect Indian license plates. The scaleFactor parameter specifies how much the image is scaled down at each step of the detector’s multi-scale search, so that plates of different sizes can be found; values closer to 1 search more scales and are slower but more thorough. minNeighbors is a parameter to reduce false positives: it sets how many overlapping candidate detections are needed to accept a plate, so if this value is low, the algorithm is more prone to giving misrecognized outputs.
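To get a feel for what scaleFactor does, here is a rough sketch of the multi-scale search. It only counts the scales at which a fixed detection window still fits into the shrunken image. The image size (600) and window size (40) are made-up numbers for illustration, and the loop is a simplification of what OpenCV does internally, not its actual implementation.

```python
def pyramid_scales(image_size, window_size, scale_factor):
    """List the shrink factors at which a fixed-size window still fits the image."""
    scales = []
    scale = 1.0
    # Keep shrinking the image until the detection window no longer fits.
    while image_size / scale >= window_size:
        scales.append(round(scale, 3))
        scale *= scale_factor
    return scales


# Hypothetical sizes: a 600-pixel-wide image and a 40-pixel-wide detection window.
coarse = pyramid_scales(image_size=600, window_size=40, scale_factor=1.3)
fine = pyramid_scales(image_size=600, window_size=40, scale_factor=1.05)

print("scales searched with scaleFactor=1.3 :", len(coarse))
print("scales searched with scaleFactor=1.05:", len(fine))
```

A smaller scaleFactor means many more scales to scan, which is why 1.3 is a fast but fairly coarse setting.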

Step 4

Now let’s process this image further to make the character extraction process easy. We’ll start by defining some more functions for that.

# Find characters in the resulting images
def segment_characters(image):
    # Preprocess cropped license plate image
    img = cv2.resize(image, (333, 75))  # (width, height)
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # With THRESH_OTSU the threshold is chosen automatically, so the value 200 is ignored
    _, img_binary = cv2.threshold(img_gray, 200, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Note: for a full 3x3 structuring element, pass np.ones((3, 3), np.uint8) as the kernel
    img_erode = cv2.erode(img_binary, (3, 3))
    img_dilate = cv2.dilate(img_erode, (3, 3))
    # shape is (rows, cols), so LP_WIDTH holds 75 and LP_HEIGHT holds 333;
    # the size limits below were tuned with these values
    LP_WIDTH = img_dilate.shape[0]
    LP_HEIGHT = img_dilate.shape[1]
    # Make borders white
    img_dilate[0:3, :] = 255
    img_dilate[:, 0:3] = 255
    img_dilate[72:75, :] = 255
    img_dilate[:, 330:333] = 255
    # Estimated size limits of character contours: [min width, max width, min height, max height]
    dimensions = [LP_WIDTH/6, LP_WIDTH/2, LP_HEIGHT/10, 2*LP_HEIGHT/3]
    # Get contours within cropped license plate
    char_list = find_contours(dimensions, img_dilate)
    return char_list

The above function takes the image as input and performs the following operations on it:

  • Resizes it to a fixed size (333x75) so that all characters appear distinct and clear.
  • Converts the color image to grayscale, i.e. instead of 3 channels (BGR), the image has a single 8-bit channel with values ranging from 0–255, where 0 corresponds to black and 255 corresponds to white. We do this to prepare the image for the next step.
  • Thresholds the grayscale image so that every pixel becomes either 0 or 255. The image is now in binary form and ready for the next step, eroding.
    Eroding is a simple process used for removing unwanted pixels from the object’s boundary, meaning pixels that should have a value of 0 but have a value of 1. The eroded image is then dilated, the complementary operation, which grows back the regions that erosion shrank.
  • Makes the borders of the image white. This removes any out-of-frame pixels that may be present.
  • At this point we have reduced our image to a processed binary image, and we are ready to pass it on for character extraction.
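The thresholding step is the least obvious one, so here is a small NumPy sketch of the idea behind Otsu’s method: try every threshold and keep the one that best separates dark pixels from bright ones. The synthetic “plate” and its pixel values are invented for illustration, and this is a plain re-implementation of the idea, not OpenCV’s own code.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the threshold that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0 = prob[:t].sum()  # share of pixels darker than t
        w1 = 1.0 - w0        # share of pixels at or above t
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (levels[:t] * prob[:t]).sum() / w0
        mu1 = (levels[t:] * prob[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Synthetic plate: two dark "characters" (around 40) on a bright background (around 210).
rng = np.random.default_rng(0)
plate = np.full((75, 333), 210, dtype=np.uint8)
plate[20:55, 30:60] = 40
plate[20:55, 90:120] = 40
noise = rng.integers(-10, 11, size=plate.shape)
noisy = np.clip(plate.astype(int) + noise, 0, 255).astype(np.uint8)

t = otsu_threshold(noisy)
binary = np.where(noisy >= t, 255, 0).astype(np.uint8)
print("chosen threshold:", t)
```

The chosen threshold lands in the gap between the dark and the bright pixel values, which is why no hand-picked value is needed.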

Step 5

import numpy as np
import cv2

# Match contours to license plate or character template
def find_contours(dimensions, img):
    # Find all contours in the image (OpenCV 4.x returns contours and hierarchy)
    cntrs, _ = cv2.findContours(img.copy(), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    # Retrieve potential dimensions
    lower_width = dimensions[0]
    upper_width = dimensions[1]
    lower_height = dimensions[2]
    upper_height = dimensions[3]

    # Check largest 5 or 15 contours for license plate or character respectively
    cntrs = sorted(cntrs, key=cv2.contourArea, reverse=True)[:15]

    x_cntr_list = []
    target_contours = []
    img_res = []
    for cntr in cntrs:
        # returns the coordinates of the rectangle enclosing the contour
        intX, intY, intWidth, intHeight = cv2.boundingRect(cntr)

        # checking the dimensions of the contour to filter out the characters by contour's size
        if intWidth > lower_width and intWidth < upper_width and intHeight > lower_height and intHeight < upper_height:
            x_cntr_list.append(intX)  # stores the x coordinate of the character's contour, to be used later for ordering the contours
            char_copy = np.zeros((44, 24))
            # extracting each character using the enclosing rectangle's coordinates
            char = img[intY:intY+intHeight, intX:intX+intWidth]
            char = cv2.resize(char, (20, 40))
            # Make result formatted for classification: invert colors
            char = cv2.subtract(255, char)
            # Resize the image to 24x44 with black border
            char_copy[2:42, 2:22] = char
            char_copy[0:2, :] = 0
            char_copy[:, 0:2] = 0
            char_copy[42:44, :] = 0
            char_copy[:, 22:24] = 0
            img_res.append(char_copy)  # list that stores the character's binary image (unsorted)

    # Return characters in ascending order of x-coordinate (left-most character first)
    # indices of the characters, sorted by x-coordinate
    indices = sorted(range(len(x_cntr_list)), key=lambda k: x_cntr_list[k])
    img_res_copy = []
    for idx in indices:
        img_res_copy.append(img_res[idx])  # stores character images in left-to-right order
    img_res = np.array(img_res_copy)

    return img_res

After step 4 we should have a clean binary image to work on. In this step, we will be applying some more image processing to extract the individual characters from the license plate.
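The two key ideas in find_contours, filtering boxes by size and reading them left to right, can be tried out without any image at all. In this sketch the size limits are the ones segment_characters computes for a 333x75 plate, while the bounding boxes are made-up examples.

```python
def filter_and_sort_boxes(boxes, dimensions):
    """Keep boxes whose width/height fall inside the limits, ordered left to right."""
    lower_w, upper_w, lower_h, upper_h = dimensions
    kept = [
        (x, y, w, h)
        for (x, y, w, h) in boxes
        if lower_w < w < upper_w and lower_h < h < upper_h
    ]
    return sorted(kept, key=lambda box: box[0])


# Limits as computed in segment_characters: [75/6, 75/2, 333/10, 2*333/3]
dimensions = [75 / 6, 75 / 2, 333 / 10, 2 * 333 / 3]

# Hypothetical bounding boxes (x, y, width, height), in arbitrary order.
boxes = [
    (200, 15, 25, 45),  # character
    (0, 0, 333, 75),    # whole plate: too wide
    (40, 15, 25, 45),   # character
    (150, 60, 5, 5),    # speck of noise: too small
    (120, 15, 25, 45),  # character
]

characters = filter_and_sort_boxes(boxes, dimensions)
print([box[0] for box in characters])
```

Only the three character-sized boxes survive, and they come back ordered by their x coordinate, which is the order in which the plate is read.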

Step 6

  • The data is all clean and ready, so now it’s time to create a Neural Network that will be intelligent enough to recognize the characters after training.
  • For modeling, we will be using a Convolutional Neural Network with 3 layers.
# create model
>model = Sequential()
>model.add(Conv2D(filters=32, kernel_size=(5,5), input_shape=(28, 28, 1), activation='relu'))
>model.add(MaxPooling2D(pool_size=(2, 2)))
# the dropout rate is not stated in the text; 0.4 is an assumed value
>model.add(Dropout(rate=0.4))
>model.add(Flatten())
>model.add(Dense(units=128, activation='relu'))
>model.add(Dense(units=36, activation='softmax'))
  • To keep the model simple, we’ll start by creating a sequential object.
  • The first layer will be a convolutional layer with 32 output filters, a convolution window of size (5,5), and ‘Relu’ as activation function.
  • Next, we’ll be adding a max-pooling layer with a window size of (2,2).
    Max pooling is a sample-based discretization process. The objective is to down-sample an input representation (image, hidden-layer output matrix, etc.), reducing its dimensionality and allowing for assumptions to be made about features contained in the sub-regions binned.
  • Now, we will add a dropout layer to take care of overfitting.
    Dropout is a regularization technique used to prevent neural networks from overfitting: randomly selected neurons are ignored (“dropped out”) during training.
  • Now it’s time to flatten the node data, so we add a flatten layer for that. The flatten layer takes the output of the previous layer and represents it in a single dimension.
  • Finally, we will add 2 dense layers: one with an output dimensionality of 128 and activation function ‘relu’, and the other, our final layer, with 36 outputs for categorizing the 26 letters (A-Z) + 10 digits (0–9) and activation function ‘softmax’.
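To see how the layers described above fit together, it helps to work out the shapes and parameter counts by hand. The sketch below is plain arithmetic and assumes the Keras defaults (no padding, stride 1 for the convolution, pooling stride equal to the pool size) and a flatten layer before the dense layers, as described in the list.

```python
def conv2d_output(size, kernel, stride=1):
    """Spatial output size of a convolution without padding."""
    return (size - kernel) // stride + 1


filters = 32
conv_side = conv2d_output(28, 5)                 # 28x28 input, 5x5 window
pool_side = conv_side // 2                       # 2x2 max pooling halves each side
flat_units = pool_side * pool_side * filters     # what the flatten layer outputs

conv_params = (5 * 5 * 1 + 1) * filters          # weights + bias per filter
dense1_params = (flat_units + 1) * 128           # weights + bias per unit
dense2_params = (128 + 1) * 36                   # one output per character class
total_params = conv_params + dense1_params + dense2_params

print("after conv   :", (conv_side, conv_side, filters))
print("after pooling:", (pool_side, pool_side, filters))
print("after flatten:", flat_units)
print("trainable parameters:", total_params)
```

Almost all parameters sit in the first dense layer, which is one reason dropout is placed right before it.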

Step 7

  • The data we will be using contains images of letters (A-Z) and digits (0–9) of size 28x28. The data is also balanced, so we won’t have to do any kind of data tuning here.
  • It’s time to train our model now!
    We will use ‘categorical_crossentropy’ as the loss function, ‘Adam’ as the optimizer and ‘accuracy’ as our evaluation metric.
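import datetime
import tensorflow as tf

# loss, optimizer and metric as described above (default Adam settings assumed)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

class stop_training_callback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        # the key is 'val_accuracy' in recent Keras versions and 'val_acc' in older ones
        val_acc = logs.get('val_accuracy', logs.get('val_acc'))
        if val_acc is not None and val_acc > 0.992:
            self.model.stop_training = True

log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)

batch_size = 1
callbacks = [tensorboard_callback, stop_training_callback()]
# train_generator and validation_generator are the data generators for the
# training and validation images (their definition is not shown here).
# Assumes a Keras version whose fit() accepts generators; older versions use fit_generator().
model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // batch_size,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // batch_size,
    epochs=80, callbacks=callbacks)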
  • After training for 23 epochs, the model achieved an accuracy of 99.54%.
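If the loss and the metric used in this step feel abstract, they can be written out in a few lines of NumPy. This is a simplified sketch of categorical cross-entropy and accuracy for our 36 classes, not Keras’s implementation, and the labels and predictions are invented for illustration.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean negative log-probability assigned to the correct class."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))


def accuracy(y_true, y_pred):
    """Share of samples whose most probable class is the correct one."""
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_pred, axis=1)))


num_classes = 36
labels = np.array([0, 10, 35])          # three hypothetical samples
y_true = np.eye(num_classes)[labels]    # one-hot encoded labels

# An untrained model that spreads probability evenly over all classes.
uniform = np.full((3, num_classes), 1.0 / num_classes)

# A well-trained model that puts 90% on the correct class.
confident = np.full((3, num_classes), 0.1 / 35)
confident[np.arange(3), labels] = 0.9

loss_uniform = categorical_crossentropy(y_true, uniform)
loss_confident = categorical_crossentropy(y_true, confident)
print("loss (uniform)  :", loss_uniform)
print("loss (confident):", loss_confident)
print("accuracy (confident):", accuracy(y_true, confident))
```

The loss drops as the model grows more confident about the correct character, while accuracy only checks whether the top guess is right.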

Step 8

Finally, it’s time to test our model. Remember the binary images of the extracted characters from the number plate? Let’s feed the images to our model!
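Here is a rough sketch of what that last step could look like. It makes several assumptions that are not confirmed anywhere above: that the classes are ordered as digits first and then letters, that the 24x44 character images have to be resized to the 28x28 input of the network, and that a simple nearest-neighbor resize in NumPy is good enough (cv2.resize would be the more usual choice). The helper names are hypothetical.

```python
import numpy as np

# Assumed label order: digits first, then uppercase letters.
CHARACTERS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def resize_nearest(img, out_h, out_w):
    """Nearest-neighbor resize of a 2D image using index arithmetic."""
    rows = np.arange(out_h) * img.shape[0] // out_h
    cols = np.arange(out_w) * img.shape[1] // out_w
    return img[rows][:, cols]


def prepare_character(char_img):
    """Turn one 44x24 character image into a (1, 28, 28, 1) batch for the network."""
    resized = resize_nearest(char_img, 28, 28)
    return resized.reshape(1, 28, 28, 1).astype(np.float32)


def decode_plate(probabilities):
    """Map each row of class probabilities to its most likely character."""
    return "".join(CHARACTERS[i] for i in np.argmax(probabilities, axis=1))


# With a trained model, the probabilities would come from something like:
#   probabilities = np.vstack([model.predict(prepare_character(c)) for c in char_list])
# Here we fake the model output for three characters to show the decoding.
fake_probabilities = np.eye(len(CHARACTERS))[[13, 21, 8]]
batch = prepare_character(np.zeros((44, 24)))
plate_text = decode_plate(fake_probabilities)
print(batch.shape, plate_text)
```

Whether the pixel values need to be scaled (for example to the 0–1 range) depends on how the training images were prepared, so treat that part as something to check against your own training setup.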

Thanks….!!! for reaching here…😊


