Preprocessor package

Submodules

Preprocessor.PreprocessImages module

This module is responsible for preprocessing images to make them more suitable for Tesseract-OCR.

class Preprocessor.PreprocessImages.Preprocess

Bases: object

This class preprocesses the image by applying filters and manipulations.

static gray_scale(image)

Converts the image to grayscale.

Parameters:image – The image to be preprocessed.
Returns:Grayscaled image.
static gaussian_blur(image)

Applies gaussian blur on the image.

Parameters:image – The image to be preprocessed.
Returns:Gaussian blurred image.
static filter2d(image)

Applies 2D convolution on the image.

Parameters:image – The image to be preprocessed.
Returns:2D-convolved image.
static binary_threshold(image)

Applies a binary threshold combined with Otsu’s binarization on the image.

Parameters:image – The image to be preprocessed.
Returns:Binary thresholded image (i.e. black and white).
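
To make the idea behind Otsu's binarization concrete, here is a minimal, library-free sketch. It is not the module's implementation (this page does not show which imaging library the module uses). The function names otsu_threshold and binarize, and the use of a flat list of 8-bit pixel values instead of a real image, are assumptions made for illustration.

```python
def otsu_threshold(pixels):
    """Return the grey level (0-255) that maximises between-class variance.

    `pixels` is a flat list of 8-bit grey values (an assumption for this sketch).
    """
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, 0.0
    weight_bg, sum_bg = 0, 0
    for level in range(256):
        weight_bg += histogram[level]          # pixels at or below this level
        if weight_bg == 0:
            continue                           # no background pixels yet
        weight_fg = total - weight_bg          # pixels above this level
        if weight_fg == 0:
            break                              # no foreground pixels left
        sum_bg += level * histogram[level]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


def binarize(pixels):
    """Map every pixel to 0 (black) or 255 (white) using the Otsu threshold."""
    threshold = otsu_threshold(pixels)
    return [255 if value > threshold else 0 for value in pixels]


sample = [10, 12, 11, 200, 205, 210]   # three dark and three bright pixels
black_and_white = binarize(sample)
```

The threshold is chosen from the image's own histogram, so no fixed cut-off value has to be supplied by the caller.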
static binary_adaptive_threshold(image)

Applies an adaptive binary threshold on the image.

Parameters:image – The image to be preprocessed.
Returns:Binary thresholded image (i.e. black and white).
static normalisation(image)

Applies a histogram-based normalisation to the image to increase contrast.

Parameters:image – The image to be preprocessed.
Returns:Normalised image.
static white_boarder(image)

Adds a white border to the image, sized to the height of the given image.

Parameters:image – The image to be preprocessed.
Returns:Image with white border.
static median_blur(image)

Applies a median blur on the image.

Parameters:image – The image to be preprocessed.
Returns:Median blurred image.
static erosion(image)

Applies erosion on the image.

Parameters:image – The image to be preprocessed.
Returns:Eroded image.
static perspective_transform(image)

Corrects for deformations created by the camera.

Parameters:image – The (color) image to be preprocessed.
Returns:Corrected image.
static preprocess_image(pp_sequence, image)

This function runs the preprocessing based on a supplied selection and sequence.

The getattr() function is used to look up and call a method by its name, given as a string. For example, the string ‘erosion’ can be used to call the method erosion() in this module.

Parameters:
  • pp_sequence – The selection and sequence of the preprocessor operations
  • image – The image to be preprocessed.
Returns:

Preprocessed image.
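
The getattr()-based dispatch described above can be sketched as follows. This is a simplified stand-in, not the module's source. It assumes pp_sequence is a list of method-name strings, and the placeholder methods only record their own name so the call order is visible without any image library.

```python
class Preprocess:
    """Hypothetical stand-in for Preprocessor.PreprocessImages.Preprocess."""

    @staticmethod
    def gray_scale(image):
        # Placeholder: the real method would return a grayscale image.
        return image + ["gray_scale"]

    @staticmethod
    def erosion(image):
        # Placeholder: the real method would return an eroded image.
        return image + ["erosion"]

    @staticmethod
    def preprocess_image(pp_sequence, image):
        # Look up each operation by name and apply it in the given order.
        for step in pp_sequence:
            operation = getattr(Preprocess, step)
            image = operation(image)
        return image


# The "image" here is an empty list that collects the names of applied steps.
result = Preprocess.preprocess_image(["gray_scale", "erosion"], [])
```

In this sketch, a name that does not match any method makes getattr() raise AttributeError. Whether the real module validates pp_sequence beforehand is not stated on this page.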

Module contents