Preprocessor

Current functions

The preprocessing applies several OpenCV functions, which are explained at http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/

Currently, the following functions are implemented in the RekenRobot:

  • grey-scale
  • gaussian_blur: low-pass filtering, to remove noise
  • filter2d: 2D convolution (increases amount of pixels?), a kind of low-pass filtering, to remove noise
  • binary_threshold: converts the image to a binary image
  • normalization: increase contrast, by stretching the histogram to the whole range.
  • white_boarder: puts a white border around the image (does not seem to be useful…)
  • median_blur: to remove salt-and-pepper noise
  • erosion: erodes the white regions of the image. As the numbers are black on a white background, the erosion function actually dilates the numbers.

All of these functions can be used, although the default preprocessor does not apply the normalization and white border.

Order of preprocessing functions

Current default order:

  1. Grayscale
  2. Gaussian blur
  3. 2d convolution
  4. Binary threshold
  5. Median blur
  6. Erosion

  • Grayscale should be performed before thresholding.
  • Gaussian blur, 2d convolution and median blur can be performed both before and after grayscale.
  • Gaussian blur and 2d convolution can be done after thresholding, but might introduce new grey values.
  • Erosion should be performed on a binary image.
  • Normalization should be performed after grey-scaling, but before thresholding (currently not in default preprocessing).

Possible improvements

Roughly sorted from most to least promising.

Added functionalities

We implemented both the adaptive threshold (binary_adaptive_threshold) and the image deformation (perspective_transform) as preprocessor steps. The results depend strongly on the input image:

../../_images/adapt_thr.png

Original “real image” and after different preprocessing options

../../_images/adapt_thr2.png

Original “good image” and after different preprocessing options (scaled at 25%)

The image deformation does exactly the opposite of the deformations applied to the test images. It therefore works perfectly on the test images, but worse on real images. When tested with the webcam, it does not appear to add much value.

Using a test set of 200 images (4 digits, 0 and 8 excluded, with blurring, noise and deformation, on a clean background), we compared the default preprocessing (gray_scale, gaussian_blur, filter2d, binary_threshold, median_blur, erosion) with the new preprocessing (gray_scale, perspective_transform, filter2d, binary_adaptive_threshold). The success rate of the default preprocessing was 0%, and that of the improved preprocessing 74%.

When tested with actual webcam images, the “improved” preprocessing (gray_scale, filter2d, binary_adaptive_threshold and median_blur) gives much better results than the default preprocessing. However, shadows at the border sometimes appear black in the image, which causes the OCR module to recognize additional numbers.

../../_images/impr_pp.png

Real calculator image with default and improved preprocessing.