Preprocessor
Current functions
The preprocessing applies several OpenCV functions, which are explained at http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/
Currently, the following functions are implemented in the RekenRobot:
- gray_scale: converts the image to grayscale
- gaussian_blur: low-pass filtering, to remove noise
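- filter2d: 2D convolution with a kernel, a kind of low-pass filtering, to remove noise. OpenCV's filter2D returns an image of the same size as its input, so it does not increase the number of pixels.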
- binary_threshold (convert to binary image)
- normalization: increase contrast, by stretching the histogram to the whole range.
- white_boarder: puts a white border around the image (does not seem to be useful…)
- median_blur: to remove salt-and-pepper noise
- erosion: erodes the white regions of the image. As the numbers are black on a white background, the erosion function actually dilates the numbers.
All of these functions can be used, although the default preprocessor does not apply the normalization and white border.
Order of preprocessing functions
Current default order:
1. Grayscale
2. Gaussian blur
3. 2D convolution
4. Binary threshold
5. Median blur
6. Erosion
- Grayscale should be performed before thresholding.
- Gaussian blur, 2d convolution and median blur can be performed both before and after grayscale.
- Gaussian blur and 2D convolution can be done after thresholding, but might introduce new gray values.
- Erosion should be performed on a binary image.
- Normalization should be performed after the grayscale conversion, but before thresholding (currently not in the default preprocessing).
Possible improvements
Roughly sorted from most to least promising.
- Binary threshold: At this moment, a global threshold is applied. This threshold can be made adaptive, in order to better handle varying contrasts. (see http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_thresholding/py_thresholding.html#thresholding)
- Improve order of functions, and check if all functions improve the preprocessing.
- Perspective transformation: might be added to correct for the perspective (i.e. the deformation created by the camera).
- Normalization: at this moment, the normalization is not working properly, although it might improve the preprocessing. Looking further into the normalization might therefore be useful. Adaptive histogram equalization might be used (see http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_histograms/py_histogram_equalization/py_histogram_equalization.html#py-histogram-equalization), but I’m not sure if that would improve the normalization in our case.
- Gaussian filtering might be replaced by bilateral filtering. Bilateral filtering also uses a Gaussian filter, but additionally takes intensity differences into account. This should keep edges sharper than Gaussian filtering, but it might be slower. (see http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_filtering/py_filtering.html#filtering)
- High-pass filtering: as the contrast between the numbers and the background is large, high-pass filtering might be used to detect the numbers as well (using Laplacian derivatives, http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_gradients/py_gradients.html#gradients)
- Erosion: Erosion and dilation can be applied to remove noise as well:
  - first erode, then dilate will remove noise in the numbers (black area)
  - first dilate, then erode will remove noise in the background (white area)
Added functionalities
We implemented both the adaptive threshold (binary_adaptive_threshold) and the image deformation (perspective_transform) as preprocessor steps. The results are highly dependent on the input image:
The image deformation does exactly the opposite of the deformations applied to the test images. Therefore, it works perfectly on the test images, but worse on real images. When tested with the webcam, it does not appear to add much value.
Using a test set of 200 images (4 digits, 0 and 8 excluded, including blurring, noise and deformation, clean background), we compared the default preprocessing (gray_scale, gaussian_blur, filter2d, binary_threshold, median_blur, erosion) with the new preprocessing (gray_scale, perspective_transform, filter2d, binary_adaptive_threshold). The success rate of the default preprocessing was 0%, and that of the improved preprocessing was 74%.
When testing with actual webcam images, the “improved” preprocessing (gray_scale, filter2d, binary_adaptive_threshold and median_blur) shows much better results than the default preprocessing. However, the border shadows sometimes appear black in the image, causing the OCR module to recognize additional numbers.