Sudoku recognizer

Alejandro Hernández Cordero (GitHub ID: ahcorde)

This notebook details how to recognize a sudoku puzzle from a picture. We'll make use of simple image processing algorithms (edge detection, thresholds... ) and character recognition using the K-Nearest Neighbors (KNN) algorithm. It is a very simple but effective algorithm for solving multi-class classification problems. This puzzle matrix is a 9x9 array of known numbers 1-9 or 0s where the number is unknown

This ipython notebook is divided into two parts. The first part is related to computer vision and the second part is related to machine learning. To complete this task it is necessary to use a computer vision library, in this case we are going to use OpenCV library.


The first thing we need to do is to identify the puzzle. We have some challenges: The lines are not perfect, the black grid lines have similar color as a lot of elements in the image or the small squares are difficult to extract.

To simplify the problem we assumed: the puzzle is the biggest square in the image and the puzzle will be orientated reasonably correctly.

In the next picture you can see a sudoku puzzle in a newspaper. To load the image it's used imread.

In [1]:
%matplotlib inline
import pylab as pl
#import Opencv library
    import cv2
except ImportError:
    print "You must have OpenCV installed"
#load image
image_sudoku_original = cv2.imread('../../../data/ocr/sudoku.jpg')
#Show Images

Step 1: Segmenting the Sudoku

Once we have read the image we have to detect the lines. For this task, It used an adaptativeThreshold to extract the edge of the image (for each pixel in the image take the average value of the surrounding area). This function accepts a gray scale image, with the function cvtColor we change the color space (from RGB to gray scale). In this picture it's possible to see letters, lines or numbers. Light pixels are the paper and dark pixels are the ink.

The adaptativeThreshold function takes as first argument the input image, the second argument returns the binary image, the third argument is the non-zero value assigned to the pixels for which the condition is satisfied. The fourth is the adaptive thresholding algorithm, the fifth argument is the threshold type, the next argument is size of the pixel neighborhood and finally the last argument is a constant that is subtracted from the mean or weight mean.

In [2]:
#gray image
image_sudoku_gray = cv2.cvtColor(image_sudoku_original,cv2.COLOR_BGR2GRAY)
#adaptive threshold
thresh = cv2.adaptiveThreshold(image_sudoku_gray,255,1,1,11,15)

#show image
_=pl.imshow(thresh, cmap=pl.gray())

We have to find the connected contours in the image (using the function findContours). These function returns a vector with the corners of the contours. We'll find the sukodu in this contours.

The assumption is that the sudoku puzzle has 4 sides and it's convex. Checking the number of the contour is equal to four and using the funcion of OpenCV isContourConvex to check if the square is convex. We obtain the possible candidates.

We need to filter the candidates using the assumption: the sudoku puzzle is the biggest square in the image. We calculate the area of the possible contours (contourArea). The biggest square is the sudoku puzzle.

In [3]:
#find the countours 
contours0,hierarchy = cv2.findContours( thresh,
#size of the image (height, width)
h, w = image_sudoku_original.shape[:2]

#copy the original image to show the posible candidate
image_sudoku_candidates = image_sudoku_original.copy()

#biggest rectangle
size_rectangle_max = 0; 
for i in range(len(contours0)):
    #aproximate countours to polygons
    approximation = cv2.approxPolyDP(contours0[i], 4, True)
    #has the polygon 4 sides?
    if(not (len (approximation)==4)):
    #is the polygon convex ?
    if(not cv2.isContourConvex(approximation) ):
    #area of the polygon
    size_rectangle = cv2.contourArea(approximation)
    #store the biggest
    if size_rectangle> size_rectangle_max:
        size_rectangle_max = size_rectangle 
        big_rectangle = approximation

In the image below it's possible to see the 4 sides of the sudoku puzzle in red

In [4]:
#show the best candidate
approximation = big_rectangle
for i in range(len(approximation)):
             (big_rectangle[(i%4)][0][0], big_rectangle[(i%4)][0][1]), 
             (big_rectangle[((i+1)%4)][0][0], big_rectangle[((i+1)%4)][0][1]),
             (255, 0, 0), 2)
#show image
_=pl.imshow(image_sudoku_candidates, cmap=pl.gray()) 

Now we have the sudoku puzzle segmented. We have got the corner points of the puzzle. It's currently not really usable for much. The sudoku puzzle is a bit distorted. It's necessary to correct the skewed perspective of the image. We need a way to mapping from the puzzle in the original picture back into a square. Where each corner of the sudoku puzzle corresponds to a corner on the a new image.

We use a transformation that will map one arbitrary 2D quadrilateral into another. We can use a perspective transformation:

$$X = \frac{ax + by + c}{gx + hy +1}$$$$Y = \frac{dx + ey + f}{gx + hy +1}$$

This perspective transformation maps a point $ (x, y) $ in one quadrilateral into a new $ (X, Y) $ in another quadrilateral. These two equations contain 8 unknowns, but we have 8 values. (the corners $x$ and $y$ coordinates of the puzzle). Solving these equations gives us the $a,b,c,d,e,f,g,h$ which provide us with a mapping to get our puzzle out nice and straight.

The OpenCV function getperspectivetransform resolved the perspective transformation. Calculates a perspective transform from four pairs of the corresponding points, where the first parameter are the coordinates of quadrangle vertices in the source image and the second parameter are the coordinates of the corresponding quadrangle vertices in the destination image.

To applies a perspective transformation to an image we used the function warpPerspective. This function transforms the source image using the specified matrix. We obtains the image below.

We need to sort the corner of the sudoku puzzle and then associate each point with the new image dimension.

In [5]:
import numpy as np

#sort the corners to remap the image
def getOuterPoints(rcCorners):
    ar = [];
    x_sum = sum(rcCorners[x, 0, 0] for x in range(len(rcCorners)) ) / len(rcCorners)
    y_sum = sum(rcCorners[x, 0, 1] for x in range(len(rcCorners)) ) / len(rcCorners)
    def algo(v):
        return (math.atan2(v[0] - x_sum, v[1] - y_sum)
                + 2 * math.pi) % 2*math.pi
    return (  ar[3], ar[0], ar[1], ar[2])

The dataset images have 16x16 pixels. The size of the new image will be 144x144, because we have to divide each row and col by 9 and we have to return the same size of the images.

In [6]:
#point to remap
points1 = np.array([
                    np.array([0.0,0.0] ,np.float32) + np.array([144,0], np.float32),
                    np.array([0.0,0.0] ,np.float32),
                    np.array([0.0,0.0] ,np.float32) + np.array([0.0,144], np.float32),
                    np.array([0.0,0.0] ,np.float32) + np.array([144,144], np.float32),
outerPoints = getOuterPoints(approximation)
points2 = np.array(outerPoints,np.float32)

#Transformation matrix
pers = cv2.getPerspectiveTransform(points2,  points1 );

#remap the image
warp = cv2.warpPerspective(image_sudoku_original, pers, (SUDOKU_SIZE*IMAGE_HEIGHT, SUDOKU_SIZE*IMAGE_WIDHT));
warp_gray = cv2.cvtColor(warp, cv2.COLOR_BGR2GRAY)

#show image
_=pl.imshow(warp_gray, cmap=pl.gray())