Solving Sudoku with backtracking : Algorithms for the masses

Solving Sudoku with backtracking

I’m pretty sure that the puzzle game of Sudoku needs no introduction. A 9x9 grid formed as a 3x3 grid of 3x3 boxes, with some digits from 1 to 9 in there, and you have to solve for the empty cells such that there are no duplicate digits in each row, column or box. I enjoy playing it, sometimes I’m pretty good, sometimes … not so much. Of course, these puzzles are not generated by hand, they’re generated by some program, but how? How do you solve a Sudoku puzzle programmatically? And how, using that knowledge, can you generate a new puzzle?

Example Sudoku puzzle

If you pose this Sudoku-solving question to a group of developers, I think the initial answer would look something like this. Find the first empty cell. Put a 1 in there. Check to see whether that digit violates one of the Sudoku rules (that is, check to see whether there is already a 1 in the same row, or column, or box). If it’s valid, find the next empty cell and repeat the algorithm. If there is a clash though, increment the digit you just added and check again whether it already appears in the same row, column, or box. Repeat this inner loop until either you find a valid digit for that cell and move on as before, or until you run out of digits. In the latter case, you must backtrack to the previous cell you filled in and increment the digit you find there and check and move forward again. Sometimes, as I’m sure you can appreciate, you’ll find that you backtrack several cells, but nevertheless, if the puzzle has a solution, you’ll find it.

This type of algorithm is known as a depth-first search algorithm: we try to move forward as quickly as possible (increasing the depth to which we advance), backtracking if necessary to try again.

The problem with this algorithm is the same problem as with all backtracking algorithms: you’ve got to save some kind of state as you move forward, so that you can restore it when you need to backtrack. The amount of state can be small or it can be pretty large and in the latter case it can sink your algorithm.

Anyway, for the Sudoku example, the simplistic backtracking algorithm is fairly easy to write. The way I did it was to have a stack of cell addresses I’d visited and modified.

  var createBacktrackStack = function () {
    var stack = [];
    var popCount = 0;

    var push = function (i) {
      stack.push(i);
    };

    var pop = function () {
      if (stack.length === 0) {
        return null;
      }
      popCount += 1;
      return stack.pop();
    };

    return {
      push : push,
      pop : pop,
      getBacktrackCount : function () { return popCount; }
    };
  };

Then all I needed to do when backtracking was to pop off the top cell address and I could easily revisit it to increment its value to the next digit.

I made the cell addresses really simple too: just a number from 0 to 80, counting the columns from left to right across each row and then down the rows (so the top row’s cells are 0 to 8, the second row’s cells are 9 to 17, and so on). The reason for doing this is the identification of the cells in the same row is a simple calculation (the first cell in the row that cell X is in is Math.floor(X/9) * 9; the first cell in the same column is X % 9). From those you can easily calculate the addresses of all the cells in the same row (keep adding 1) and column (keep adding 9). The addresses of the cells in the boxes are a little more difficult to calculate, so I just used a simple lookup for the first (top left) cell in each box.

  var checkUniqueDigitInRow = function (grid, cell, digit) {
    var first = Math.trunc(cell/9) * 9;
    for (var i = first; i < first + 9; i += 1) {
      if (grid[i] === digit) {
        return false;
      }
    }
    return true;
  };

  var checkUniqueDigitInColumn = function (grid, cell, digit) {
    var first = cell % 9;
    for (var i = first; i < 81; i += 9) {
      if (grid[i] === digit) {
        return false;
      }
    }
    return true;
  };

  var checkUniqueDigitInBox = function (grid, cell, digit) {
    const topLeftCell = [
       0, 0, 0, 3, 3, 3, 6, 6, 6,
       0, 0, 0, 3, 3, 3, 6, 6, 6,
       0, 0, 0, 3, 3, 3, 6, 6, 6,
      27,27,27,30,30,30,33,33,33,
      27,27,27,30,30,30,33,33,33,
      27,27,27,30,30,30,33,33,33,
      54,54,54,57,57,57,60,60,60,
      54,54,54,57,57,57,60,60,60,
      54,54,54,57,57,57,60,60,60
    ];
    const boxOffsets = [0, 1, 2, 9, 10, 11, 18, 19, 20];

    var first = topLeftCell[cell];
    for (var i = 0; i < 10; i += 1) {
      if (grid[first + boxOffsets[i]] === digit) {
        return false;
      }
    }
    return true;
  };

  var checkValidDigit = function (grid, cell, digit) {
    return checkUniqueDigitInRow(grid, cell, digit) && 
           checkUniqueDigitInColumn(grid, cell, digit) &&
           checkUniqueDigitInBox(grid, cell, digit);
  };

The puzzle “grid” I just store in a global variable, an array of integers, with empty cells set to zero. All guesses the program makes are made to this global array. When I need to backtrack because the current cell’s digit cannot be incremented, I set the current cell to zero before popping the previous cell address off the (also global) stack. That way, my cell validation routines will continue to work: there will be no digit “droppings” left behind on a backtrack operation.

  var findEmptyCell = function (grid, cell) {
    for (var i = cell; i < 81; i += 1) {
      if (grid[i] === 0) {
        return i;
      }
    }

    return noEmptyCells;
  };  

  var solvePuzzle = function (grid) {
    var cellStack = createBacktrackStack();
    cellStack.push(noSolution);

    var cell = findEmptyCell(grid, 0);

    while (cell !== noEmptyCells) {
      var digit = grid[cell] + 1;
      while ((digit < 10) && !checkValidDigit(grid, cell, digit)) {
        digit += 1;
      }
      if (digit === 10) {
        grid[cell] = 0;
        cell = cellStack.pop();
        if (cell === noSolution) {
            return false;
        }
      }
      else {
        grid[cell] = digit;
        cellStack.push(cell);
        cell = findEmptyCell(grid, cell + 1);
      }
    }

    console.log("Backtrack count: ", cellStack.getBacktrackCount());
  
    return true;
  };

The listing shows an implementation of the fairly simple backtracking algorithm. Solving a particular puzzle is pretty fast, but it does assume that there is only one solution. I did add code to detect an insoluble puzzle as well (essentially by pushing a sentinel value onto the stack before starting the backtracking algorithm, I can detect when the algorithm tries to backtrack too far).

(function () {
  "use strict";

  const noEmptyCells = -1;
  const noSolution = -2;

  // other code

  var printGrid = function (grid) {
    var cell = 0;
    console.log("");

    for (var row = 0; row < 9; row += 1) {
      var line = "";
      for (var col = 0; col < 9; col += 1) {
        if (grid[cell] === 0) {
          line += " ·";
        }
        else {
          line += " " + grid[cell];
        }
        if ((col === 2) || (col === 5)) {
          line += " |";
        }
        cell += 1;
      }
      console.log(line);

      if ((row === 2) || (row === 5)) {
        console.log("-------+-------+-------");
      }
    }
  };

  var grid = [
    0,6,0,0,0,0,0,9,0,
    9,0,0,8,7,3,0,0,1,
    5,0,0,0,0,0,0,0,4,
    2,0,0,6,0,4,0,0,8,
    0,0,6,0,8,0,2,0,0,
    7,0,0,5,0,2,0,0,3,
    8,0,0,0,0,0,0,0,6,
    6,0,0,9,3,8,0,0,7,
    0,2,0,0,0,0,0,4,0
  ];

  if (solvePuzzle(grid)) {
    console.log("Solution:");
    printGrid(grid);
  }
  else {
    console.log("Grid is not solvable");
  }
}());

It’s interesting to run this code on various Sudoku examples, especially those that are termed “Hard” or “Very hard” (the example puzzle shown above was termed “Medium” by the app I use, and it took me just under 4 minutes to solve by hand). In general, standard puzzles are pretty rapid to solve with this code.

There is one example though that has been deliberately devised to show the flaw in this simple backtracking algorithm. In essence, the first row of the puzzle is all blank, and the answer to that row will be 987654321. This would exercise the backtracking code enormously, since it would start each empty cell with 1, which would be the worst guess completely (for that first cell it would have to try all possibilities for a 1, then a 2, than a 3, all the way up to 9). As an example: the puzzle shown in the code backtracked 5,882 times, this diabolical one backtracked 69,175,252 times.

One possible way round this might be to randomly choose empty cells, rather than always start at the top-left and then work our way down the grid row by row in order. I played around with this idea a little bit. Basically the code found all the empty cells at the start and then shuffled them. The findEmptyCell() function would then release them one by random one. However, my experiment showed that it made solving simpler puzzles (those which have some symmetry) extraordinarily slow.

Next time I’ll talk about generating new puzzles.

Example Sudoku puzzle - banner

Wed 8-Jan-2020 6:45 PM Blog / tags: sudoku backtracking

Loading links to posts on similar topics...

previous post next post

5 Responses

#1 John Topley said...

09-Jan-20 2:00 AM

This is fascinating - thanks!

#2 Oliver Sturm said...

09-Jan-20 6:34 AM

Hm... but wouldn't a recursive solution be much more elegant?

#3 julian m bucknall said...

09-Jan-20 8:38 AM

Oliver: I'll get back to you on that...

#4 Oliver Sturm said...

16-Jan-20 2:45 AM

I sent this comment yesterday, but there was no confirmation from the system after I clicked Submit and I wasn't sure it was actually sent. Weird. Trying one more time.

Just for fun, I created this: https://github.com/oliversturm/sudoku-solver

It's a nice functional and recursive implementation of a backtracking solver, with a React frontend. The recursion itself is here. I like it better than all those pesky for loops :)

Based on this algorithm it's easy to pre-process the list of empty cells in order to change the order the solver considers them. I agree with your findings that shuffling doesn't do average puzzles any favors - they can take a very long time. I didn't add the shuffling option in the UI, but it's available in code.

#5 julian m bucknall said...

16-Jan-20 1:51 PM

Nice one, Oliver!

Leave a response

Note: some MarkDown is allowed, but HTML is not. Expand to show what's available.

Emphasize with italics: surround word with underscores _emphasis_
Emphasize strongly: surround word with double-asterisks **strong**
Link: surround text with square brackets, url with parentheses [text](url)
Inline code: surround text with backticks `IEnumerable`
Unordered list: start each line with an asterisk, space * an item
Ordered list: start each line with a digit, period, space 1. an item
Insert code block: start each line with four spaces
Insert blockquote: start each line with right-angle-bracket, space > Now is the time...

by Julian M Bucknall