Improving the Sudoku Solver

9143361795_a3f7d34867_o

Image: Graniers (Flickr / creative commons)

I had planned to write about something else, but the poor efficiency of the list-based Sudoku Solver keeps nagging me. I think I need to address that before venturing into other topics.

The list-based Sudoku Solver was my idea of the ”simplest-thing-that-could-work” outside or above the brute-force approaches. Since the board of the puzzle was maintained in a list of cells, all the access meant iterating through 81 elements and filtering what ever was needed, which added up to a lot of unnecessary work.

In the previous post, I sketched a couple of ideas for improvement. Although parallel computation might make things to happen faster, it does not make the approach any more efficient. I would prefer to tackle the problem with data structures and algorithms. Since my efforts at elegant tail-recursive solvers have remained inelegant, the list of improvements boils down to two: two-dimensional array for data structure and more elaborate pruning.

The array-based Sudoku board makes use of the array’s indexed positions when accessing cells, rows, columns etc. (see, the complete board class here).

case class ArrayBasedBoard(grid: Array[Array[Cell]]) {

  override def cell(x: Int, y: Int): Cell = grid(y)(x)

  override def row(y: Int): List[Cell] = grid(y).toList

  override def col(x: Int): List[Cell] =
    grid.map(r => r(x)).toList

  ...

The basic array in Scala is mutable as it is a representation of Java’s Array class. Although mutability goes against the grain of functional programming, I’ll make blatant use of it. For instance, when guessing, i.e., trying out a value for a cell from a set of candidates, I make copies of the board, and then assign the guessed values to the new grid. Copies of the board are needed as I might need to backtrack if the recursive iteration runs into a dead-end.

def +(c: Cell): ArrayBasedBoard = {
  val newGrid = gridCopy
  newGrid(c.y)(c.x) = c
  ArrayBasedBoard(newGrid)
}

private def gridCopy: Array[Array[Cell]] = {
  grid.map { row =>;
    val newArray = new Array[Cell](row.size)
    row.copyToArray(newArray)
    newArray
  }
}

I used the first 100 samples from University of Warwick‘s msk_009 dataset that contains mostly ”easy” puzzles as a quick touchstone. The data structure seemed to yield an improvement of about 20% in efficiency over the list-based approach — from 1300 ms to about 1000 ms.

However, bigger improvement arose from the changes in pruning. In addition to pruning  based on the neighboring values described in the previous post, the candidates of each cell can be pruned by neighboring candidates. Consider the puzzle below, for example. After eliminating the redundant candidates from each cell based on the initial values (large dark gray numbers), some cells have been solved (large light blue numbers), but this simple pruning cannot go any further. To proceed we would have to make a guess.

sudoku_pair_elimination

However, if the same pair of candidates occurs twice in the same row, column, or a block, that pair occur anywhere else on the same row, column, or a block, respectively. Making use of this rule, we can continue pruning and avoid guessing.

In the lower left corner, two cells with green background have candidate lists (small red numbers) identical comprising 2 and 3. Since 2 and 3 must to occur in those two cells, 2 and 3 cannot occur anywhere else in the given column. Thus, we can eliminate them from the candidate lists of the column. Since both cells occur also in the same bottom left 3 x 3 block, we can extend the pruning to the that block. The area of effect is colored with light blue.

There is another pair in top-right right corner: we have 4 and 6 in the same row and the same block. It can be used to prune cells that are colored in light yellow.

The pruning leads to the following outcome. Three further cells have been solved (green background) and the candidate lists in areas of effect have been pruned. As new cells have been resolved, the ”normal” pruning based on the neighboring values would pick up from here, followed by this kind candidate pair pruning, etc. Quite often, or should I say typically, the Sudoku puzzles in the newspapers can be solved with just pruning.

sudoku_pair_elimination2

This sort of pruning with pairs elimination (see function eliminatePairs below) involves two phases: finding the pairs and pruning the candidate lists. The function twiceOccurringPairs is given the cells of a row, a column, or a block as a parameter, and it finds pairs of candidates that occur exactly twice. To this end, Scala provides quite elegant tools for manipulating lists.

def twiceOccurringPairs(cells: Traversable[Cell]): Set[Set[Int]] = {
  cells.filter(_.cands.size == 2).
    groupBy(_.cands).map(x =>; x._1 ->; x._2.size).
    filter(_._2 == 2).map(_._1).toSet
}

def eliminatePairs(cells: Traversable[Cell]): List[Cell] = {
  twiceOccurringPairs(cells).flatMap { pair =>
    cells.collect {
      case cell if (cell.value.isEmpty &&
              cell.cands != pair &&
              cell.cands.diff(pair) != cell.cands) =>
        Cell(cell.x, cell.y, cell.cands.diff(pair))
    }
  }.toList
}

The elimination boils down to producing a list of cells for which the candidate list has been pruned. Again, the placement of those cells to the board exploits the mutability of the array.

eliminatePairs(board.col(y)).foreach { cell =>
  board.grid(cell.y)(cell.x) = cell
}

Anyway, this pruning works for pairs of identical candidate lists. A similar rule would apply to identical triplets occurring thrice.

The presented pruning improves the efficiency considerably. Each elimination of a candidate pair means avoiding one guess, i.e., it eliminates one branch in the search tree. Using the 100 samples from msk_009 dataset, the pruning cuts the average processing time down to a third — from about 1000 ms to about 300 ms.

The situation is similar with the University of Warwick‘s data set of 95 hard Sudoku puzzles. The list-based solver spent about 51 seconds on the average, while the array-based solver with new pruning spends about 17 seconds on the average (67% improvement). The median time fell from about 5 seconds to 1 second (80% improvement).

The complete source can be found on GitHub.

Sudoku solver in Scala

2698167583_15b7ce9bee_o

Image: Neha Viswanathan (Flickr / creative commons)

I have worked out a quite few Sudoku puzzles on paper, but there has always been a sense of futility in the effort. It’s not that I’m against wasting time, but rather, instead of solving these Japanese numbers puzzles one-by-one, I have felt I should be working on a general solution, a program that can solve any Sudoku puzzle. I think the time has come to get this problem out of the way.

Sudoku is the perfect programming finger exercise: it has simple rules, there is not need for complicate data structures or need to integrate to external interfaces, and it involves no large amounts of data. You can attack it with brute force or try out something more elaborate. There may be regular competitions, I don’t know, but certainly it seems I am not alone: while I was looking for evaluation data, I noticed that Peter Norvig, among others, had posted a Sudoku solver apparently out of pretty much the same motives.

There is probably a wealth of Sudoku strategies and heuristics explained somewhere. Nevertheless, this time I am approaching the exercise with no background work. I’ll work out a simple Sudoku solver in Scala impromptu.

A Sudoku puzzle looks something like this: You have a 9-by-9 board or grid partitioned into 3-by-3 blocks with some numbers filled in. The purpose is to fill in the blanks with numbers from 1 to 9. The same number must occur exactly once in every row, in every column and in every 3-by-3 block.

sudoku_example

With the general solution in mind, the first thing is to decide is how to represent the puzzle or the board. I’ve chosen the simplest structure that I could come up with: a list of individual cells (I’m hoping this will make the recursion easier). Each cell is defined by its coordinates and a set of candidate values, i.e., all values between 1 and 9 that are valid for the cell.

case class Cell(x: Int, y: Int, cands: Set[Int]) {
  def value: Option[Int] = {
    if (cands.size == 1)
      cands.headOption
    else
      None
  }
}

The board is a bit more elaborate. Since it is based on a list of cells, each access of rows, columns, or blocks requires a full scan of the 81 elements of in the list. This is the downside of simple data structure.

case class Board(cells: List[Cell]) {
  def row(y: Int): List[Cell] = {
    cells.filter(_.y == y)
  }
  def col(x: Int): List[Cell] = {
    cells.filter(_.x == x)
  }
  ...
}

The neighborhood of a cell consists of the row, the column and the block of the cell. In the board below, the neighborhood of the red cell is illustrated by yellow row and column together with pinkish block.

sudoku_neighborhood

The idea is that by imposing the constraints, i.e., what neighboring cells already contain, we prune the list of candidates for each cell. In the example above, the list of possible values for the red cell cannot contain 2,3,4,5,7,8, or 9, which leaves us with only 1. And so on. When all the candidate lists are pruned down to one element, the puzzle is finished.

Sometimes, however, pruning is not enough, and we may have to guess. We try out one candidate for a cell and continue from there. Sometimes guesses result in dead-ends, and the board becomes inconsistent by having the same number twice in a row, a column or a block. In such a case, we have to back-track and guess again.

In my simple solver, prune is a tail-recursive function. It goes through all the cells eliminating the candidates based on the cell’s neighborhood. If pruning increases the number of resolved cells without producing any empty candidate lists, we prune further. If no improvement was attained or if pruning was too vigorous, return the board as is.

@tailrec
def prune(board: Board): Board = {
  val countBefore = board.cells.count(_.value.nonEmpty)
  val prunedCells: List[Cell] = board.cells.map { cell =>;
    val prunedCandidates = if (cell.value.nonEmpty)
      cell.cands
    else
      cell.cands.diff(board.neighborhood(cell.x, cell.y).
        flatMap(_.value))
    Cell(cell.x, cell.y, prunedCandidates)
  }
  val countAfter = prunedCells.count(_.value.nonEmpty)
  val empties = prunedCells.count(_.cands.isEmpty)
  if (countBefore < countAfter && empties == 0)
    prune(Board(prunedCells))
  else
    Board(prunedCells)
}

The actual iteration takes place in another recursive function iterate that receives a board as a parameter. If the board is invalid, for instance due to some cells pruned empty of candidates, we stop in failure. On the other hand, if the board contains a complete solution, we return that. Otherwise, we go through the unassigned cells (starting with the one with the shortest candidate list) and try out assigning candidates until we find a solution (or fail completely).

def iterate(board: Board): Option[Board] = {
  if (!board.isValid) {
    None
  } else if (board.isFinished) {
    Some(board)
  } else {
    prune(board).cells.filter(_.value.isEmpty).
      sortBy(_.cands.size).map { cell =>
      cell.cands.map { c =>
        val result = iterate(prune(board +
          Cell(cell.x, cell.y, Set[Int](c))))

        if (result.nonEmpty && result.get.isFinished)
          return result
      }
      return None
    }
    None
  }
}

I am not quite happy with this function. It is inelegant and certainly not tail-recursive. Nevertheless, it gets the job done. Let’s see how well.

The University of Warwick hosts datasets for the purpose of benchmarking automatic Sudoku solvers. Trying out their top95 data set of 95 hard Sudoku puzzles, my program yields an average response time of about 51 seconds with the median at 5 seconds and the maximum at friggin’ 15 minutes! It is faster than pen-and-paper approaches but it’s far from competitive. It is something of a consolation that all hard Sudokus were solved. Thus, the program works correctly, if sluggishly.

To improve the efficiency I could:

  • replace the list-based representation of the board with a two-dimensional array for  faster access to cells, rows, and columns,
  • adopt more effective pruning methods (e.g., if two cells on the same row/column have the same two candidates, those candidates can be eliminated from the rest of the neighborhood),
  • parallellize the computation, and
  • rewrite the iteration to a loop or a proper tail-recursive function.

And so, solving the initial problem of finding a general solution for Sudoku puzzles transforms into a new problem of finding a more efficient general solution. Sigh.

The source code (slightly refactored) can be found on GitHub.