HOW TO SCORE SUDOKU

How hard is that puzzle?

Is it difficult?
You cannot tell by looking:
You have to try it!

If you have been doing sudoku puzzles from this site for a while, you have probably noticed that virtually all the puzzles have the same number of clues. You may have wondered how one puzzle can be easy and another difficult when they both have 28 clues given. This article will explain the logic behind puzzle scoring: that is, how a sudoku puzzle's difficulty ranking is determined.

What does not determine difficulty

Two things are obvious upon looking at a sudoku puzzle:
  1. The number of clues
  2. The placement of clues

A puzzle with 50 clues is bound to be easier than a puzzle with 20 clues. But aside from extreme cases like that, the number of clues does not tell you much about the puzzle's difficulty ranking. There might be, for instance, a 30-clue sudoku that is more difficult than an easy 20-clue sudoku.

Likewise, the placement of clues sometimes has an impact. A puzzle with 8 clues in the middle box is going to make the 9th cell in that box very easy to determine. A sudoku with several "3 consecutive cells within one box" constructs is usually easier than a sudoku without such a construct. But two puzzles with the same placement of clues can be radically different in difficulty level.

What does determine difficulty

The primary factor that determines difficulty is the number of cells that can be solved at a given point. Suppose you have two puzzles, each with 10 cells left to fill in. On the first, you can immediately figure out the value of 9 of those 10. On the second, you can immediately figure out the value of only 4 of those 10. Then the second sudoku is the harder one at that point.
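The idea of counting the cells that can be solved at a given point can be sketched in Python. This is only an illustration, not necessarily the method used for the rankings on this site. The article does not say which techniques count as "immediately" solvable, so the sketch assumes two basic ones: a blank cell with only one possible digit, and a digit with only one possible cell in some row, column, or box. The names `candidates`, `UNITS`, and `solvable_cells` are made up for this example, and a blank cell is represented as 0.

```python
def candidates(grid, r, c):
    """Digits not yet used in the row, column, or box of cell (r, c)."""
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return set(range(1, 10)) - used


# All 27 units: 9 rows, 9 columns, 9 boxes, each a list of (row, col) pairs.
UNITS = (
    [[(r, c) for c in range(9)] for r in range(9)]
    + [[(r, c) for r in range(9)] for c in range(9)]
    + [[(br + i, bc + j) for i in range(3) for j in range(3)]
       for br in (0, 3, 6) for bc in (0, 3, 6)]
)


def solvable_cells(grid):
    """Return {(row, col): digit} for blanks that can be filled right now.

    Assumption: "solvable" means found by one of two basic techniques.
    """
    cand = {(r, c): candidates(grid, r, c)
            for r in range(9) for c in range(9) if grid[r][c] == 0}
    found = {}
    # Technique 1: the cell has exactly one possible digit.
    for cell, options in cand.items():
        if len(options) == 1:
            found[cell] = next(iter(options))
    # Technique 2: within a unit, the digit fits in exactly one blank cell.
    for unit in UNITS:
        for digit in range(1, 10):
            spots = [cell for cell in unit if digit in cand.get(cell, ())]
            if len(spots) == 1:
                found[spots[0]] = digit
    return found
```

Under these assumptions, `len(solvable_cells(grid))` is the count being compared in the two-puzzle example above: 9 for the first puzzle and 4 for the second.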

Ultimately, of course, every correctly solved sudoku "converges" towards an equal difficulty. The last cell is always the easiest to fill out.

An Example

Consider the following two puzzles. They have been constructed to be as similar as possible, on the surface of things. They both contain the same number of clues, and the clues are moreover in the same locations. In fact, the upper left hand box is identical between the two puzzles. But the first is far easier than the second!

[Images: an easier sudoku puzzle; a harder sudoku puzzle]

Notice that each puzzle has some cells highlighted in red. Those are the cells which can be immediately determined from the clues. That is to say, without filling in anything else, you ought to be able to figure out what any red cell's value is. (Try it!)

So now the relative difficulty of the puzzles is a little clearer. On the first there are 12 solvable cells; on the second there are only 3. Given that there are 53 blank cells in each puzzle, 23% of the first puzzle's blank cells are solvable immediately, versus only 6% for the second.
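The percentages come from simple arithmetic, shown here as a short Python check. The helper name `percent_solvable` is made up for this example; the numbers are the ones quoted above.

```python
TOTAL_CELLS = 81           # a standard 9x9 sudoku grid
clues = 28                 # clues given in each example puzzle
blank = TOTAL_CELLS - clues  # 53 blank cells


def percent_solvable(solvable, blank_cells):
    """Share of blank cells that are immediately solvable, as a whole percent."""
    return round(100 * solvable / blank_cells)


print(percent_solvable(12, blank))  # prints 23 (easier puzzle)
print(percent_solvable(3, blank))   # prints 6 (harder puzzle)
```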

What Happens Next

That is not the end of the story! Every time a cell is correctly filled in, the puzzle changes. Counting the number of cells which can immediately be filled in gives you a starting place for scoring the sudoku's difficulty, but it is not the end of the process.

The method I use for scoring looks next at what happens after all the "first round" solvable cells are filled in. In the easy example above, the puzzle goes on to have 13 "second round" solvable cells, then 11 "third round" solvable cells, and so forth. Those cells are marked in the following image in rainbow order -- that is, red for the first round, then orange, yellow, green, blue, and indigo.

[Image: rainbow sudoku]
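The round-by-round counting can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the actual scoring code: the article does not define which solving techniques make a cell "solvable", so the sketch assumes two basic ones (a cell with a single possible digit, and a digit with a single possible cell in a row, column, or box). The names `UNITS`, `solvable_now`, and `round_counts` are hypothetical, and a blank cell is represented as 0.

```python
# All 27 units: 9 rows, 9 columns, 9 boxes, each a list of (row, col) pairs.
UNITS = (
    [[(r, c) for c in range(9)] for r in range(9)]
    + [[(r, c) for r in range(9)] for c in range(9)]
    + [[(br + i, bc + j) for i in range(3) for j in range(3)]
       for br in (0, 3, 6) for bc in (0, 3, 6)]
)


def solvable_now(grid):
    """Return {(row, col): digit} for blanks solvable without filling anything else."""
    cand = {}
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                br, bc = 3 * (r // 3), 3 * (c // 3)
                seen = set(grid[r]) | {grid[i][c] for i in range(9)}
                seen |= {grid[i][j]
                         for i in range(br, br + 3) for j in range(bc, bc + 3)}
                cand[(r, c)] = set(range(1, 10)) - seen
    # Assumed technique 1: only one digit fits the cell.
    found = {cell: next(iter(opts)) for cell, opts in cand.items() if len(opts) == 1}
    # Assumed technique 2: only one blank cell in the unit can take the digit.
    for unit in UNITS:
        for digit in range(1, 10):
            spots = [cell for cell in unit if digit in cand.get(cell, ())]
            if len(spots) == 1:
                found[spots[0]] = digit
    return found


def round_counts(grid):
    """Fill in every solvable cell, round after round, recording each round's count."""
    grid = [row[:] for row in grid]  # work on a copy; leave the caller's grid alone
    counts = []
    while True:
        found = solvable_now(grid)
        if not found:  # puzzle finished, or stuck under the assumed techniques
            break
        for (r, c), digit in found.items():
            grid[r][c] = digit
        counts.append(len(found))
    return counts
```

A result beginning `[12, 13, 11, ...]` would correspond to the red, orange, and yellow cells described above, provided the sketch's notion of "solvable" matches the one used for the image.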

In reality, there is no way of knowing which cells a person will actually fill in after the very first. It may be that placing that first cell will allow you to get a "second round" cell right away; or it may be that the first cell does not get you any new solvable cells. That is almost a matter of luck. So the difficulty ranking that appears on each puzzle is, by necessity, imprecise. You cannot guess ahead of time where any correct move might be at any stage in the sudoku solving process.

An exception to this rule arises on a very challenging puzzle, where you might be reduced to looking for a single solvable cell among the 30 or 40 remaining. At that point you might spend a while searching, but eventually you will have to make that particular move!