ID:3564
 
A few weeks ago I saw a puzzle in the local newspaper called sudoku, in which you're given a 9×9 grid with some numbers filled in, and your goal is to fill it completely with the digits 1-9 in every column, every row, and every 3×3 block without repeating. (The name is Japanese, but really it comes from the US, from Dell which originally called it Number Place. From there it became a hit in Japan, and then it came around again to the US more popular than ever.) I've been playing the evil level at websudoku.com.

I'd been wondering for a while what it would take to build an engine to solve or generate the suckers. It turns out the former is easier; you need to be able to tell if your puzzle is unique and solvable from the hints provided. However, I learned today that solving sudoku is more complicated than I ever imagined.

The evil level I'm used to apparently isn't quite evil enough. Aside from the common and simple processes of elimination in the 3×3 blocks, and further elimination in each column or row as needed, it also often has what are called naked or hidden pairs--where 2 positions can only have 2 particular numbers (naked) or 2 numbers can only have 2 particular positions (hidden). It also has variants of those that extend past 2 and into 3, 4, etc. I thought that was all there was to it.

In puzzle a week or two ago I did find one trick that seemed to be the only way out of a stalemate, though; later I figured I must have missed something and chalked it up to finding the long way around. However this technique is called X-Wing, which I learned today, and it's basically this: If you have a number that can only go in the same two rows within 2 different columns, you can eliminate that number for anywhere else in those rows; it works the same if you switch the columns and rows. There's a version of this for 3+ places, too, which is called Swordfish, and as the name implies it's as painful as half of John Travolta's movies.

But that's not the worst part. The worst part is that there are some puzzles so tricky, so insidious, that they can't be solved by X-Wing or Swordfish. They require other techniques, including one called coloring, which is related to a more complex technique called forcing chains; both of these veer perilously close to mere trial and error. To write a solver for those techniques is, well, almost unimaginably difficult.

Anyway, now my brain's fried from taking in way too much information, and trying to verify that yes, certain puzzles can't be solved without the above techniques. It's not for the faint of heart, which apparently includes me.
looks interesting.
Go watch the episode of popcultured when they talk about this, very funny.
My mom is hooked on these, but I personally dislike them because they were given as homework in my fourth or fifth grade math class. Of course, you know half the poeple who do those are just guessing -_-
It's all the rage in England with middle-aged women.
Math. It's not just for geeks anymore.

Well, actually it is.... ;-)

So are you working on an implementation to act as a generator/solver? Because that would be a spiffy little BYOND project.
Heh, it's funny how coincidences happen. I learnt about Sudoku a while ago when I saw one in the local paper. My first thought was: "Gee, it'd be interesting to write a program to solve those."

Perhaps predictably, one of the questions in the ProgComp05 programming competition I just entered (see my blog) was based on a simplified version of Sudoku (with 6 regions containing 6 squares each, instead of 9 regions with 9 squares). We only had to solve "simple" ones, where the solution could be arrived at with no back-tracking or trial-and-error - that made it quite a bit easier!

And finally, I did my first Sudoku puzzle about 5 hours ago - today's issue of the only Sydney-specific newspaper (which I only bought because I was in Sydney for the ProgComp) came with a whole booklet full of the things. Enough to keep you going for months!
Yes, I am working on an implementation. To generate, though, you have to have a solver. And to generate anything useful, you need a human-style solver as well as a fast one.

So far my approach only began to have success when I used an algorithm called Dancing Links (DLX) developed by Donald Knuth, which reformulates Sudoku into a binary matrix of 324 columns, 729 rows, and tries to find 81 rows (including existing clues) for which each of the 324 columns has only a single 1. It's an elegant algorithm and uses backtracking, with which it can tell you if a grid is insoluble or if there are multiple solutions. Using Dancing Links alone I can add a numbe of clues to the puzzle, look for insolubility and remove clues if need be, then add clues as long as there are multiple solutions. Then I shuffle the clues and remove them one by one, replacing them again if the board has multiple solutions as a result.

The problem with this approach is that while it gives you a valid sudoku puzzle with a unique solution, most of that time the solution requires messy trial-and-error techniques to solve. At best such boards should be reserved for only the clinically insane--present or future. Today I finally got a human-style solver working using most of the common techniques except trial-and-error ones, and used that as a further guideline. (My first attempt at a human-style solver was a disaster. This one is based more on dancing links.) Now, new clues are added as long as the puzzle is not solvable by those techniques, and then removed at random as long as the result doesn't screw with that constraint.

Well now I face an even weirder problem. Rating the resulting puzzles for difficulty and checking out how the human-style solver would solve them, I found that an alarming number of puzzles are solvable with only the dirt-simple naked single and hidden single techniques. In other words, most of these puzzles would be categorized as easy, and some medium at best. Running an Evil level puzzle from websudoku.com through my difficulty rater I found a much higher degree of difficulty than anything my generator typically spits out (and yet, the Evil level turns out to be not so evil after all!).

Originally I planned to generate grids en masse and just sort them by difficulty. However it seems more likely now that a better approach will be to tweak the difficulty. I hope to do this by adding a random clue, then removing the other clues at random as per my earlier method. Or I may choose to remove one at a time and only take out the one that increases difficulty the most, and then remove any others that are expendable.
OMFG I DID ONE OF THOSE DAMN THINGS FOR MATH THE SECOND DAY OF SCHOOL
Arr, lower yer volume there matey. Ye be scarin' off the fish.

An' sudoku ain't no math problem. It's logic. What kinda squiffy deskhand would put it in a math class?
2 + 2 = 5, Lummox.

Using logic with maths helps!
What kinda squiffy deskhand would put it in a math class?

A sadistic one. We've had to do, like, 10 of these in Maths.
Please. No. More.
I've played this a few times (also sparked by seeing one in a newspaper), and I've noticed that the puzzles I've seen so far have had rotational symmetry in the layout of their clues (like the symmetry of a crossword puzzle's black boxes)

I'm not sure if that's a "rule" of Sudoku layouts, but it seems to be a common practice...

Something to keep in mind for your generator (on the off chance you missed it)