Yet another bastion of human
skill and intelligence has fallen to the onslaught of the machines. A new kind
of deep-learning machine has taught itself to solve a Rubik’s Cube without any
human assistance. The milestone is significant because the new approach tackles
an important problem in computer science—how to solve complex problems when
help is minimal. First some background. The Rubik’s Cube is a three-dimensional
puzzle developed in 1974 by the Hungarian inventor Erno Rubik, the object being
to align all squares of the same color on the same face of the cube. It became
an international best-selling toy and sold over 350 million units. The puzzle
has also attracted considerable interest from computer scientists and
mathematicians. One question that has intrigued them is the smallest number of
moves needed to solve it from any position. The answer, proved in 2014, turns
out to be 26 when each quarter turn counts as a single move. Another common
challenge is to design algorithms that can solve
the cube from any position. Rubik himself, within a month of inventing the toy,
came up with an algorithm that could do this. But attempts to automate the
process have all relied on algorithms that have been hand-crafted by humans. More
recently, computer scientists have tried to find ways for machines to solve the
problem themselves. One idea is to use the same kind of approach that has been
so successful with games like chess and Go.
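Before going further, it is worth seeing how small that recipe really is. The sketch below is a toy stand-in, not anyone's actual system: a single tabular value function (in place of a deep network) plays both sides of a Nim-style game and receives a reward only when a game ends. Every name and number here is invented for illustration.

```python
import random

def selfplay_train(episodes=5000, pile=10, seed=0):
    """Self-play sketch: one value table plays both sides of a Nim-style
    game (take 1-3 stones; taking the last stone wins) and is rewarded
    +1 or -1 only when a game ends."""
    rng = random.Random(seed)
    Q = {}                               # (stones_left, move) -> value estimate
    for _ in range(episodes):
        stones, history = pile, []
        while stones > 0:
            legal = [m for m in (1, 2, 3) if m <= stones]
            if rng.random() < 0.2:       # occasionally explore a random move
                move = rng.choice(legal)
            else:                        # otherwise play the best-known move
                move = max(legal, key=lambda m: Q.get((stones, m), 0.0))
            history.append((stones, move))
            stones -= move
        reward = 1.0                     # whoever took the last stone won
        for s, m in reversed(history):   # credit moves backwards, alternating sign
            old = Q.get((s, m), 0.0)
            Q[(s, m)] = old + 0.1 * (reward - old)
            reward = -reward
    return Q
```

The only training signal is the win/loss reward at the end of each self-played game — exactly the kind of signal that, as the article goes on to note, is scarce or absent in problems like the Rubik's Cube.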
In these scenarios, a
deep-learning machine is given the rules of the game and then plays against
itself. Crucially, it is rewarded at each step according to how it performs.
This reward process is hugely important because it helps the machine to
distinguish good play from bad play. In other words, it helps the machine
learn. But this doesn’t work in many real-world situations, because rewards are
often rare or hard to determine. Enter researchers from the University of
California, Irvine, who have pioneered a new kind of deep-learning technique, called autodidactic
iteration, that can teach itself to solve a Rubik’s Cube with no human
assistance. Given an unsolved cube, the machine must decide whether a specific
move is an improvement on the existing configuration. To do this, it must be
able to evaluate the move. Autodidactic iteration does this by starting with
the finished cube and working backwards, so that the configuration a proposed
move would produce can be scored by how close it sits to the solution. This process is not perfect, but deep learning helps the
system figure out which moves are generally better than others. Having been
trained, the network then guides a standard tree search to hunt for promising
moves from each configuration. The result is an algorithm that performs
remarkably well. This has implications for a variety of other tasks that deep
learning has struggled with, including puzzles like Sokoban, games like
Montezuma’s Revenge, and problems like prime number factorization.
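The backwards-from-solved idea can be sketched in miniature. Below, a four-token permutation puzzle stands in for the cube and a lookup table stands in for the deep network; this is an illustrative reconstruction of the flavor of autodidactic iteration, not the Irvine group's actual code, and every name and parameter is invented.

```python
import random

# Toy stand-in for the cube: a permutation of four tokens.  The three
# moves below generate every permutation, so any scramble is solvable.
SOLVED = (0, 1, 2, 3)
MOVES = ("rot_l", "rot_r", "swap")

def apply_move(move, s):
    if move == "rot_l":
        return s[1:] + s[:1]             # cycle every token left
    if move == "rot_r":
        return s[-1:] + s[:-1]           # cycle every token right
    return (s[1], s[0]) + s[2:]          # swap the first two tokens

def train(iterations=2000, max_scramble=8, seed=0):
    """Scramble backwards from the solved state, then update each visited
    state's cost-to-go estimate from a one-step lookahead over its children
    (the table plays the role the deep network plays for the real cube)."""
    rng = random.Random(seed)
    V = {SOLVED: 0.0}                    # estimated moves-to-solve per state
    for _ in range(iterations):
        s = SOLVED
        for _ in range(rng.randrange(1, max_scramble + 1)):
            s = apply_move(rng.choice(MOVES), s)
            if s == SOLVED:
                continue
            # target: one move plus the cheapest child's current estimate
            V[s] = min(1.0 + V.get(apply_move(m, s), float(max_scramble))
                       for m in MOVES)
    return V

def solve(s, V, limit=20):
    """Greedily descend the learned values (the real system runs a tree
    search here instead of pure greed)."""
    path = []
    while s != SOLVED and len(path) < limit:
        move = min(MOVES, key=lambda m: V.get(apply_move(m, s), float("inf")))
        path.append(move)
        s = apply_move(move, s)
    return path if s == SOLVED else None
```

Because the only states the learner ever sees are generated by walking backwards from the solved puzzle, it never needs a human-supplied reward for intermediate positions — which is precisely the gap the technique is meant to fill.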
More information: