by Dr Deniz Salali, extract from a blog post originally published in Scientific American.
When DeepMind’s AlphaGo program defeated its human competitor at the ancient board game Go, it made a big splash in the AI scene. AlphaGo was not trained through a set of prewritten instructions, but rather through practice and feedback. It turns out that there are striking similarities between new-generation machine-learning technologies and how children learn skills in the absence of formal education. Let me explain.

Hunter-gatherer communities in Congo, where I do my field research, do not often give direct instructions when teaching their children. Instead, they create a learning opportunity, like providing a tool, and monitor the child’s actions without interfering. The child then adjusts her behavior according to the feedback she receives on her performance. Likewise, neural networks learn by being given an opportunity to act (an input) and feedback on the output they produce, as the sketch below illustrates.

The ultimate goal in AI research is to generate artificial general intelligence (AGI), that is, a machine that can understand and learn as we humans do. Many AI researchers, like the DeepMind team, believe that this will be possible through more independent learning strategies. In unsupervised learning, for example, machines learn by observing data without a predetermined goal or explicit guidance. This form of learning parallels how hunter-gatherer children learn most skills.
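To make the idea of learning from feedback concrete, here is a minimal sketch, not DeepMind’s code: a tiny neural network that is never given explicit rules, only inputs and feedback on its own outputs. The task (XOR), the network size and the learning rate are illustrative assumptions.

```python
# A minimal sketch of learning from feedback rather than instructions.
# The task (XOR), network size and learning rate are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)

# The "learning opportunity": inputs and the behaviour we hope to see.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Randomly initialised weights: the network starts out knowing nothing.
W1 = rng.normal(size=(2, 8))
W2 = rng.normal(size=(8, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(5000):
    # The network acts on the input...
    hidden = sigmoid(X @ W1)
    out = sigmoid(hidden @ W2)

    # ...and receives feedback: how far its output is from the target.
    error = out - y

    # It adjusts its behaviour (the weights) in response to that feedback.
    grad_out = error * out * (1 - out)
    grad_W2 = hidden.T @ grad_out
    grad_hidden = (grad_out @ W2.T) * hidden * (1 - hidden)
    grad_W1 = X.T @ grad_hidden
    W1 -= 0.5 * grad_W1
    W2 -= 0.5 * grad_W2

print(np.round(sigmoid(sigmoid(X @ W1) @ W2), 2))  # approaches [0, 1, 1, 0]
```

After a few thousand rounds of feedback the outputs approach the desired pattern, even though no rule for the task was ever written down.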
I have been visiting the Mbendjele hunter-gatherers, a subgroup of the Bayaka Pygmies living across the Congo Basin, for the last six years. Mbendjele children as young as three years old develop skills like using knives, caring for infants and gathering wild plants. Our observations show that Mbendjele toddlers learn these skills mostly by freely exploring their environment and by observing and copying others. Learning by being taught accounts for only a small portion (6 percent, to be precise) of the learning episodes we observed.
I am interested in how we learn and transmit skills in the absence of formal education because understanding this can help us understand how complex cultural practices, such as the game of Go, evolve. We have evolved a great capacity for learning by imitating others. This, in turn, allows us to transmit information with great accuracy. Researchers have found that when information is transmitted faithfully, cultural practices remain in the population long enough that they can be modified to generate more complex practices. This is how human culture progresses. Our cultural traits are built upon the legacies of past information, but this also means they are restricted by those legacies.

While new training algorithms in machine learning have parallels with how human children learn, they have the capacity to surpass human culture, because they are not restricted by the legacies of our cultural history. In 2017, the DeepMind team introduced AlphaGo Zero, a new version of AlphaGo that became its own teacher by learning from self-play. It is now considered the best Go player in the world. Human Go players have been building their game strategies on 3,000 years of accumulated knowledge; AlphaGo Zero became the best Go player by setting itself free from this knowledge.

What can we learn from the success of AlphaGo Zero? The program starts from scratch, plays against itself, and combines what it learns with a powerful search algorithm to select its next move. Last year, the DeepMind team developed a general version of AlphaGo Zero, known as AlphaZero, that taught itself to play not only Go but also chess and shogi (a Japanese version of chess) and defeated the best programs specializing in each of these three games. The former world chess champion Garry Kasparov said that “we can actually learn from the new knowledge the machines produce.” I think the way these programs work, through reinforcement learning from self-play, should also inspire how we raise curious pupils: allow them to explore a great deal and provide feedback when it is needed.
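To give a flavor of reinforcement learning from self-play, here is a minimal sketch, not AlphaGo Zero itself: a tabular Q-learning agent that teaches itself the toy game of Nim purely by playing against itself. The game, the reward scheme and the hyperparameters are illustrative assumptions; AlphaGo Zero combines a deep neural network with Monte Carlo tree search rather than a simple lookup table.

```python
# A minimal sketch of reinforcement learning from self-play (not AlphaGo Zero):
# tabular Q-learning on Nim, where players take 1-3 stones from a pile and
# whoever takes the last stone wins. All hyperparameters are illustrative.
import random
from collections import defaultdict

PILE, EPISODES, ALPHA, EPSILON = 15, 50_000, 0.1, 0.2
Q = defaultdict(float)  # Q[(stones_left, stones_taken)] = value for the mover

def legal_moves(pile):
    return [m for m in (1, 2, 3) if m <= pile]

def choose(pile):
    # Epsilon-greedy: mostly play the best known move, sometimes explore.
    moves = legal_moves(pile)
    if random.random() < EPSILON:
        return random.choice(moves)
    return max(moves, key=lambda m: Q[(pile, m)])

for _ in range(EPISODES):
    pile = PILE
    while pile > 0:
        move = choose(pile)
        new_pile = pile - move
        if new_pile == 0:
            target = 1.0   # taking the last stone wins the game
        else:
            # Zero-sum backup: the opponent (the program itself) moves next,
            # so this position is worth the negative of their best option.
            target = -max(Q[(new_pile, m)] for m in legal_moves(new_pile))
        Q[(pile, move)] += ALPHA * (target - Q[(pile, move)])
        pile = new_pile  # the other "self" takes its turn

# Greedy play over the learned table recovers the known winning strategy:
# from any pile not divisible by 4, take (pile % 4) stones.
for pile in range(1, PILE + 1):
    print(pile, "->", max(legal_moves(pile), key=lambda m: Q[(pile, m)]))
```

Starting from scratch and never seeing an expert game, the agent rediscovers the optimal Nim strategy from self-play alone, which is the same spirit in which AlphaGo Zero set itself free from accumulated human knowledge.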
The rest of the blog post can be found at Scientific American.