Until now, the usual way to train a neural network has been to supply it
with millions of pre-classified examples, in the so-called supervised
learning scheme, with the result that the network learns only what we
have instructed it to do.
But there is another technique, reinforcement learning, in which you let the AI discover for itself what it is supposed to do, without prior knowledge of its surroundings or any other data fed to it.
Microsoft was one of the first to employ this technique in a gaming environment, trying to make a Minecraft character climb a virtual hill in the so-called AIX Minecraft Project. There, you drop the algorithm into a Minecraft world, let it move freely and interact with its surroundings, and make it learn by rewarding it whenever it does something right, so that it gradually works out the goal it should be aiming for. For us humans it is easy to see that we must climb that hill, or that Super Mario dies the moment he touches an enemy, but not so for an algorithm. Its strength lies instead in the fact that it can try an enormous number of combinations in a very short time in order to discover what humans already grasp intuitively.
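To make the reward idea concrete, here is a minimal sketch of trial-and-error learning on a toy "hill". It is not the method used in the AIX project: it swaps in plain tabular Q-learning on a one-dimensional slope, and every name and parameter in it (`train_hill_climber`, `greedy_policy`, the learning rate, the reward of 1 at the top) is an assumption made for illustration. The agent is never told that it should climb; it only receives a reward when it happens to reach the top.

```python
import random


def train_hill_climber(n_positions=6, episodes=500, alpha=0.5,
                       gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy 1-D hill (hypothetical example).

    Position 0 is the foot of the hill, position n_positions - 1 the top.
    The only feedback is a reward of 1.0 on reaching the top.
    """
    rng = random.Random(seed)
    moves = (-1, +1)                      # action 0 = step down, 1 = step up
    top = n_positions - 1
    q = [[0.0, 0.0] for _ in range(n_positions)]

    for _ in range(episodes):
        pos = 0
        for _ in range(100):              # cap the length of one attempt
            # Explore at random sometimes, and whenever nothing is known yet.
            if rng.random() < epsilon or q[pos][0] == q[pos][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[pos][0] > q[pos][1] else 1

            nxt = min(max(pos + moves[action], 0), top)
            done = nxt == top
            reward = 1.0 if done else 0.0

            # Q-learning update: move the estimate toward the reward plus
            # the discounted value of the best action in the next position.
            future = 0.0 if done else gamma * max(q[nxt])
            q[pos][action] += alpha * (reward + future - q[pos][action])

            pos = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Preferred action for every position below the top."""
    return ["up" if up > down else "down" for down, up in q[:-1]]
```

Calling `greedy_policy(train_hill_climber())` returns the action the agent has come to prefer at each position. Early attempts are little more than a random walk; once the top has been reached a few times, the reward propagates backwards through the table and the preference for stepping up spreads down the slope.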
full article on i-programmer