What is Reinforcement Learning? | The Motley Fool

Reinforcement learning is a branch of machine learning used in artificial intelligence. Agents are rewarded with positive values for achieving desired outcomes, while negative values discourage unwanted actions.

In this article, we discuss reinforcement learning, explain how it differs from other types of machine learning, and review common reinforcement learning algorithms. Finally, we look at its strengths and weaknesses and show an example of its use in a vintage arcade video game.

Types of machine learning

Artificial intelligence (AI) agents can learn, but it takes time and effort. An important part of training AI agents involves machine learning. Reinforcement learning is an important branch of machine learning that assigns positive or negative values to outcomes. AI agents are programmed to maximize their assigned values and calculate the most efficient way to get the desired result.

Reinforcement learning is just one area of machine learning. It requires extra up-front work from programmers, who must define clear goals and assign positive and negative values to outcomes. Once that is done, however, the algorithm works independently.

Related forms of machine learning that are often confused with reinforcement learning include:

Supervised learning: Algorithms use labeled data to reach the desired result. One example is image recognition. The algorithm is only as good as the labels on its data. Given enough labeled examples, an agent can pick out the common characteristics of related, predefined categories.
Semi-supervised learning: Developers take a middle road, giving the agent a relatively small set of labeled data and a larger set of unlabeled data. The algorithm is designed to extrapolate what it learns from the labeled data to draw conclusions about the larger unlabeled collection.
Unsupervised learning: Algorithms have the freedom to manipulate unlabeled data to observe features and draw their own conclusions.

Reinforcement learning leaves the agent to its own devices once the parameters are set, so it is often likened to semi-supervised learning. The main difference, however, is that reinforcement learning involves a higher level of explicit programming.

Common reinforcement learning algorithms

Reinforcement learning resembles human learning in that it can take different approaches to reach its goals. People are generally guided by teachers; reinforcement learning is guided by algorithms. Many algorithms have been developed for reinforcement learning, but the three most common are:

State-action-reward-state-action (SARSA): The algorithm gives the agent a policy that indicates the probability that a particular action will result in a positive value.
Q-learning: The agent is given no policy, leaving it free to explore the environment and draw its own conclusions.
Deep Q-network (DQN): The algorithm combines neural networks with reinforcement learning techniques and derives its results from a random sample of past experiences and the values they produced.
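
The difference between the first two algorithms in the list (usually written SARSA, for state-action-reward-state-action, and Q-learning) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from any particular library: the value table `q`, the constants `ALPHA` and `GAMMA`, and the state and action labels are all made up for the example. SARSA updates its estimate toward the value of the action the agent actually takes next, while Q-learning updates toward the best available next action. The `sample_batch` helper is a hypothetical stand-in for the random sampling of past experience that a deep Q-network relies on.

```python
import random
from collections import defaultdict

# Hypothetical hyperparameters, chosen only for illustration.
ALPHA = 0.5  # learning rate
GAMMA = 0.9  # discount factor for future values


def sarsa_update(q, state, action, reward, next_state, next_action):
    """SARSA: move Q(s, a) toward reward + GAMMA * Q(s', a') for the action actually taken next."""
    target = reward + GAMMA * q[(next_state, next_action)]
    q[(state, action)] += ALPHA * (target - q[(state, action)])


def q_learning_update(q, state, action, reward, next_state, actions):
    """Q-learning: move Q(s, a) toward reward + GAMMA * the best Q(s', a') over all actions."""
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + GAMMA * best_next
    q[(state, action)] += ALPHA * (target - q[(state, action)])


def sample_batch(replay_buffer, batch_size, rng=random):
    """Draw a random batch of stored experiences without replacement."""
    return rng.sample(replay_buffer, min(batch_size, len(replay_buffer)))


ACTIONS = ["left", "right"]

# Both tables start with the same made-up estimates for the next state "s1".
q_sarsa = defaultdict(float, {("s1", "left"): 0.0, ("s1", "right"): 1.0})
q_qlearn = defaultdict(float, {("s1", "left"): 0.0, ("s1", "right"): 1.0})

# Same transition for both; the agent's next action happens to be the worse one ("left").
sarsa_update(q_sarsa, "s0", "right", 1.0, "s1", "left")
q_learning_update(q_qlearn, "s0", "right", 1.0, "s1", ACTIONS)

print(q_sarsa[("s0", "right")], q_qlearn[("s0", "right")])
```

With these made-up numbers, the SARSA estimate for taking "right" in "s0" moves to 0.5, while the Q-learning estimate moves to about 0.95, because Q-learning assumes the best next action will be taken even though the agent did not take it.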

Strengths and weaknesses of reinforcement learning

Like any other area of artificial intelligence, reinforcement learning has its strengths and weaknesses. Some of its benefits are obvious. Like other AI capabilities, it can be used to solve highly complex problems, such as those involving decision-making, control, and optimization.

Reinforcement learning can also handle environments where outcomes are not always easy to predict, a feature that is particularly useful for real-world applications such as medicine. It can correct mistakes made during the training process and can be combined with other areas of machine learning to improve performance.

Of course, there are also drawbacks. Reinforcement learning is not very useful for simple problems. It requires a lot of data and can be very difficult to debug if something goes wrong. Finally, it depends heavily on the quality of the reward function, meaning the way positive and negative values are assigned. An agent can fail if the reward function is not well designed.

Pac-Man and reinforcement learning

It’s hard to imagine the field of AI being tied to a 1980 program considered to be the foundation of one of the greatest video games of all time. Then again, it’s hard to deny Pac-Man’s once-ubiquitous cultural appeal. In fact, the game’s structure is used in many university computer science courses to help students understand the power of reinforcement learning.

The game can be described as a reinforcement learning problem. The environment is a grid where Pac-Man eats pellets while avoiding ghosts. Certain outcomes, such as finishing a level, are assigned positive values. Others, such as being eaten by Blinky, Pinky, Inky, or Clyde, are assigned negative values.
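
The idea of positive and negative outcome values can be tried on a much smaller problem than the real game. The sketch below is a toy stand-in, not Pac-Man itself: it assumes a five-cell corridor with a ghost at one end and a pellet at the other, and trains a tabular Q-learning agent on it. The cell layout, reward values, and constants (`ALPHA`, `GAMMA`, `EPSILON`) are all assumptions chosen for illustration.

```python
import random

# Toy stand-in for the Pac-Man grid: a five-cell corridor (all values are assumptions).
GHOST_CELL, START_CELL, PELLET_CELL = 0, 2, 4
REWARDS = {GHOST_CELL: -10.0, PELLET_CELL: 10.0}  # negative for the ghost, positive for the pellet
ACTIONS = {"right": 1, "left": -1}
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # learning rate, discount factor, exploration rate


def step(cell, action):
    """Move one cell and return (next_cell, reward, done)."""
    next_cell = cell + ACTIONS[action]
    reward = REWARDS.get(next_cell, 0.0)
    return next_cell, reward, next_cell in REWARDS


def greedy_action(q, cell):
    """Pick the highest-valued action; ties go to the first action listed."""
    return max(ACTIONS, key=lambda a: q[(cell, a)])


def train(episodes=500, max_steps=100, seed=0):
    """Run tabular Q-learning with epsilon-greedy exploration and return the value table."""
    rng = random.Random(seed)
    q = {(c, a): 0.0 for c in range(GHOST_CELL, PELLET_CELL + 1) for a in ACTIONS}
    for _episode in range(episodes):
        cell = START_CELL
        for _step in range(max_steps):
            # Epsilon-greedy: mostly exploit the table, occasionally explore at random.
            if rng.random() < EPSILON:
                action = rng.choice(list(ACTIONS))
            else:
                action = greedy_action(q, cell)
            next_cell, reward, done = step(cell, action)
            # Reaching the ghost or the pellet ends the episode, so no future value is added.
            future = 0.0 if done else max(q[(next_cell, a)] for a in ACTIONS)
            q[(cell, action)] += ALPHA * (reward + GAMMA * future - q[(cell, action)])
            cell = next_cell
            if done:
                break
    return q


q_table = train()
policy = {cell: greedy_action(q_table, cell) for cell in (1, 2, 3)}
print(policy)
```

After training, the learned policy moves right, toward the pellet and away from the ghost, from every non-terminal cell. Nothing in the code tells the agent which direction is correct; the preference comes only from the values assigned to the two outcomes.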

With enough iterations and the help of a well-designed deep Q-network algorithm, reinforcement learning can work through an almost infinite number of pixel combinations to achieve the desired result: in this case, avoiding the red, pink, cyan, and orange blobs.

The Motley Fool has a disclosure policy.