Machine Learning: Fundamentals | S&P Global

AI Basics


Risk notes

Through this series of articles, we aim to foster a conversation about risk: the risks that arise from the technologies that power AI, and the risks that arise from the use of those technologies in industrial applications. At its core, machine learning is the ability to learn from patterns in data, make decisions based on those patterns, and complete assigned tasks.

We have explored the concepts of “supervised learning” and “unsupervised learning”, terms that describe how learning tasks are targeted and resourced rather than risk-management practices in themselves. For example, supervised learning is commonly used to classify or predict data, while unsupervised learning is used to understand relationships in data that can later support prediction (or serve quantitative research purposes). Supervised learning is more resource intensive because it requires the data to be classified, or “labeled”. Labeling can be performed by another algorithm or by a human, but performance is typically best when humans are in the loop. Although algorithms have made great strides in performance and capability, preventing poor decision-making is something that can only be achieved with careful thought. Similarly, human decision-making, especially the “gut check”, is complex and difficult to model reliably.
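To make the distinction concrete, the short sketch below (our illustration rather than anything from the article) contrasts a supervised classifier trained on labeled data with an unsupervised clustering algorithm that receives no labels; the toy data and the scikit-learn models are assumptions chosen purely for brevity.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Supervised: features X come with labels y, so the model learns a mapping
# from inputs to known outcomes (classification).
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # labels provided up front
clf = LogisticRegression().fit(X, y)
print("supervised predictions:", clf.predict(X[:5]))

# Unsupervised: only X is available; the algorithm looks for structure
# (here, two clusters) without any labels to guide it.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("unsupervised cluster labels:", km.labels_[:5])
```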

One of the fundamental risks to manage in these cases is bias. Bias arises when a model is trained on data that skews its decisions toward particular outcomes. A particularly problematic example is the use of machine learning to make credit application decisions based solely on historical observations of previous decisions. Such an approach risks reinforcing past problems with unfair credit decisions by learning from, and perpetuating, historically enacted biases. Conversely, machine learning can also be used to identify bias and discriminatory or unfair practices, potentially leading to advances in ensuring fairness and equity in future decision-making.
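As a simple illustration of how such bias can be surfaced, the sketch below (a hypothetical example of ours, not drawn from the article) compares approval rates between two groups in toy credit decisions; a large gap in this simple “demographic parity” measure is one common warning sign.

```python
import numpy as np

rng = np.random.default_rng(1)
group = rng.integers(0, 2, size=1000)            # 0 / 1: two demographic groups
score = rng.normal(loc=group * 0.3, scale=1.0)   # historical data skews scores by group
approved = score > 0.0                           # decision rule learned from that history

rate_0 = approved[group == 0].mean()
rate_1 = approved[group == 1].mean()
print(f"approval rate, group 0: {rate_0:.2f}; group 1: {rate_1:.2f}")
print(f"demographic parity gap: {abs(rate_0 - rate_1):.2f}")  # a large gap flags potential bias
```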

Another common and important risk category to manage with machine learning and deep learning is overfitting and underfitting. Overfitting occurs when a model fits its training data too closely, relying on past events that do not represent the conditions it will face when making predictions. Conversely, underfitting occurs when there is too little data, or the model is too simple, to capture the underlying patterns well. The “past performance is no guarantee of future results” disclaimer used throughout the financial industry essentially describes the problems of overfitting and underfitting, but these challenges can be managed. Techniques to avoid them include selecting high-quality data, using cross-validation, and applying regularization and hyperparameter tuning, as sketched below.
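A minimal sketch of those techniques might look like the following, using scikit-learn on toy data of our own choosing (none of this comes from the article): five-fold cross-validation combined with a grid search over the strength of a regularization penalty.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)

# Ridge regression's alpha is a regularization hyperparameter: larger values
# penalize complex fits (guarding against overfitting), smaller values allow
# more flexibility (guarding against underfitting).
search = GridSearchCV(
    estimator=Ridge(),
    param_grid={"alpha": [0.01, 0.1, 1.0, 10.0, 100.0]},
    cv=5,                                  # 5-fold cross-validation
    scoring="neg_mean_squared_error",
)
search.fit(X, y)
print("best alpha:", search.best_params_["alpha"])
print("cross-validated score:", search.best_score_)
```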

As this research series expands, we will address specific risk-related questions arising from the application of these technologies in specific contexts. This, we believe, is a clear and actionable way to provide insight into such risks, bounded by a consistent framing of the specific risks to be managed.

Machine learning algorithms are considered the “core” of artificial intelligence applications. They serve a fundamental function, and their importance in many downstream processes requires care to ensure properly balanced risk management. Machine learning's rich and storied history also reflects a balance between great leaps in technological advancement and periods when inflated expectations met societal questions and a relative drought in research funding.

A brief history of AI, from the Turing Test to today's Transformers

Discussions about the history of AI often refer to the “Turing Test,” a thought experiment devised by (and named after) British mathematician and computer scientist Alan Turing. The test proposes a method for determining whether machines can exhibit human-like intelligence: specifically, whether a machine can generate language-based responses that are indistinguishable from a human's. In the scenario Turing envisioned, a human evaluator would hold typed conversations without knowing whether the other party was another human or a machine. If the evaluator believed they were talking to a human when in fact they were talking to a machine, that machine would pass the Turing Test.

The modern concept of AI is deeply rooted in Turing's operational definition of machine intelligence. In other words, rather than trying to answer directly the question of whether machines can “think”, success is judged by how convincingly machines can reproduce human-like results when performing tasks. More than 70 years later, this framework helps explain the excitement that arose when GPT-3.5 (the model behind ChatGPT) was released for public trial. This technology applies machine learning fundamentals and advances in modeling to produce highly convincing results when judged by the parameters of the Turing Test.

In the roughly 70 years between Turing's time and today, several major stages of machine learning development have brought us to the current cusp of AI ubiquity.

Many of the foundations of machine learning were established in the 1960s and 1970s. The realization that handcrafting every computing rule would become unsustainable led computer science toward teaching computers to “learn” from data. During this period, the concept of artificial “neurons” was introduced and early perceptron algorithms were developed. Simply put, a perceptron is a simple network in which each input is connected to the output and assigned a weight indicating the strength of that connection; changing the weighted connections produces different outputs, and learning consists of adjusting those weights, as sketched below.
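The sketch below is a minimal illustration of that idea (our own, with a toy AND-gate dataset as an assumption): a perceptron combines its inputs through weighted connections and nudges the weights whenever its prediction is wrong.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])   # inputs
y = np.array([0, 0, 0, 1])                       # target: AND of the two inputs

w = np.zeros(2)      # one weight per input connection
b = 0.0              # bias term
lr = 0.1             # learning rate

for _ in range(20):                              # a few passes over the data
    for xi, target in zip(X, y):
        pred = int(np.dot(w, xi) + b > 0)        # weighted sum -> step activation
        w += lr * (target - pred) * xi           # adjust connection strengths
        b += lr * (target - pred)

print("learned weights:", w, "bias:", b)
print("predictions:", [int(np.dot(w, xi) + b > 0) for xi in X])
```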

Despite these advances in computing technology, funding and interest declined during the 1970s and 1980s, beginning a period known as the “AI winter.” Still, development continued and algorithms became more and more powerful. These include two types of algorithms developed in the 1980s that are now famously grouped under the name “neural networks”: networks trained with backpropagation and recurrent neural networks. Neural networks are algorithms that teach computers to process data in ways that mimic the neural processes of the human brain. In the 1980s, when these concepts were developed, the work remained primarily theoretical because it was difficult to obtain sufficient data and adequate computing resources to put them into practice.

Almost 15 years later, the digital age sowed the seeds for further advances in machine learning models, making them more viable and increasingly essential. In this phase, ensemble methods that combine multiple machine learning models were developed, with “random forests” as a leading example; “support vector machines” also emerged during this period.
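As a rough illustration of the ensemble idea (ours, not the article's), the sketch below compares a single decision tree with a random forest that averages 100 trees on toy data generated with scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

single_tree = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0)  # 100 trees combined

print("single tree accuracy:", cross_val_score(single_tree, X, y, cv=5).mean())
print("random forest accuracy:", cross_val_score(forest, X, y, cv=5).mean())
```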

During the 2000s and 2010s, the advent of “big data” and advances in computing power led to major advances in machine learning and AI, particularly neural networks. This newer branch of machine learning is called “deep learning” because its architectures are deeper and more complex, with multiple hidden layers between inputs and outputs, enabling deeper, more precise, and more flexible processing of data. Benchmarks that had barely moved for decades improved dramatically in nearly every traditional application, including machine translation in natural language processing and image classification in computer vision.
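A minimal sketch of the “deep” idea, using scikit-learn's multi-layer perceptron on toy data (the layer sizes and data are our assumptions), might look like this: several hidden layers sit between the inputs and the output.

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two hidden layers of 64 and 32 units sit between the 20 inputs and the output.
model = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=1000, random_state=0)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```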

More recently, the Transformer architecture (an encoder-decoder model) has powered a growing list of “killer apps” since it was introduced in the 2017 paper “Attention Is All You Need” by researchers at Google. Transformers became the basis for many subsequent models in natural language processing, including BERT, T5, and GPT.
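To show the core operation behind the paper's title, the sketch below (a simplified NumPy illustration of ours, not code from the paper) implements scaled dot-product attention, in which each token's output is a weighted mix of every token's values.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # weighted mix of the values

rng = np.random.default_rng(0)
seq_len, d_k = 4, 8                                 # 4 tokens, 8-dimensional embeddings
Q, K, V = (rng.normal(size=(seq_len, d_k)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8): one output per token
```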


