Protect against the uncertain dangers of AI

Machine Learning


The development of machine learning systems that deliver safer, more robust and fairer outcomes should leverage fine-tuning, generalization, explainability and metrics of uncertainty.

Generative machine learning systems such as the chatbot ChatGPT and the image generation tool Stable Diffusion have aroused a sense of wonder, as well as fear, about the opportunities and risks of their widespread use. These systems offer a wealth of opportunities for significant gains in productivity and efficiency1. Because large language models (LLMs) can be fine-tuned to deliver specific knowledge (LLMs that deliver healthcare2 and financial3 information have surfaced in the last few months), open-source LLMs may soon be fine-tuned with the know-how, tastes and styles of individuals or organizations, and serve as low-cost consultants, knowledge experts or all-purpose assistants.

Image: a prompt that requires a sensitive response, displayed in OpenAI’s ChatGPT.

However, recent public debate has been dominated by the unknown future capabilities of these AI systems, and by the fear that they may threaten human well-being while unwittingly pursuing their goals (as illustrated by the Paperclip Maximizer thought experiment). Anxiety about potential threats to survival is a hallmark of evolution (indeed, the invention of the printing press and the advent of computers led to fears of loss of control over the dissemination of information and of the automation of a wide range of tasks), yet such fears seem misplaced in the context of what today’s systems can do: generate information, typically in the form of text, code, audio, images or video, and make predictions based on learned patterns and contextual information4. Also, assessing the potential for existential threats requires much speculation. But the intersection of AI with human frailties, societal incentives and malice leaves ample room for real dangers.

In fact, we can all be fooled by the confident know-it-all nature of today’s LLM outputs. LLMs cannot simply be dismissed as “stochastic parrots”. Internal representations of concepts, text snippets, image features and all sorts of relationships between types of information emerge from the training of machine learning models, much as patterns emerge in many physical, chemical and biological systems, from self-assembling crystals to protein folding to tissue morphogenesis4,5. And, as with any complex natural system, it may be difficult or impossible to understand in detail how such emergence occurs. But despite their “understanding” of language (in fact, LLMs can pass many high-level qualification exams without explicit training6, although for now they fail at complex compositional tasks7), LLMs can internalize spurious correlations and produce plausible falsehoods, especially when trained on low-quality datasets. And unlike human know-it-alls, LLMs are consistently accurate and useful for most tasks, are readily available, and are fine-tuned via reinforcement learning to instill adaptive guardrails, to be kinder and more nuanced (pictured), and to align with human values and preferences. Therefore, it is human nature to let our guard down and take the outputs of machine learning systems at face value. Still, we need to learn how to distinguish low-risk tasks and outputs that are most likely accurate from high-risk situations and applications that require information at the forefront of knowledge or that involve nuanced reasoning. But it can be increasingly difficult to separate truthfulness from human-like empathic communication, as shown by a cross-sectional study that compared physician and ChatGPT responses to patient questions: the chatbot’s responses were more empathetic and of higher quality8.

Moreover, as content production becomes cheaper, the mass production of content that distorts the truth, or that exposes people to or amplifies online harms for malicious purposes, is unfortunately unavoidable. Malicious parties can also fine-tune open-source models, or train models through illicit means. At the very least, this is a threat to healthy public debate. Further research into the safety assessment of machine learning systems, the development of safety standards, and the design of governance and regulatory frameworks are expected to facilitate the thoughtful adoption and implementation of machine learning, and to keep expert-curated content from being overwhelmed by false and unreliable information.

Yet even in a hypothetical future with widely trusted machine learning systems and with internationally agreed regulatory standards and policing, how can we ensure that these systems are robust and promote fairness?

The robustness of pre-trained LLMs to unexpected inputs, especially adversarial attacks and “out-of-distribution” inputs (that is, inputs from data belonging to a different distribution or domain than the training data), can be improved through continual training and prompt engineering9. More generally, the robustness of the performance of machine learning models in out-of-distribution settings can be improved by domain generalization learning, in which the model learns representations that capture invariant concepts and patterns shared across domains, and by causal representation learning, in which the model learns causal relationships between variables rather than simply capturing correlations. Another strategy, reported by Shekoofeh Azizi, Alan Karthikesalingam, Vivek Natarajan and colleagues in this issue of Nature Biomedical Engineering, combines pre-training via supervised transfer learning (from natural images to medical images), domain-specific contrastive self-supervised learning (a type of unsupervised learning that exploits similarities and differences in the data) and task-specific fine-tuning. The researchers show that this approach performs well across multiple domains, tasks and datasets in diagnostic imaging (dermatology, ophthalmology, digital pathology, chest radiography and mammography).

Fairness spans the realms of justice, morality and ethics. In medicine and healthcare, it concerns the minimization of health disparities; for algorithms used in diagnosis, fairness can be quantified by differences between patient groups in performance indicators such as false positive and false negative rates. In a Perspective also published in this issue of the journal, Faisal Mahmood and colleagues review healthcare disparities and inequalities, and discuss biases in machine learning models for medicine and healthcare (which can arise from variability in the acquisition and labeling of training data). Unintended dataset shifts and the effects of health correlates such as genetic ancestry and socioeconomic status can be mitigated through federated learning (distributed machine learning that maintains data privacy and security), representation learning and model explainability.
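To make the idea of quantifying fairness by differences in error rates concrete, here is a minimal sketch in Python. The function names, the toy labels and the two groups "A" and "B" are illustrative assumptions rather than anything taken from the Perspective, and the largest-minus-smallest gap is only one of several possible summaries.

```python
from collections import defaultdict


def error_rates_by_group(y_true, y_pred, groups):
    """Return {group: (false_positive_rate, false_negative_rate)}.

    Labels are assumed to be binary: 1 = disease present, 0 = absent.
    """
    counts = defaultdict(lambda: {"fp": 0, "fn": 0, "neg": 0, "pos": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        c = counts[group]
        if truth == 1:
            c["pos"] += 1
            if pred == 0:
                c["fn"] += 1  # missed a true case
        else:
            c["neg"] += 1
            if pred == 1:
                c["fp"] += 1  # flagged a healthy case
    rates = {}
    for group, c in counts.items():
        fpr = c["fp"] / c["neg"] if c["neg"] else float("nan")
        fnr = c["fn"] / c["pos"] if c["pos"] else float("nan")
        rates[group] = (fpr, fnr)
    return rates


def fairness_gaps(rates):
    """Largest between-group difference in FPR and in FNR."""
    fprs = [fpr for fpr, _ in rates.values()]
    fnrs = [fnr for _, fnr in rates.values()]
    return max(fprs) - min(fprs), max(fnrs) - min(fnrs)


# Hypothetical toy data: four patients in each of two groups.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = error_rates_by_group(y_true, y_pred, groups)
fpr_gap, fnr_gap = fairness_gaps(rates)
print(rates, fpr_gap, fnr_gap)
```

In this toy example the classifier misses cases only in group A and raises false alarms only in group B, so both gaps are non-zero even though overall accuracy is the same for the two groups.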

As Synho Do et al. argue in a Perspective in this issue, the unfair and unsafe outcomes of prediction failures by machine learning models can be mitigated by implementing metrics of prediction uncertainty. Prediction uncertainty usually arises from a lack of generalizability to out-of-distribution settings, or from low-quality or noisy training data (owing to difficulties or imperfections in labeling and annotation). Appropriate metrics depend on the model’s architecture and its application. For example, a diagnostic task with zero tolerance for false negatives would require a negative predictive value of 1. A research article by Dani Kiyasseh, Andrew Hung and colleagues, also included in this issue, provides another example. In developing a vision transformer for decoding surgeon activity from surgical videos, the researchers estimated the uncertainty in the classification of surgeon gestures as the entropy of the probabilistic outputs of multiple trained models. Also, for current LLMs, investigating how they answer questions about the known and the unknown may suggest ways to measure the models’ accuracy in expressing uncertainty10. But for the unknown unknowns, any danger is quite uncertain.
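The two metrics mentioned above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method of either paper: the ensemble is assumed to be summarized by averaging each model's class probabilities before taking the Shannon entropy, and the function names and toy numbers are hypothetical.

```python
import math


def predictive_entropy(member_probs):
    """Shannon entropy (in nats) of the class distribution averaged
    over ensemble members. Averaging first is an assumption here."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    mean = [sum(p[k] for p in member_probs) / n_members
            for k in range(n_classes)]
    return -sum(q * math.log(q) for q in mean if q > 0)


def negative_predictive_value(y_true, y_pred):
    """TN / (TN + FN): the fraction of negative calls that are correct."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tn / (tn + fn) if (tn + fn) else float("nan")


# Three hypothetical models scoring one input over two classes.
agree = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]
disagree = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
h_agree = predictive_entropy(agree)
h_disagree = predictive_entropy(disagree)

# No positive case is called negative, so the NPV is 1.
npv_safe = negative_predictive_value([0, 0, 1, 0], [0, 0, 1, 1])
# One positive case is missed, so the NPV drops below 1.
npv_miss = negative_predictive_value([0, 0, 1, 0], [0, 0, 0, 1])
print(h_agree, h_disagree, npv_safe, npv_miss)
```

When the models disagree, the averaged distribution is flatter and the entropy approaches its two-class maximum of ln 2, which is the kind of signal that could be used to flag a prediction for human review.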


