Summary: Researchers found that AI models often fail to accurately replicate human judgments about rule violations and tend to make harsher judgments. This is attributed to the type of data these models are trained on: the data are often labeled descriptively rather than prescriptively, which leads to different interpretations of rule violations.
This discrepancy can have serious real-world consequences, including harsher court rulings. Researchers therefore propose to improve the transparency of the dataset and match the training context to the deployment context in order to obtain a more accurate model.
Key facts:
- Machine-learning models often don’t exactly replicate human judgments about rule violations and tend to make tougher decisions, according to researchers at MIT and elsewhere.
- This discrepancy arises from the type of data used to train the model. Models trained on descriptively labeled data (identifying factual features) rather than prescriptively labeled data (evaluating rule violations) tend to over-predict rule violations.
- This difference in model performance can have significant real-world implications, including harsher judicial outcomes, which highlights the need to improve dataset transparency and align the training context with the deployment context.
Source: Massachusetts Institute of Technology
In an effort to improve fairness or reduce backlogs, machine-learning models are sometimes designed to mimic human decision-making, such as deciding whether a social media post violates a harmful-content policy.
However, researchers at MIT and others have found that these models often do not replicate human judgments about rule violations. If a model isn’t trained on the right data, it can make different and often harsher decisions than humans.
In this case, the “right” data are those labeled by humans who were explicitly asked whether an item violates a certain rule. Training involves showing a machine-learning model millions of examples of this “normative,” or prescriptive, data so that it can learn the task.
However, the data used to train machine learning models are usually labeled descriptively. That is, humans are asked to identify factual features, for example, whether there is fried food present in a photograph.
When using “descriptive data” to train a model to determine rule violations, such as whether a meal violates a school policy against fried foods, the model tends to over-predict rule violations.
This loss of accuracy can have serious implications in the real world. For example, if a descriptive model were used to decide whether an individual is likely to reoffend, the researchers’ findings suggest it could cast stricter judgments than a human would, which could lead to higher bail amounts or longer prison sentences.
“I think most artificial intelligence/machine-learning researchers assume that the human judgments in data and labels are biased, but this result is saying something worse.
“These models are not even reproducing already-biased human judgments, because the data they are trained on has a flaw: humans would label the features of images and text differently if they knew those features would be used for a judgment.
“This has huge implications for machine-learning systems in human processes,” says Marzyeh Ghassemi, an assistant professor and head of the Healthy ML Group in the Computer Science and Artificial Intelligence Laboratory (CSAIL).
Ghassemi is the senior author of a new paper detailing these findings, which was published today in Science Advances. Joining her on the paper are lead author Aparna Balagopalan, a graduate student in electrical engineering and computer science; David Madras, a graduate student at the University of Toronto; David H. Yang, a former graduate student who is now co-founder of ML Estimation; Dylan Hadfield-Menell, an assistant professor at MIT; and Gillian K. Hadfield, Schwartz Reisman Chair in Technology and Society and professor of law at the University of Toronto.
Label mismatch
This research grew out of another project investigating how machine-learning models can justify their predictions. While collecting data for that study, the researchers noticed that humans sometimes give different answers depending on whether they are asked to provide descriptive or prescriptive labels for the same data.
To collect descriptive labels, researchers ask labelers to identify factual features: does this text contain obscene language? To collect prescriptive labels, researchers give labelers a rule and ask whether the data violate that rule: does this text violate the platform’s explicit-language policy?
Surprised by this finding, the researchers launched a user study to dig deeper. They gathered four datasets to mimic different policies, including a dataset of dog images that could violate an apartment’s rule against aggressive breeds. They then asked groups of participants to provide descriptive or prescriptive labels.
In each case, the descriptive labelers were asked to indicate whether three factual features were present in the image or text, such as whether the dog appeared aggressive. Their responses were then used to derive judgments. (If a participant said a photo contained an aggressive dog, the policy was counted as violated.)
The descriptive labelers did not know the pet policy. Prescriptive labelers, on the other hand, were given the policy banning aggressive dogs and asked whether each image violated that policy, and why.
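The way descriptive answers are turned into judgments can be sketched in a few lines of Python. This is an illustration only, not the researchers’ code: the feature names are hypothetical, and the sketch assumes a violation is recorded whenever any one of the factual features is marked as present.

```python
from typing import Mapping


def violation_from_features(features: Mapping[str, bool]) -> bool:
    """Derive a violation label from one descriptive labeler's answers.

    Assumption: the policy counts as violated if ANY listed factual
    feature is marked present. The labeler never sees the policy itself.
    """
    return any(features.values())


# Hypothetical feature names for the dog-image example.
aggressive_dog = {
    "appears_aggressive": True,
    "is_large_breed": False,
    "is_poorly_groomed": False,
}
calm_dog = {
    "appears_aggressive": False,
    "is_large_breed": False,
    "is_poorly_groomed": False,
}

aggressive_is_violation = violation_from_features(aggressive_dog)
calm_is_violation = violation_from_features(calm_dog)
```

A prescriptive labeler, by contrast, would answer the violation question directly, so no such derivation step would be needed.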
The researchers found that humans were significantly more likely to label an object as a violation in the descriptive setting.
The disparity, which they computed as the absolute difference in mean labels, ranged from 8 percent on a dataset of images used to judge dress code violations to 20 percent for the dog images.
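As a rough illustration of that measure, the sketch below computes the absolute difference in mean labels for two small sets of binary violation labels (1 for a violation, 0 otherwise). The label values are invented toy data, not figures from the study, and the function names are hypothetical.

```python
from typing import Sequence


def mean_label(labels: Sequence[int]) -> float:
    """Fraction of items labeled as violations (labels are 0 or 1)."""
    return sum(labels) / len(labels)


def abs_diff_in_mean_labels(
    descriptive: Sequence[int], prescriptive: Sequence[int]
) -> float:
    """Absolute gap between the two groups' average violation rates."""
    return abs(mean_label(descriptive) - mean_label(prescriptive))


# Toy labels for the same ten items (made up for illustration).
descriptive_labels = [1, 1, 0, 1, 0, 1, 0, 1, 1, 0]   # 6 of 10 flagged
prescriptive_labels = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]  # 4 of 10 flagged

gap = abs_diff_in_mean_labels(descriptive_labels, prescriptive_labels)
print(f"Disparity: {gap:.0%}")
```

With these toy values the descriptive group flags 60 percent of items and the prescriptive group 40 percent, a gap of 20 percentage points.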
“We haven’t definitively tested why this happens, but one hypothesis is that people think differently about rule violations than they think about descriptive data. In general, prescriptive decisions are more lenient,” says Balagopalan.
However, data are typically collected with descriptive labels to train a model for a particular machine-learning task. These data are often reused later to train different models that make prescriptive judgments, such as whether a rule has been violated.
Training troubles
To study the potential impact of reusing descriptive data, the researchers trained two models to judge rule violations using one of their four data settings. They trained one model on descriptive data and the other on prescriptive data, and then compared their performance.
They found that models trained using descriptive data performed worse than models trained to make the same decisions using prescriptive data. Specifically, descriptive models are more likely to misclassify inputs by incorrectly predicting rule violations.
The descriptive model was even less accurate when classifying objects that human labelers disagreed about.
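One way to picture this kind of comparison is to count how often each model wrongly predicts a violation, using human prescriptive labels as the reference. The sketch below is a minimal illustration under that assumption; the predictions are invented toy values, and the study’s actual evaluation may differ.

```python
from typing import Sequence


def false_positive_rate(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Share of true non-violations that a model flags as violations."""
    negatives = sum(1 for a in actual if a == 0)
    false_positives = sum(
        1 for p, a in zip(predicted, actual) if p == 1 and a == 0
    )
    return false_positives / negatives if negatives else 0.0


# Toy data (made up): 1 = violation, 0 = no violation.
human_prescriptive = [0, 0, 1, 0, 1, 0, 0, 1]
descriptive_model = [1, 0, 1, 1, 1, 0, 0, 1]   # trained on descriptive labels
prescriptive_model = [0, 0, 1, 0, 1, 1, 0, 1]  # trained on prescriptive labels

descriptive_fpr = false_positive_rate(descriptive_model, human_prescriptive)
prescriptive_fpr = false_positive_rate(prescriptive_model, human_prescriptive)
print(f"Descriptive model FPR:  {descriptive_fpr:.0%}")
print(f"Prescriptive model FPR: {prescriptive_fpr:.0%}")
```

In this toy case the descriptive model wrongly flags two of the five non-violations, while the prescriptive model wrongly flags one.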
“This shows that data really matters. When training a model to detect rule violations, it is important to match the training context to the deployment context,” says Balagopalan.
It can be very difficult for users to determine how data were collected. This information may be buried in the appendix of a research paper or not disclosed by a private company, Ghassemi says.
Improving dataset transparency is one way to mitigate this problem. If researchers know how data were collected, they know how those data should be used.
Another possible strategy is to fine-tune a descriptively trained model on a small amount of prescriptive data. Known as transfer learning, this idea is something the researchers hope to explore in future work.
They also want to conduct a similar study with expert labelers, such as doctors or lawyers, to see whether it leads to the same label disparity.
“The way to fix this is to transparently acknowledge that if we want to reproduce human judgment, we must only use data that were collected in that setting.
“Otherwise, we are going to end up with systems that have extremely harsh moderation, much harsher than what humans would do. Humans would see nuance or make another distinction, whereas these models don’t,” Ghassemi says.
Funding: This research was funded in part by the Schwartz Reisman Institute for Technology and Society, Microsoft Research, the Vector Institute, and a Canada Research Council Chair.
About this artificial intelligence research news
Author: Adam Zewe
Source: Massachusetts Institute of Technology
Contact: Adam Zewe – MIT
Image: The image is credited to Neuroscience News
Original research: Open access.
“Judging facts, judging norms: Training machine learning models to judge humans requires a modified approach to labeling data” by Marzyeh Ghassemi et al. Science Advances
Abstract
Judging facts, judging norms: Training machine learning models to judge humans requires a modified approach to labeling data
As governments and industry look to expand the use of automated decision-making systems, it is imperative to consider how closely such systems can mimic human judgment.
We identify a core potential failure, finding that annotators label objects differently depending on whether they are being asked a factual question or a normative question.
This challenges a natural assumption maintained in many standard machine learning (ML) data acquisition procedures: that there is no difference between predicting the factual classification of an object and exercising judgment about whether an object violates a rule premised on those facts.
We found that using fact-based labels to train models aimed at making prescriptive judgments resulted in noticeable measurement errors.
We show that models trained using factual labels yield significantly different judgments than models trained using prescriptive labels, and that the impact of this effect on model performance can exceed that of other factors (such as dataset size) that routinely attract the attention of ML researchers and practitioners.
