As regulators and providers grapple with the dual challenge of protecting young social media users from harassment and bullying while also taking steps to protect their privacy, a team of researchers from four major universities has proposed a way to use machine learning technology to flag risky conversations on Instagram without eavesdropping on them. The finding could open up opportunities for platforms and parents to protect vulnerable young users while preserving their privacy.
A team led by researchers from Drexel University, Boston University, Georgia Tech, and Vanderbilt University recently published its timely findings in the Proceedings of the ACM on Human-Computer Interaction. The study examined which types of data input, such as metadata, text, and image features, are most useful for machine learning models to identify risky conversations. Their findings suggest that risky conversations can be detected from metadata characteristics alone, such as how long a conversation is and how engaged the participants are.
Their work addresses a growing problem on the social media platform most popular among Americans ages 13 to 21. Recent studies have shown that harassment on Instagram is contributing to a dramatic increase in depression among its youngest users, particularly mental health and eating disorder problems among teenage girls.
“Platforms like Instagram are popular among young people precisely because they make users feel safe enough to connect with others in a very open way. But that openness is concerning, considering what we currently know about the prevalence of harassment, abuse, and bullying by malicious users,” said study co-author Afsaneh Razi, PhD, of Drexel University.
At the same time, the Cambridge Analytica scandal and precedent-setting privacy laws in the European Union have put platforms under increasing pressure to protect their users’ privacy. As a result, Meta, which operates Facebook and Instagram, is rolling out end-to-end encryption of all messages on its platforms, which means the content of a message is technically secured and can only be accessed by the people in the conversation.
However, this additional layer of security also makes it more difficult for the platforms to use automated technology to detect and prevent online risks, which is why the group’s system could play an important role in protecting users.
“One way to address this surge of bad actors, at a scale that can protect vulnerable users, is automated risk-detection programs,” Razi said. “But the challenge is designing them in an ethical way, so that they are accurate but also do not intrude on users’ privacy.”
The system developed by Razi and her colleagues uses a layered set of machine learning algorithms that combines a metadata profile of risky conversations, which, for example, tend to be short and one-sided, with contextual cues such as whether images or links were sent. In tests, the program accurately identified risky conversations 87% of the time using only these sparse and anonymous details.
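As a rough sketch of what metadata-only detection could look like in code (the feature names, data format, and model choice below are illustrative assumptions, not the authors’ actual pipeline), a classifier can be trained on a handful of content-free, per-conversation statistics:

```python
# Minimal sketch of metadata-only risk detection (illustrative, not the study's code).
# Each conversation is reduced to a few anonymous statistics; no message content is read.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def metadata_features(convo):
    """Map one conversation record to a sparse, content-free feature vector (hypothetical fields)."""
    return [
        convo["num_messages"],                   # total messages exchanged
        convo["num_participants"],               # how many accounts took part
        convo["avg_message_length"],             # mean characters per message
        convo["avg_response_time_s"],            # average seconds between replies
        convo["num_images_sent"],                # count of image attachments
        int(convo["users_follow_each_other"]),   # mutual connection on the platform
    ]

def train_metadata_classifier(conversations, labels):
    """conversations: list of dicts as above; labels: 1 = unsafe, 0 = safe."""
    X = [metadata_features(c) for c in conversations]
    X_train, X_test, y_train, y_test = train_test_split(
        X, labels, test_size=0.2, stratify=labels, random_state=0
    )
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X_train, y_train)
    print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
    return model
```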
To train and test the system, the researchers collected and analyzed more than 17,000 private chats, comprising more than 4 million messages in total, from 172 Instagram users ages 13 to 21 who volunteered their conversations in support of the study. Participants were asked to review their conversations and label each one as “safe” or “unsafe.” About 3,300 of the conversations were flagged as “unsafe” and were further classified into one of five risk categories: harassment, sexual message/solicitation, nudity/pornography, hate speech, and sale or promotion of illegal activities.
Using a random sampling of conversations from each category, the team used several machine learning models to determine which of a set of metadata features (average conversation length, number of users involved, number of messages sent, response time, number of images sent, and whether the participants follow one another on Instagram) were most closely associated with risky conversations.
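A sketch of how such a comparison might be run, again using hypothetical feature names and off-the-shelf scikit-learn models rather than the published analysis, could score a couple of standard classifiers and rank the metadata features by importance:

```python
# Sketch: compare simple models on metadata features and rank which features matter most.
# Assumes X is a feature matrix built as in the earlier snippet and y holds safe/unsafe labels.
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

FEATURE_NAMES = [
    "num_messages", "num_participants", "avg_message_length",
    "avg_response_time_s", "num_images_sent", "users_follow_each_other",
]

def compare_models_and_rank_features(X, y):
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    }
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5)   # 5-fold cross-validated accuracy
        print(f"{name}: mean accuracy {scores.mean():.2f}")
    forest = models["random_forest"].fit(X, y)
    ranked = sorted(zip(forest.feature_importances_, FEATURE_NAMES), reverse=True)
    for importance, feature in ranked:                # most informative features first
        print(f"{feature}: {importance:.2f}")
```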
These data allowed the team to create a program that can operate using only metadata, the kind of information that remains available even when Instagram conversations are end-to-end encrypted.
“Overall, our findings present interesting opportunities for future research, with implications for industry as a whole,” the team reported. “First, performing risk detection based solely on metadata features allows for lightweight detection methods that do not require the costly computation involved in analyzing text and images. And developing a system that does not analyze message content mitigates some of the privacy and ethical issues that arise in this area, while still ensuring that users are protected.”
To make the program even more effective, and able to identify the specific type of risk when users or parents choose to share additional details of a conversation for safety purposes, the team performed a similar machine learning analysis of linguistic cues and image features, using the same dataset.
In this case, the machine learning programs combed through the text of conversations that users had flagged as unsafe and identified the common words and combinations of phrases in risky conversations that could be used to flag them.
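One common way to build that kind of linguistic profile, shown here only as a sketch using an off-the-shelf bag-of-words approach (the authors’ actual text models may differ), is to vectorize each conversation’s text into word and phrase counts and train a classifier on conversations already labeled safe or unsafe:

```python
# Sketch: flag conversations from linguistic cues using word/phrase (n-gram) features.
# conversation_texts: one concatenated string of messages per conversation (hypothetical input).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train_text_classifier(conversation_texts, labels):
    """labels: 1 = unsafe, 0 = safe."""
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), min_df=2),  # single words and two-word phrases
        LogisticRegression(max_iter=1000),
    )
    model.fit(conversation_texts, labels)
    return model

def words_most_associated_with_risk(model, top_k=20):
    """List the n-grams whose weights most strongly push a conversation toward 'unsafe'."""
    vectorizer = model.named_steps["tfidfvectorizer"]
    classifier = model.named_steps["logisticregression"]
    vocab = vectorizer.get_feature_names_out()
    weights = classifier.coef_[0]
    top = sorted(zip(weights, vocab), reverse=True)[:top_k]
    return [term for _, term in top]
```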
Because images and videos are central to communication on Instagram, the team used a suite of programs to analyze them: one that can identify and extract any text overlaid on an image or video, and another that can examine each image or video and generate a descriptive caption. Then, using a similar text analysis, the machine learning programs profiled the words that signaled images and videos shared in risky conversations.
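A sketch of that two-program image pipeline might use off-the-shelf tools such as pytesseract for text overlaid on images and a Hugging Face image-captioning model; these particular libraries are assumptions made for illustration, not necessarily what the researchers used:

```python
# Sketch: turn shared images into text so the same linguistic analysis can be applied.
# Assumes pytesseract (with the Tesseract binary), transformers, and Pillow are installed.
from PIL import Image
import pytesseract
from transformers import pipeline

# Off-the-shelf captioning model; the study's exact tooling may differ.
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

def describe_image(path):
    """Return any overlaid text plus a generated caption for one shared image."""
    image = Image.open(path).convert("RGB")
    overlaid_text = pytesseract.image_to_string(image)   # text printed on the image
    caption = captioner(image)[0]["generated_text"]      # one-sentence description
    return f"{overlaid_text.strip()} {caption.strip()}"

# The resulting strings can then be fed into the same kind of text classifier used for
# message content, e.g. train_text_classifier(image_descriptions, labels) from the earlier sketch.
```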
A machine learning system trained on these risky-conversation profiles was then tested against a random sampling of conversations from the larger dataset that had not been used to generate the profiles or train the system. By combining the metadata characteristics with the analysis of linguistic cues and image features, the program was able to identify risky conversations with an accuracy of 85%.
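A held-out evaluation of a combined model could be sketched as follows, assuming the hypothetical metadata and text inputs from the earlier snippets:

```python
# Sketch: combine metadata features with text-derived features and score on held-out conversations.
import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def evaluate_combined_model(metadata_matrix, conversation_texts, labels):
    """metadata_matrix: per-conversation stats; conversation_texts: messages plus image descriptions."""
    meta = np.asarray(metadata_matrix, dtype=float)
    y = np.asarray(labels)
    idx_train, idx_test = train_test_split(
        np.arange(len(y)), test_size=0.2, stratify=y, random_state=0
    )
    # Fit the text vectorizer on training conversations only to avoid leaking test data.
    vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=2)
    text_train = vectorizer.fit_transform([conversation_texts[i] for i in idx_train])
    text_test = vectorizer.transform([conversation_texts[i] for i in idx_test])
    X_train = hstack([csr_matrix(meta[idx_train]), text_train])
    X_test = hstack([csr_matrix(meta[idx_test]), text_test])
    model = RandomForestClassifier(n_estimators=300, random_state=0)
    model.fit(X_train, y[idx_train])
    # Fraction of held-out conversations labeled correctly.
    return accuracy_score(y[idx_test], model.predict(X_test))
```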
“While metadata provides broad cues about conversations that could be risky for adolescents, detecting and responding to specific types of risk requires the use of linguistic cues and image data,” they reported. “This finding raises important philosophical and ethical questions in light of Meta’s recent move toward end-to-end encryption, since such content-based cues would be useful to a well-designed, AI-assisted risk mitigation system.”
The researchers acknowledge that the study was limited by looking only at messages on Instagram, but they note that the system could be adapted to analyze messages on other platforms that are moving to end-to-end encryption. They also point out that continued training with larger samples of messages could further improve the program’s accuracy.
But the study shows that effective automated risk detection is possible and that, while protecting privacy is a legitimate concern, there are ways to make progress, and they argue that these steps should be pursued to protect the most vulnerable users of these popular platforms.
“Our analysis provides an important first step toward enabling automated, machine learning-based detection of online risk behaviors in the future,” they wrote. “While our system is based on reactive features of conversations, our study is likely to translate well to the real world given its rich ecological validity, and it also paves the way for more proactive approaches to risk detection.”
This research was funded by the US National Science Foundation and the William T. Grant Foundation.
Shiza Ali, Chen Ling, and Gianluca Stringhini of Boston University; Seunghyun Kim and Munmun De Choudhury of Georgia Tech; and Ashwaq Alsoubai and Pamela J. Wisniewski of Vanderbilt University contributed to this research.
Read the full paper here: https://dl.acm.org/doi/10.1145/3579608
