How the New York Times uses custom AI tools to track the ‘manosphere’

Applications of AI


In July 2025, the Department of Justice announced that it would not release any additional files from its investigation into child sex trafficker Jeffrey Epstein. The backlash against the decision was swift, and it even came from unexpected corners of the internet.

A chorus of right-wing commentators and influencers publicly criticized President Donald Trump and his administration for failing to follow through on a campaign promise to release the federal documents. Political podcasters who supported Trump during his re-election campaign were outraged, and social media figures like Joe Rogan and Andrew Schulz publicly pressured the administration to change course.

The New York Times closely tracked this growing discontent across the Republican base for months, through the passage of the Epstein Files Transparency Act, which Congress approved almost unanimously last November. An essential tool for that reporting was a set of AI-generated reports delivered directly to journalists’ email inboxes. The reports surfaced some of the first signs that conservative media outlets were starting to turn against the administration, said Zach Seward, the Times’ editorial director for AI initiatives. (Seward was previously an associate editor at Nieman Lab.)

Built in-house and known internally as Manosphere Reports, the tool uses large language models (LLMs) to transcribe and summarize new episodes of dozens of podcasts.

“The Manosphere report gave us a very quick and clear signal that things were not going well in that part of the president’s base,” Seward said. “There was a direct connection between seeing it and actually covering it.”

Broadly speaking, the “manosphere” includes online communities that promote narrow or patriarchal definitions of masculinity, as well as misogynistic and anti-feminist views. It often overlaps with the MAGA and far-right social media ecosystems. After Donald Trump’s reelection, more coordinated coverage of the manosphere became a priority across the Times.

“To properly cover this administration, it seemed important to focus on influencers, primarily conservative young male influencers, among many other sources,” Seward told me. “It turns out there was enough specific demand and enough broad interest [in the newsroom] that we thought it made sense to automate sending that out.”

Launched a year ago, the Manosphere Report now follows about 80 podcasts hand-picked by Times reporters covering politics, public health, and internet culture. These include right-wing podcasts such as The Ben Shapiro Show; Red Scare, with “Dimes Square” shock jocks Dasha Nekrasova and Anna Khachiyan; and The Clay Travis & Buck Sexton Show, the successor to Rush Limbaugh’s talk radio show. It also monitors Huberman Lab, a podcast hosted by Stanford University neuroscientist Andrew Huberman that has been criticized for spreading health misinformation. Seward noted that the report also includes some liberal-leaning programs, such as the anti-Trump podcast MeidasTouch, which has a predominantly male audience.

When any show releases a new episode, the tool automatically downloads it, transcribes it, and summarizes the transcript. The tool collates these summaries every 24 hours and generates a meta-summary that includes shared talking points and other notable daily trends. The final report is automatically emailed to journalists each morning at 8 a.m. ET. The Times is exploring how to use this workflow to launch similar AI-generated summary reports for other beats.
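The Times has not published the Manosphere Report's code, but the workflow described above — download each new episode, transcribe it, summarize the transcript, then roll the day's summaries into one meta-summary emailed each morning — can be sketched roughly as follows. Everything here is a stand-in: `Episode`, `transcribe`, and `summarize` are invented names, and the stub functions mark where a speech-to-text model and an LLM call would actually run.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Episode:
    """A newly published podcast episode pulled from a feed (hypothetical shape)."""
    show: str
    title: str
    audio_url: str

def transcribe(audio_url: str) -> str:
    # Stand-in for a speech-to-text call (e.g., a Whisper-class model).
    return f"transcript of {audio_url}"

def summarize(text: str) -> str:
    # Stand-in for an LLM summarization prompt over a transcript.
    return f"summary: {text[:200]}"

def daily_report(episodes: list[Episode]) -> str:
    """Summarize each new episode, then collate into one dated meta-report.
    In the real pipeline a second LLM pass would extract shared talking
    points across shows; here we simply join the per-episode summaries."""
    per_episode = [
        f"{ep.show} - {ep.title}: {summarize(transcribe(ep.audio_url))}"
        for ep in episodes
    ]
    header = f"Manosphere report for {datetime.now(timezone.utc):%Y-%m-%d}"
    return "\n".join([header, *per_episode])
```

In a production version, the `daily_report` output would be handed to a scheduler (the article says 8 a.m. ET daily) and an email-delivery step; those pieces are omitted here.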

Seward said the emails alert him when there is a shift in sentiment or rhetoric across the manosphere. Ultimately, it’s up to Times journalists to follow up on the leads they find in the reports and tell the story.

“We never rely [solely] on the summaries. Reporters go back and listen to the actual [podcasts],” Seward said of the AI-generated reports. “This report is basically used as a kind of tip, a hint to take a closer look at something.”

For example, when actress Sydney Sweeney’s American Eagle ad became a flashpoint in the culture wars last summer, Times reporters realized, in part through the reports, that right-wing podcast figures were shaping the backlash. Further analysis revealed that these commentators helped make the issue controversial in the first place: reporters found that podcasters were already talking about a progressive uproar against Sweeney when there were only a few thousand posts about the ad on X.

The Times is not the first news organization to turn to LLMs to parse the troves of audio and video material on the internet that journalists are expected to consume to stay informed. Local news organizations across the country use LLMs to monitor live streams of school board and town hall meetings via email summaries. Last year, my colleague Neil covered Roganbot, a tool created by the AI consulting lab Verso to generate searchable transcripts of The Joe Rogan Experience podcast. Among other features, the tool flags potentially controversial or false statements for fact-checking.

The Manosphere Report was built by the Times’ AI Initiatives team, a small newsroom division launched in 2024. While other major U.S. newsrooms have explored using AI to build chatbots for readers or to help draft and edit stories, the AI Initiatives team has primarily focused on using generative AI for data analysis and investigative reporting. The team has also built tools to scale up more basic LLM use cases, such as transcription and summarization.

Seward said the Manosphere Report is an offshoot of an existing tool called Cheat Sheet.

The tool started as a one-line script on the laptop of Dylan Freedman, a machine learning engineer on the team. Times investigative reporter Jesse Drucker approached Freedman with a list of 10,000 people who had signed up for tax breaks available to residents of Puerto Rico.

“He said, ‘I can’t Google 10,000 people, but of course a machine can,’” Seward said.

Using an LLM, Freedman was able to automatically Google the names, examine the results, and identify people whose financial histories were worth further investigation. The tool could assess signals that someone was noteworthy, such as whether they had a cryptocurrency-related job or had been involved in a lawsuit. The resulting investigation, published in May 2024, revealed widespread abuse of the tax breaks.
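The batch-screening idea — run a search for each name, then judge the results for investigative signals — can be sketched like this. This is not the Times' code: `SIGNALS`, `flag_person`, and `screen` are invented for illustration, and the keyword heuristic stands in for the LLM call that would actually read the search results.

```python
# Signals that a name warrants a closer look (illustrative list only;
# the article mentions cryptocurrency-related jobs and lawsuits).
SIGNALS = ("cryptocurrency", "lawsuit", "hedge fund", "indicted")

def flag_person(name: str, search_snippets: list[str]) -> dict:
    """Score one person's search results. In the real tool an LLM would
    read the snippets; here a keyword match stands in for that judgment."""
    hits = sorted({
        signal
        for snippet in search_snippets
        for signal in SIGNALS
        if signal in snippet.lower()
    })
    return {"name": name, "signals": hits, "worth_a_look": bool(hits)}

def screen(results_by_name: dict[str, list[str]]) -> list[str]:
    """Return the names whose search results tripped at least one signal."""
    return [
        record["name"]
        for record in (flag_person(n, s) for n, s in results_by_name.items())
        if record["worth_a_look"]
    ]
```

The point of the design is triage, not verdicts: the script only shortlists names, and (as with the podcast reports) a reporter still does the actual investigation of each flagged person.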

“That was the first light bulb,” Seward said. From there, the team began experimenting with other applications of LLMs for handling large, messy datasets and file dumps on a case-by-case basis. Many of those applications now live within a single spreadsheet-based tool: reporters can drop datasets into Cheat Sheet and run various preset scripts and prompts, each offered as a menu feature called a “recipe.” Some of those recipes, including transcription and summarization of thousands of hours of video footage, form the basis of the Manosphere Report.
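Cheat Sheet's internals are not public, but the "recipe" pattern described above — named, preset operations a reporter can run over the rows of a dropped-in dataset — is a small registry of functions. A minimal sketch, with invented recipe names and stub bodies standing in for the real LLM prompts:

```python
from typing import Callable

# Registry mapping a recipe's menu name to the function that runs it.
RECIPES: dict[str, Callable[[str], str]] = {}

def recipe(name: str):
    """Decorator that registers a row-level operation under a menu name."""
    def register(fn: Callable[[str], str]) -> Callable[[str], str]:
        RECIPES[name] = fn
        return fn
    return register

@recipe("summarize")
def summarize_row(text: str) -> str:
    # Stand-in for an LLM summarization prompt over one row/cell.
    return "summary: " + text[:100]

@recipe("translate")
def translate_row(text: str) -> str:
    # Stand-in for an LLM translation prompt (e.g., Syrian prison records).
    return "EN: " + text

def run_recipe(name: str, rows: list[str]) -> list[str]:
    """Apply the chosen recipe to every row of a dataset column."""
    return [RECIPES[name](row) for row in rows]
```

The registry design is what makes a spreadsheet front end natural: each registered name becomes a menu item, and adding a new newsroom use case means registering one more function rather than building a new tool.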

Still in beta, Cheat Sheet has already been tested by about 300 newsroom users, of whom Seward said 50 are “really active users.” At least one new project is now created in Cheat Sheet every day. The tool has been used to investigate election-interference groups, transcribe and translate prison records from Syria, and find recent instances of President Trump speaking about January 6. At times, Cheat Sheet has also been used to analyze podcasts’ back catalogs more thoroughly.

Last spring, the Times investigated medical claims made by Dr. Mehmet Oz, who was tapped by the Trump administration to head the Centers for Medicare and Medicaid Services, over his career as a television and social media personality. Using Cheat Sheet, reporters analyzed Oz’s statements across 2,500 media appearances, “Dr. Oz Show” clips, and social media posts. The investigation found that Oz had financial ties to some of the products he promoted on air, including products with little scientific evidence of their health benefits.


In February, Cheat Sheet will be rolled out to all journalists in the Times newsroom, Seward confirmed to Nieman Lab. Staff will learn how to use it in optional training sessions offered this year by the AI Initiatives team.

Like the Manosphere Report, Cheat Sheet is rooted in the philosophy that generating new text and images for publication is not the most effective use of generative AI in a newsroom like the Times. Rather, Seward sees the technology as a way to enhance the newsroom’s existing investigative capabilities.

“Cheat Sheet is replicable, and we hope one day we can open source it. That technology is not a differentiator or competitive advantage for us,” Seward said, arguing that the tool is instead a multiplier on the Times’ beat reporting. “The reason we’re building this is because it helps us double down on our existing competitive advantage, which is that we’re more likely to be the ones leaked 500 hours of recorded video in the first place.”

Photo of the entrance to the New York Times building in Manhattan, New York. Used via Adobe Stock license.


