Bilibili taps AI to turn text and audio into video in minutes

TL;DR

  • Bilibili has introduced Codename H, an AI tool that generates video from text or audio in minutes.
  • The platform is targeting rising demand for video podcasts, whose watch time on Bilibili grew 270% year-on-year.
  • Codename H significantly reduces production time and costs and opens the door for more creators.
  • With China's podcast audiences expected to reach 150 million, Bilibili has positioned itself for long-term growth.

Bilibili has launched an ambitious AI tool designed to change how content creators produce podcasts, converting written or spoken words into fully rendered videos in minutes.

The tool, internally called “Codename H,” aims to simplify video production by using AI to generate visuals from text and audio input. For creators who focus on podcasts and educational content in particular, this offers a way to reach new audience segments drawn to visual formats while significantly reducing production time.

“Codename H” can convert a 1,000-word script into a complete video in 6 minutes, and Bilibili says this time will soon be cut in half. The tool also offers a set of templates that let users easily convert traditional podcasts and articles into engaging animated video episodes.
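As a rough back-of-the-envelope illustration of those figures, the sketch below converts the quoted 6 minutes per 1,000 words into a throughput in words per minute and estimates the time for a longer script. It assumes that render time scales linearly with script length and that "cut in half" means 3 minutes per 1,000 words; neither assumption is confirmed by Bilibili, and the function names are purely illustrative.

```python
# Back-of-the-envelope throughput estimate for the quoted figures.
# ASSUMPTION: render time scales linearly with script length.
# ASSUMPTION: "cut in half" means 3 minutes per 1,000 words.

def words_per_minute(words: int, minutes: float) -> float:
    """Throughput implied by rendering `words` in `minutes`."""
    return words / minutes

def render_minutes(words: int, rate_wpm: float) -> float:
    """Estimated render time for a script at a given throughput."""
    return words / rate_wpm

current_rate = words_per_minute(1000, 6)  # quoted figure
halved_rate = words_per_minute(1000, 3)   # assumed future figure

print(f"Current: about {round(current_rate)} words per minute")
print(f"After halving: about {round(halved_rate)} words per minute")
print(f"2,500-word script today: about {round(render_minutes(2500, current_rate))} minutes")
```

Under these assumptions, the quoted figure works out to roughly 167 words per minute, rising to roughly 333 if the time is halved.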

Podcasting enters the visual age

This development puts Bilibili at the forefront of a changing media landscape in which podcasts are becoming increasingly visual. The platform has been actively courting podcast creators this summer, pointing to strong internal figures that reveal a surge in interest.

According to Bilibili, video podcast watch time on the platform reached 25.9 billion minutes in the first quarter of 2025 alone, an increase of more than 270% over the same period last year. More than 40 million users now regularly tune in to the format. This growth reflects a broader trend across the industry, where platforms like YouTube are normalizing the fusion of audio and video in podcasts.

Lowering the barriers to creation

What sets “Codename H” apart is that it changes not only the speed of production but also the economics of creation. Work that previously required days of editing and design can now yield professional-looking videos in minutes, opening the door for creators who don't have the time, budget, or expertise to handle complex video workflows.

Bilibili's move reflects a broader industry shift in which AI not only augments creativity but also automates much of the creative process. The result is a democratization of production and room for a wider range of voices.

The company is backing the tool with support measures aimed at attracting new talent. These include free recording venues and traffic incentives, so that creators are equipped not only to create content but also to distribute it effectively.

Strategic play in a growing market

The timing of the release coincides with forecasts from iResearch Consulting, which predicts China's podcast user base will reach 150 million by 2025.

While audio-only shows continue to decline in relative influence, visually enhanced podcasts have gained favour, especially among younger audiences looking for a more immersive experience. For Bilibili, “Codename H” is more than just a tool. It is a bet on what comes next in media consumption.

As AI continues to evolve, the platform is not only embracing it for backend optimization but also placing it at the heart of its content ecosystem. Bilibili's latest play suggests that the future of podcasting will be more visual, more automated and more accessible than ever before.
