Lemon Slice raises $10.5 million for AI video avatars

Lemon Slice has raised $10.5 million in seed funding to further develop its AI-generated real-time video avatars, backing its bet that chatbots are gaining faces and expressions.

The round is led by Matrix Partners and Y Combinator, with participation from Dropbox co-founder Arash Ferdowsi, Twitch co-founder Emmett Shear, and the artist-investors behind The Chainsmokers.

The startup's pitch is simple: most AI agents today are text boxes, and the next growth curve will come from agents that look and act like people and characters in live video. Its new model, Lemon Slice-2, turns a single image into a customizable avatar that can answer questions, teach lessons, guide shoppers, and play brand characters in real-time streams.

Inside Lemon Slice-2, the real-time video avatar model

Lemon Slice-2 is a “general-purpose video diffusion transformer” with 20 billion parameters, designed to run efficiently enough to livestream avatars at about 20 frames per second on a single GPU, the company says. Pairing that scale with that speed points to heavy inference optimization, and it underpins the bet that an avatar can respond without an awkward pause.
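To put those figures in perspective, here is a rough back-of-the-envelope sketch. The 20 frames per second and 20 billion parameters come from the company's claims above; the bytes-per-parameter values are assumptions chosen for illustration, since the article does not say what numeric precision the model runs at.

```python
# Back-of-the-envelope numbers for a real-time avatar model.
# The 20 fps and 20-billion-parameter figures are from the article;
# the bytes-per-parameter values below are illustrative assumptions.

def frame_budget_ms(fps: float) -> float:
    """Average time available to produce one frame at a given frame rate."""
    return 1000.0 / fps


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


FPS = 20
PARAMS = 20e9

budget = frame_budget_ms(FPS)
fp16_gb = weight_memory_gb(PARAMS, 2)  # assumption: 16-bit weights
int8_gb = weight_memory_gb(PARAMS, 1)  # assumption: 8-bit quantized weights

print(f"Average per-frame budget at {FPS} fps: {budget:.0f} ms")
print(f"Weights at 16-bit: {fp16_gb:.0f} GB, at 8-bit: {int8_gb:.0f} GB")
```

Under these assumptions, the model has about 50 ms on average per frame, and the weights alone would occupy roughly 40 GB at 16-bit precision. The 50 ms figure is a throughput average; a model that generates frames in chunks could spend longer on each chunk while still sustaining 20 frames per second.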

The model is available through an API that developers can plug into their applications, or through a drop-in widget for teams. From a single uploaded image, users can switch backgrounds, styling, and clothing, and move between human-like and non-human characters without further training. Text-to-speech is provided by ElevenLabs, and the company says its avatars can draw on external knowledge bases so that what appears on screen is contextually relevant and current rather than mere lip-syncing.

The company's founders, Lina Colucci, Sidney Primus, and Andrew Weitz, started the business in 2024 with the belief that existing avatars start to feel creepy after just a few seconds of interaction. Their bet is that a single model trained end to end will produce more natural microexpressions, eye gaze, and temporal continuity than pipelines that stitch disparate parts together.

Why investors are interested in live AI video avatars

Proponents point to the convergence of two forces: the “bitter lesson” that general methods powered by more data and compute tend to win out in AI, and consumer behavior that is already shifting toward video first. Matrix's Ilya Sukhar argues that general-purpose AI approaches will outperform bespoke systems in each modality, and predicts the same dynamic will hold as video avatars scale.

Y Combinator's Jared Friedman also made the case for a diffusion-style video model to bridge the gap between photorealistic output and interactivity, a central friction that has held back the adoption of avatars in support, training, and creator tools. If avatars look convincing and are fully responsive, the argument goes, they may finally pass a practical version of the “Avatar Turing Test” for everyday use.

Crowded arena with clear stakes in AI video avatars

Competition is fierce. With the rise of video generation platforms like D-ID and HeyGen, talking-head videos have become a go-to format for marketing and training. Avatar-centric players like Genies, Soul Machines, Praktika, and AvatarOS are chasing branded characters and coaching assistants. Lemon Slice's main differentiator is a general-purpose model built for live interactive video (not batch-rendered clips) that can animate both human and non-human faces from a single image.

The timing may be advantageous. Sandvine's Global Internet Phenomena Report estimates that video accounts for more than 65% of downstream internet traffic, while education research shows that users significantly prefer video explanations over dense text. Combining that preference with agentic AI could fold customer support, tutoring, and product discovery into more conversational on-screen flows.

The company is already working on early use cases such as language learning, e-commerce coaching, corporate training, and interactive education. Brands can put informed guides on screen, while educators can give personalized feedback through animated instructors. (One company was concerned that its mascot was not human; the platform can animate such a character directly rather than substituting a mismatched human stand-in.)

Guardrails, personal information at risk, and compliance

As synthesis grows more realistic, the risk of misuse grows with it. Lemon Slice says it has built in checks to prevent unauthorized cloning of faces and voices, and that it uses large language models to moderate content in real time. These controls will come under scrutiny as more regulators tighten expectations around biometric consent, deepfake disclosure, and provenance, all issues raised by groups such as Partnership on AI and debated in many jurisdictions.

Beyond policy, the technical bar is also rising. Achieving low latency on general-purpose GPUs is only part of the problem; avatars still need accurate lip-sync, convincing eye contact, and a consistent identity from frame to frame. When avatars are used in regulated categories such as health coaching or financial services triage, benchmarks for temporal consistency and for fidelity to scripts or generated text become as important as pure photorealism.
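As one illustration of what a frame-to-frame identity benchmark could look like, the sketch below scores a clip by the lowest cosine similarity between the first frame's face embedding and any later frame's. This is a hypothetical metric, not one the company or the article describes; the function names are invented for this example, and the embeddings are assumed to come from some face-embedding model that is not specified here.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def identity_consistency(embeddings: Sequence[Sequence[float]]) -> float:
    """Hypothetical metric: worst-case similarity to the first frame.

    A value near 1.0 means every frame stays close to the reference
    identity; lower values mean at least one frame has drifted.
    """
    if len(embeddings) < 2:
        raise ValueError("need at least two frames")
    reference = embeddings[0]
    return min(cosine_similarity(reference, e) for e in embeddings[1:])


# Toy two-dimensional embeddings standing in for real face embeddings.
stable_clip = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
drifting_clip = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

print(identity_consistency(stable_clip))
print(identity_consistency(drifting_clip))
```

Taking the minimum rather than the mean is a deliberate choice in this sketch: a single badly drifted frame is what a viewer notices, and an average would hide it.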

How funding will power Lemon Slice's next phase

The startup, now a team of eight, plans to use the seed round to hire engineering and go-to-market staff, and to budget for the compute needed to train and expand its model. Its API-first approach signals an ecosystem play: developers embed avatars in their apps with a single line of code, and features (language coverage, emotional control, latency reduction) are then improved based on real-world feedback.

If Lemon Slice can keep avatars responsive at 20 frames per second even on modest hardware, while making facial expressions look human and keeping safety controls robust, it could establish itself in the burgeoning interactive agent market. The next generation of AI agents may not only answer your questions but also make eye contact and react to what you say.


