How fal.ai went from inference optimization to hosting image and video models

When it comes to AI and large language models, many enterprise use cases are text-based, whether that's data analysis or text generation. Increasingly, however, companies are relying on generative media platforms such as Fal.ai to generate images, videos and audio for marketing campaigns, social media posts and more. So in this episode of The New Stack Agents, Alex and I spoke with Burkay Gur, CEO and co-founder of Fal.ai, and Glenn Solomon, managing partner at Notable Capital and lead investor in Fal.ai's latest funding round.

https://www.youtube.com/watch?v=njy7dksyzey

From machine learning infrastructure to generative AI

Today, Fal.ai hosts hundreds of models, including video models such as Wan Pro and Kling, image models such as Flux and audio models such as MiniMax. But when Gur, who previously worked on machine learning at Coinbase, and his co-founder launched Fal.ai in 2021, generative media wasn't even what they were focusing on. Instead, they were trying to optimize Python runtimes to speed up machine learning models.

“We actually thought that many of the really great companies of the past five years were all focused on compute and infrastructure,” Gur said. “We were looking at great companies like Snowflake and Datadog, and we thought the next big trend would happen in compute for machine learning.”

During his time at Coinbase, it had become clear to Gur that the machine learning tools and infrastructure of the time were essentially nonexistent, so the team set out to build them. But then generative AI suddenly took off.

“As soon as we started, the entire space changed incredibly quickly with the releases of DALL-E, DALL-E 2 and ChatGPT,” Gur said. “Essentially, we realized that this huge change in the market would lead to a massive shortage.”

How speed and optimization have become a competitive advantage for fal.ai

These large new models needed GPU power to be useful, however. At the same time, they would have a much broader user base than traditional machine learning models.

The team started with an image model and optimized it as much as possible. “I was pretty obsessed with how to optimize these models and how to run them more efficiently, and that's how I entered this market,” Gur said.

Today, the tooling that was once missing has improved greatly, and in many cases multiple tools compete with each other in the same space. The Fal.ai team also quickly realized that optimizing models alone would not make for a sustainable business in the long run.

As Solomon pointed out, speed is important, but it is only one part of the overall equation. “It's pretty difficult to just be an optimization company,” he said. “There are a lot of optimizations that we do with Fal-hosted models. But the people we spoke to, on both sides of the market where Fal sits, with the model providers on one side and the developers on the other, said that it has the latest and greatest models and offers a great user experience.”

Building for developers and non-technical audiences

Fal.ai offers both web frontends and API access to these models, and its users rely on both, Gur said. “One of the best things we've done, and something we keep reinforcing with our product and engineering teams, is that we're building a developer product, but it's very easy to use even when developers aren't developing. When you're just playing with a model, it's also very easy to experiment with some workflows, without diluting the developer and technical experience,” Gur said. “After all, when you build truly great developer products, you also build great products for non-technical people, and those definitions are still changing.”

As for Fal.ai's own infrastructure, Gur noted that even today, much of the infrastructure that is commonplace for building web-based products does not yet exist for AI workloads. So Fal.ai built not only its own inference engine but also, for example, custom CDNs.

The cost of creation and the cost of creativity

It's difficult to talk about these kinds of generative AI models without getting a bit philosophical, so one question I had for Gur was what he believes the existence of these models will mean for creatives such as photographers and filmmakers. While Gur acknowledged the controversy around this, he also believes that although many people had a visceral reaction to these new technologies a few years ago, things have since calmed down.

“I think people are learning to live with it, especially the artists, the creatives. They actually use these things as tools,” he said. “One of the really cool framings that my co-founder, Gorkem [Yurtseven], talks about is this nuance between the cost of creativity and the cost of creation. And I think that's important to understand. These models simply reduce the cost of creation by an insane amount, like 100,000 times, compared with creating something with VFX or professional video. But the cost of creativity is still high. One can even argue that the cost of creativity will go up, because creation has become so available and ubiquitous.”

You can find the rest of the conversation, including a discussion of the overall generative media market, on YouTube and in your podcast feed.

Before joining The New Stack as its senior editor for AI, Frederic was the enterprise editor at TechCrunch, where he covered everything from the rise of the cloud and the early days of Kubernetes to the advent of quantum computing.

Learn more about Frederic Lardinois
