How AWS built AI-powered live vertical video capabilities for Fox Sports using AWS Elemental Inference

For media companies, quickly distributing highlights to social platforms is a competitive advantage that directly drives viewership and monetization. With hundreds of live events across multiple leagues, many organizations lack the resources to manually crop and reformat key moments from every game. Big moments will still happen, but without a scalable way to capture and deliver them, the opportunity to engage fans and grow an audience is lost. AWS Elemental Inference addresses this challenge, allowing media companies to capture the excitement of the moment and share it with fans as it happens.

Fox Sports collaborated with AWS to use AWS Elemental Inference, a fully managed AI service that automatically detects key moments and transforms live and on-demand broadcasts into vertical video content optimized for any platform. Together with AWS Elemental MediaLive, AWS Elemental MediaPackage, and AWS Elemental MediaConvert, this new service was used to automatically detect key moments from live sports broadcasts, extract video clips, reformat them into the 9:16 vertical aspect ratio used by social platforms, and display them on a review portal. All of this happens within seconds of the on-air moment occurring. This post describes the prototype solution that AWS built, including the architecture, key design decisions, and business logic behind each component.

Solution overview

The traditional workflow for digital content creators working on live sports includes monitoring the broadcast across multiple screens, manually identifying key moments, clipping, reformatting for vertical social platforms, and distributing to various channels. This solution consolidates that workflow into a single automated experience. Throughout the broadcast, video clips of key moments automatically appear on the web portal. Each clip is detected, extracted, and verticalized within 20-30 seconds of the on-air moment occurring. From there, editors can review, search, tag, edit, and distribute finished content to social channels in one integrated application.

Three key components make this experience possible:

  1. A user-facing web application provides an interface for reviewing, editing, and distributing content.
  2. The automatic clip collection pipeline handles the discovery-to-clip lifecycle.
  3. The video editing and download pipeline supports post-processing workflows such as trimming, stitching, and MP4 export.

This solution combines AWS Elemental services for media-specific processing with a serverless backend for orchestration and storage. AWS Elemental MediaLive ingests and encodes live broadcasts. AWS Elemental MediaPackage V2 prepares live streams for clip extraction by segmenting content and maintaining a rolling window of recent footage that the system can harvest on demand. AWS Elemental Inference analyzes video in real time, detects key moments, and produces vertically cropped output for social platforms. AWS Elemental MediaConvert handles video transcoding for editing and download workflows. Serverless components (AWS Lambda, Amazon EventBridge, Amazon Simple Queue Service (SQS), Amazon DynamoDB, and Amazon Simple Storage Service (S3)) tie everything together with event-driven processing and durable storage.

Web portal experience

The web portal provides digital content creators with a single workspace to manage the entire lifecycle of live sports highlights. Operators create live events from the home screen and watch incoming clips appear in near real-time. Clips come with AI-generated tags and descriptions, making it easy to identify and categorize moments. Editors can filter clips by event, search across metadata, and add custom tags to quickly and intuitively find the right clips to share across social platforms.

Selecting a clip opens a modal for instant time-shifted playback directly from the source stream. The built-in video editor provides a visual timeline that displays individual HLS segments and allows editors to perform trim, split, delete, and merge operations directly in the browser. Each edit creates a new version of the asset, so the original clip is always preserved. Editors can iterate through multiple versions without risking overwriting the source material.

Example Video Editor View with Time-Shifted Playback – Courtesy of Fox Sports

The Reel Builder allows operators to select clips from different events and time periods, arrange them in the desired order, and trigger the generation of highlight reels. Real-time status tracking in the interface lets operators follow multiple reels as they are processed simultaneously. When content is ready for distribution, editors can batch trigger MP4 downloads for any combination of clips and reels and share the resulting download links with colleagues.

Each clip and reel also includes a feedback form where editors can submit ratings and comments. This feedback is stored against assets in the backend, giving the team a structured way to understand editorial preferences and iterate on output quality over time.

Reel builder working example

Clip collection pipeline

During a live broadcast, AWS Elemental Inference continuously analyzes the video stream and detects key moments within single-digit seconds. Each detection sends an event to Amazon EventBridge. The event includes a PTS (presentation timestamp) value, descriptive tags such as “goal” and “celebration,” and an AI-generated description of the moment. This event triggers the harvest pipeline to extract the corresponding video segments from the live stream through AWS Elemental MediaPackage V2 and store them in Amazon S3.

The following is an example of an event emitted by AWS Elemental Inference.

{
    "version": "0",
    "id": "94277336-b607-61f6-4bb2-2f21453862a5",
    "detail-type": "Highlight Metadata Generated",
    "source": "aws.elemental-inference",
    "account": "123456789012",
    "time": "2025-11-01T02:22:09Z",
    "region": "us-west-2",
    "resources": [
        "arn:aws:elemental-inference:us-west-2:123456789012:feed/abcdefg1234567890"
    ],
    "detail": {
        "timescale": 90000,
        "startPts": 158576733129000,
        "endPts": 158576733849000,
        "description": "Goalkeeper makes a diving save to deny a close-range shot",
        "tags": [
            "save"
        ],
        "callbackMetadata": "highlight-metadata"
    }
}
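
The PTS fields in the event above are expressed in ticks rather than seconds. The following is a minimal Python sketch of how a consumer might convert them, assuming `timescale` is the number of ticks per second (an interpretation of the field name, not something the sample event states). The `Highlight` class and `parse_highlight` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Highlight:
    """Hypothetical container for the fields in the event's `detail` object."""
    start_pts: int
    end_pts: int
    timescale: int
    description: str
    tags: tuple

    @property
    def duration_seconds(self) -> float:
        # Assumption: `timescale` is ticks per second, so ticks / timescale = seconds.
        return (self.end_pts - self.start_pts) / self.timescale


def parse_highlight(event: dict) -> Highlight:
    """Extract the highlight fields from an EventBridge-style event dictionary."""
    detail = event["detail"]
    if detail["endPts"] <= detail["startPts"]:
        raise ValueError("endPts must be greater than startPts")
    return Highlight(
        start_pts=detail["startPts"],
        end_pts=detail["endPts"],
        timescale=detail["timescale"],
        description=detail["description"],
        tags=tuple(detail.get("tags", [])),
    )


# Values copied from the sample event above.
sample = {
    "detail": {
        "timescale": 90000,
        "startPts": 158576733129000,
        "endPts": 158576733849000,
        "description": "Goalkeeper makes a diving save to deny a close-range shot",
        "tags": ["save"],
    }
}
highlight = parse_highlight(sample)
# 720000 ticks / 90000 ticks per second = 8.0 seconds
print(highlight.duration_seconds)
```

Under that assumption, the sample highlight spans 8 seconds of video, which is the window the harvest step would need to extract.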

After the harvest job completes, an AWS Lambda function validates that the extracted video contains actual content. Empty or failed harvests are automatically cleaned up. If the harvest is successful, the clip metadata in DynamoDB is updated with the final status and the end-to-end harvest time.
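
The post-harvest check can be sketched as a small pure function. This is an illustrative sketch only: the post does not say how "actual content" is determined, so the rule used here (at least one non-empty media segment) is an assumption, as are the function name `evaluate_harvest`, the status strings, and the list of segment file suffixes.

```python
from datetime import datetime, timezone

# Assumed segment file suffixes; the real pipeline may use different ones.
MEDIA_SUFFIXES = (".ts", ".mp4", ".m4s")


def evaluate_harvest(job_succeeded: bool, objects: dict,
                     detected_at: datetime, completed_at: datetime) -> dict:
    """Decide whether a harvested clip is usable.

    `objects` maps each stored object key to its size in bytes.
    """
    has_media = any(
        key.endswith(MEDIA_SUFFIXES) and size > 0
        for key, size in objects.items()
    )
    if not job_succeeded or not has_media:
        # Empty or failed harvest: mark every stored object for cleanup.
        return {"status": "FAILED", "cleanup_keys": sorted(objects)}
    return {
        "status": "READY",
        "cleanup_keys": [],
        # End-to-end time from detection to harvest completion.
        "harvest_seconds": (completed_at - detected_at).total_seconds(),
    }


detected = datetime(2025, 11, 1, 2, 22, 9, tzinfo=timezone.utc)
completed = datetime(2025, 11, 1, 2, 22, 31, tzinfo=timezone.utc)
result = evaluate_harvest(
    True,
    {"clip/index.m3u8": 412, "clip/seg_001.ts": 1_204_332},
    detected,
    completed,
)
print(result["status"], result["harvest_seconds"])  # READY 22.0
```

In a deployed version, the returned record would be written to DynamoDB and the cleanup keys deleted from S3; both steps are left out here to keep the sketch self-contained.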

Service diagram showing harvest pipeline

As the sample event shows, AWS Elemental Inference events contain rich metadata but do not inherently identify which live event or game the highlight belongs to. This solution addresses this with an “active event” pattern. Operators activate one event at a time from the web portal, and each time a new highlight arrives, the harvest pipeline queries DynamoDB for the currently active event. Incoming clips are stamped with the ID and name of that event. This design is intentionally simple and matches real-world workflows where operators are always focused on a specific broadcast.
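
The "active event" pattern can be illustrated with a short sketch. Here a list of dictionaries stands in for the DynamoDB table, and the names `activate_event`, `stamp_clip`, `eventId`, and `eventName` are hypothetical; the actual table schema is not described in this post.

```python
def activate_event(events: list, event_id: str) -> None:
    """Mark one event as active and deactivate all others."""
    if not any(event["id"] == event_id for event in events):
        raise KeyError(f"unknown event: {event_id}")
    for event in events:
        event["active"] = event["id"] == event_id


def stamp_clip(clip: dict, events: list) -> dict:
    """Return a copy of the clip tagged with the currently active event."""
    active = [event for event in events if event.get("active")]
    if len(active) != 1:
        raise LookupError(f"expected exactly one active event, found {len(active)}")
    event = active[0]
    return {**clip, "eventId": event["id"], "eventName": event["name"]}


# Stand-in for the events table (illustrative data).
events = [
    {"id": "evt-1", "name": "Saturday Match A", "active": False},
    {"id": "evt-2", "name": "Saturday Match B", "active": False},
]
activate_event(events, "evt-2")
clip = stamp_clip({"tags": ["save"]}, events)
print(clip["eventName"])  # Saturday Match B
```

Raising an error when no event (or more than one) is active is one possible choice; a pipeline could equally store such clips unassigned for later review.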

Video editing and download pipeline

Both the video editing and MP4 download workflows follow the same asynchronous processing pattern, but use separate queues and Lambda functions to avoid resource contention. In either case, a user action in the portal triggers an API call that places a job on an SQS queue. A processing Lambda function picks up the job, performs the video operations, stores the output in S3, and updates the job record in DynamoDB. Users can monitor progress from the portal and stream or download results once processing is complete.
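
The asynchronous job pattern described above can be sketched with in-memory stand-ins: `queue.Queue` objects in place of the two SQS queues and a dictionary in place of the DynamoDB job table. The function names, status strings, and payload shape are assumptions made for illustration, not the prototype's actual interface.

```python
import uuid
from queue import Queue

# One queue per workflow, so edits and downloads do not compete for workers.
QUEUES = {"edit": Queue(), "download": Queue()}
JOBS = {}  # stand-in for the job table


def submit_job(kind: str, payload: dict) -> str:
    """API side: record the job and enqueue it, returning immediately."""
    job_id = str(uuid.uuid4())
    JOBS[job_id] = {"kind": kind, "status": "QUEUED", "output": None}
    QUEUES[kind].put({"jobId": job_id, "payload": payload})
    return job_id


def process_next(kind: str, handler) -> str:
    """Worker side: take one job, run the handler, and record the result."""
    message = QUEUES[kind].get_nowait()
    job = JOBS[message["jobId"]]
    job["status"] = "PROCESSING"
    try:
        job["output"] = handler(message["payload"])
        job["status"] = "COMPLETE"
    except Exception as exc:
        job["status"] = "FAILED"
        job["error"] = str(exc)
    return message["jobId"]


job_id = submit_job("edit", {"clipId": "clip-123", "operation": "trim"})
print(JOBS[job_id]["status"])  # QUEUED
process_next("edit", lambda payload: f"edited/{payload['clipId']}/v2")
print(JOBS[job_id]["status"])  # COMPLETE
```

Because the portal only reads the job record, it can poll for status without knowing anything about how the worker performs the video operation.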

Architectural diagram of the video editing and download pipeline process

The Clip Editor supports trim, split, delete, and merge operations, which editors apply through the web portal’s visual timeline. When an editor saves changes, the operations are sent to the processing pipeline. The processor retrieves the source HLS segments from S3, applies the requested operations using AWS Elemental MediaConvert, and uploads the result as a new versioned asset.

For content distribution, editors can trigger MP4 downloads for multiple clips and reels simultaneously. The download pipeline uses AWS Elemental MediaConvert to convert the HLS content to MP4 format and generates a pre-signed S3 URL for retrieval. Once the download job is complete, the resulting URL can be shared and reused by other team members without retriggering the conversion. This avoids redundant processing and makes it easier to distribute finished content across the editorial team.
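
The reuse of completed downloads amounts to a lookup before conversion. The sketch below shows the idea with a dictionary as the store of completed downloads; `get_download_url`, the `convert` callback, and the URL shown are all hypothetical placeholders rather than the prototype's real code.

```python
def get_download_url(asset_id: str, completed: dict, convert) -> str:
    """Return the stored URL for an asset, converting only on first request.

    `completed` maps asset IDs to download URLs from finished jobs.
    `convert` is a callable that runs the HLS-to-MP4 conversion and
    returns a URL; it is only called when no stored URL exists.
    """
    if asset_id in completed:
        return completed[asset_id]
    url = convert(asset_id)
    completed[asset_id] = url
    return url


conversions = []


def fake_convert(asset_id: str) -> str:
    # Placeholder for the MediaConvert job plus URL signing.
    conversions.append(asset_id)
    return f"https://example.com/downloads/{asset_id}.mp4"


completed = {}
first = get_download_url("clip-123", completed, fake_convert)
second = get_download_url("clip-123", completed, fake_convert)
print(first == second, len(conversions))  # True 1
```

Pre-signed S3 URLs carry an expiry time, so a fuller implementation would also need to decide what happens when a stored URL has expired; that handling is omitted here.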

Conclusion

With this solution, editors no longer have to switch between multiple tools or coordinate handoffs between teams. Instead, they can work within a single portal, monitor incoming clips with clear status indicators, see AI-generated vertical crops side by side with the source feed, and move content through the editing workflow. All of this takes place while the broadcast is still live.

The prototype solution built for Fox Sports shows how AWS Elemental services, combined with a serverless, event-driven architecture, can reduce the time from live moment to social-ready content to seconds. For media companies looking to expand their digital coverage across more events and leagues, this approach unlocks content that was previously too costly or resource-intensive to produce.

“AWS Elemental Inference demonstrates the power of the Fox Sports and Amazon Web Services partnership to quickly turn innovation into real-world impact. What began as a hackathon concept to address the reality that nearly 90% of Fox Sports Digital content is consumed vertically has evolved into a production-ready, machine learning-driven solution built for large-scale live sports. This solution integrates seamlessly into existing live production and distribution workflows to create and deliver vertical content in real time to fans across multiple major sports leagues.”

— Ricardo Perez-Selsky, Senior Director of Digital Production Operations, Fox Sports

If you are interested in building a similar workflow, we will be publishing a reference implementation soon.
