Apple’s surprise purchase late last month of WaveOne, a California-based startup that develops content-aware AI algorithms for video compression, points to a significant shift in how video signals are streamed to devices. In the near future, the acquisition could bring smart video compression tools to Apple’s video creation products and to the development of its much-talked-about augmented reality headset.
But Apple isn’t alone. Startups in the AI video codec space could be acquisition targets for other companies trying to catch up.
For decades, video compression has used mathematical models to reduce the bandwidth required to transmit analog signals, focusing on the portion of a scene that changes from frame to frame. When digital video was introduced in the 1970s, improving video compression became a major research focus, leading to the development of many compression algorithms called codecs (short for “coder/decoder”) that compress and decompress digital media files. These algorithms paved the way for the current dominance of video in the digital age.
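The idea of focusing on what changes from frame to frame can be illustrated with a minimal sketch. It assumes grayscale frames stored as lists of pixel rows, and the function name `changed_blocks`, the block size, and the threshold are all hypothetical choices for illustration. Real codecs use far more elaborate motion estimation; this only flags blocks whose pixels differ between two frames.

```python
def changed_blocks(prev, curr, block=2, threshold=0):
    """Return (row, col) origins of blocks that differ between two frames.

    Hypothetical helper: frames are lists of rows of grayscale values.
    """
    height, width = len(curr), len(curr[0])
    changed = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            # Sum of absolute differences (SAD) over the block.
            sad = sum(
                abs(curr[y][x] - prev[y][x])
                for y in range(top, min(top + block, height))
                for x in range(left, min(left + block, width))
            )
            if sad > threshold:
                changed.append((top, left))
    return changed


prev_frame = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
curr_frame = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 90, 90],
    [10, 10, 90, 10],
]

# Only the bottom-right 2x2 block changed, so only it would need re-sending.
print(changed_blocks(prev_frame, curr_frame))
```

An encoder built on this idea would transmit data only for the flagged blocks and reuse the rest from the previous frame.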
AI compression of still images shows early success. Video is still challenging.
New codec standards emerge about every decade, but they are all based on pixel math, manipulating the values of individual pixels within a video frame to remove information that is not essential for human perception. Other math operations reduce the amount of data that needs to be sent or stored.
AI codecs, which have been in development for decades, use machine learning algorithms to analyze and understand the visual content of videos, identify redundant and unnecessary data, and compress videos more efficiently. Because they use learning-based techniques instead of hand-designed encoding tools, they can measure encoding quality in a variety of ways beyond traditional distortion measurements. Recent advances such as attention mechanisms help them understand the data better and optimize visual quality.
In the early 2010s, Netflix and a California-based company called Harmonic spearheaded the so-called “content-aware” encoding movement. CAE, as Harmonic calls it, uses AI to analyze and identify the most important parts of a video scene, allocating more bits to those parts to improve visual quality while lowering the bit rate for the less important parts of the scene.
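The allocation step can be sketched in a few lines. This is an illustration under stated assumptions, not Harmonic’s or Netflix’s actual method: the function `allocate_bits` is hypothetical, and the integer importance scores stand in for whatever an analysis model would produce for each region of a scene.

```python
def allocate_bits(importance, total_bits):
    """Split a bit budget across regions in proportion to importance.

    Hypothetical helper: `importance` holds non-negative integer scores,
    one per region, as an assumed output of some scene-analysis step.
    """
    total = sum(importance)
    if total <= 0:
        raise ValueError("importance scores must sum to a positive value")
    # Proportional share, rounded down so the budget is never exceeded.
    shares = [total_bits * score // total for score in importance]
    # Hand any leftover bits to the most important regions first.
    leftover = total_bits - sum(shares)
    ranked = sorted(range(len(importance)),
                    key=lambda i: importance[i], reverse=True)
    for i in ranked[:leftover]:
        shares[i] += 1
    return shares


# A face (score 6), moving text (3), and static background (1).
print(allocate_bits([6, 3, 1], 1000))
```

The total always equals the budget, so raising one region’s share necessarily lowers the bit rate available to the others.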
Content-aware video compression adjusts the encoder for different encoding resolutions, adjusts the bitrate according to the content, and adjusts the quality score (the perceived quality of the compressed video compared with the original, uncompressed video). All of these can also be done with a neural encoder.
However, despite a decade of effort, fully neural video compression using deep learning has not beaten the best that traditional codec standards can do under normal conditions. Reviews from third parties show that traditional video encoders still outperform neural network compression when benchmarked with traditional distortion metrics and human opinion scores, especially when the traditional encoders are enhanced with AI tools.
WaveOne has had success with neural network compression of still images. In one comparison, independent groups of users were five to ten times more likely to choose WaveOne’s image reconstructions over those of traditional codecs.
However, temporal correlation in video is much stronger than spatial correlation in images, and to beat the state of the art, the time domain must be encoded very efficiently, said Yiannis Andreopoulos, professor of data and signal processing at University College London and chief technology officer at iSIZE Technologies.
WaveOne may continue work on full neural video compression under Apple’s auspices. According to WaveOne’s public research, its neural compression technology is incompatible with existing codec standards, which fits Apple’s policy of building products that work seamlessly together but are proprietary and tightly controlled by Apple.
WaveOne founder Lubomir Bourdev declined to comment on the state of the technology, and Apple did not respond to requests for comment.
AI and legacy codecs work together for now, as legacy encoders can be debugged.
Nevertheless, the industry seems to be moving towards combining AI with traditional codecs rather than relying on full neural network compression.
For example, Vnova, according to its site, uses standardized pre-encoding downscaling and post-decoding upscaling to make its encoder more efficient and faster than its predecessor. However, the user needs software components on both the encoder and decoder sides.
London-based iSIZE enhances traditional video encoders with AI-based preprocessing that improves their quality and bitrate efficiency. iSIZE users do not need any component on the receiver side. The technology simply uses preprocessing to produce a bespoke representation that makes the encoder more efficient. A post-processing component can be added, but it is optional.
“By adding an AI component in front of the encoder, regardless of which encoder we are using, we are reducing the bitrate required to compress some elements of each video frame,” said iSIZE CEO Sergio Grce on a Zoom call. “Our AI component learns to attenuate details that a human viewer would not notice when watching a video played at the normal replay rate.”
The result is a faster encoding process and lower latency, says Grce. This is certainly a key advantage for VR, where latency can make users nauseous. The files output by the encoder are significantly smaller, without any changes on the end user’s device, Grce said.
In theory, everything in a video should be preserved. An ideal codec would encode all the content it receives rather than modify it. As such, traditional encoders have focused on what are called distortion metrics. Such measurements include signal-to-noise ratio (SNR), structural similarity index (SSIM), and peak signal-to-noise ratio (PSNR). All of these provide a quantitative measure of how well the compressed video matches the original video in terms of visual quality.
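As a concrete example of a distortion metric, PSNR is conventionally computed from the mean squared error (MSE) between the original and compressed frames as 10 · log10(MAX² / MSE), where MAX is the largest possible pixel value. The sketch below assumes 8-bit grayscale frames stored as lists of rows; the function names are illustrative and not taken from any particular library.

```python
import math


def mse(original, compressed):
    """Mean squared error between two equally sized grayscale frames."""
    diffs = [
        (o - c) ** 2
        for row_o, row_c in zip(original, compressed)
        for o, c in zip(row_o, row_c)
    ]
    return sum(diffs) / len(diffs)


def psnr(original, compressed, max_value=255):
    """Peak signal-to-noise ratio in decibels; higher means less distortion."""
    error = mse(original, compressed)
    if error == 0:
        return math.inf  # Identical frames: no distortion at all.
    return 10 * math.log10(max_value ** 2 / error)


original = [[100, 100], [100, 100]]
compressed = [[101, 99], [100, 100]]

# Two pixels are off by one, so the MSE is 0.5.
print(round(psnr(original, compressed), 2), "dB")
```

Note that the metric treats every pixel equally: an error on a face counts the same as an error in a blurry background.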
However, in recent years, more and more attention has been focused on perceptual quality metrics, which consider how compressed video is perceived by human viewers. These metrics are intended to measure the visual quality of compressed video based on how humans perceive it, rather than just mathematical measurements. After all, some distortions are mathematically insignificant but may still be perceptually noticeable. (For example, blurring a small part of a person’s face may not matter much mathematically when considering the entire image or video file, but even small changes to such distinctive features can still be noticed.) As a result, new video compression techniques are being developed that consider both distortion and perceptual quality metrics.
These days, things have moved further toward more perceptually oriented encoding, which alters subtle details of content based on how humans perceive it rather than just mathematical measurements. This is easier to do with neural encoders, which see the entire frame, whereas traditional encoders work at the macroblock or slice level and see only a small portion of the frame.
For the time being, Andreopoulos said, “AI and traditional technology work together,” because traditional encoders are interpretable and debuggable. Neural networks, by contrast, are opaque “black boxes.” Whether neural encoding will outperform conventional techniques in the very long term is still an open question, he added.
WaveOne’s technology could be used by Apple to improve video streaming efficiency, reduce bandwidth costs, and enable higher resolutions and frame rates on the Apple TV+ platform. The technology is hardware agnostic and can run on AI accelerators built into many phones and laptops. Meanwhile, the metaverse, if it becomes a reality, will require a large amount of data transfer and storage.
There are several companies working on optimizing standard video codecs using AI, including Bitmovin, Beamr, and NGCodec, now part of AMD.
