GPU Acceleration for Cloud Video Encoding

Bringing Industry-leading Turnaround Times to Short Form Content

Bitmovin has recently seen increased demand for processing short form video content with faster turnaround times. This post covers the most common use cases for short form video and how our new GPU acceleration for VOD encoding adds value to those workflows. Keep reading to learn more.

Growing demand for short form video use cases

Ads

With the rise of FAST channels and more subscription services adding ad-supported tiers, video advertisements are growing beyond traditional AVOD workflows. Companies that have optimized their encoding workflows for longer form episodic and cinematic content are now looking for ways to add encoding capacity and provide quicker turnaround times than they’ve needed in the past. Whether it comes from a new sponsorship deal, a contract with a new ad network, or a whole new tier being added to the service, ad content is often dropped on the video processing team in large batches with little notice and needs to be ready to serve almost immediately, often with revenue at stake.

News and Sports clips

Being the first to publish news clips or sports highlights offers a competitive advantage in the world of media and journalism, enhancing credibility and authority not only with the viewing audience, but also with advertisers and sponsors. It increases social media visibility and the potential for virality, which provides free promotion and more impressions, leading to increased monetization. Timely publication also has direct SEO benefits and increases audience engagement and viewer loyalty, all of which grow organic traffic that can be monetized through other channels and expand the potential audience for future deals.

User-generated content (Stories, Reels, etc.)

Video is no longer limited to specialist platforms and has become “table stakes” for any social app. With competition for eyeballs tougher than ever, a smooth, seamless experience is key to keeping users engaged and delighted. Any video that gets recorded and shared needs to give the poster the feeling that it was available almost immediately, which is not a simple task.

Highlights of recent performance improvements

Bitmovin’s split-and-stitch cloud architecture enables massive horizontal scale by allowing different parts of a video to be processed simultaneously and then reassembled for playback. This type of workflow works especially well with longer content, something we showcased when demonstrating the first video encoding to run 100x faster than real time. While it benefits longer form content, the approach adds some overhead that becomes more noticeable with shorter videos. As we heard more demand for quicker turnaround of ads and short clips, earlier this year we began optimizing our workflow with this type of short form content in mind.
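
Bitmovin’s implementation isn’t public, but the split-and-stitch concept itself can be illustrated with a minimal sketch: split the source into fixed-length chunks, encode the chunks in parallel, and concatenate the results. The sketch below uses ffmpeg as a stand-in encoder, and the chunk length, bitrate, and file names are arbitrary assumptions; treat it as a conceptual illustration, not Bitmovin’s pipeline.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Illustrative split-and-stitch sketch (not Bitmovin's implementation).
# Assumes ffmpeg is installed; chunk length, bitrate and names are arbitrary.

def split(source: str, chunk_seconds: int = 30) -> list[Path]:
    """Split the source into chunks at keyframe boundaries (no re-encode)."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", source, "-c", "copy", "-map", "0",
         "-f", "segment", "-segment_time", str(chunk_seconds),
         "-reset_timestamps", "1", "chunk_%03d.mp4"],
        check=True,
    )
    return sorted(Path(".").glob("chunk_*.mp4"))

def encode_chunk(chunk: Path) -> Path:
    """Encode one chunk; in a real system this runs on its own cloud instance."""
    out = chunk.with_name(f"enc_{chunk.name}")
    subprocess.run(
        ["ffmpeg", "-y", "-i", str(chunk), "-c:v", "libx264",
         "-b:v", "3M", "-c:a", "aac", str(out)],
        check=True,
    )
    return out

def stitch(chunks: list[Path], target: str = "output.mp4") -> None:
    """Concatenate the encoded chunks back into a single file."""
    Path("list.txt").write_text("".join(f"file '{c}'\n" for c in chunks))
    subprocess.run(
        ["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", "list.txt",
         "-c", "copy", target],
        check=True,
    )

if __name__ == "__main__":
    chunks = split("source.mp4")
    with ThreadPoolExecutor() as pool:  # parallel "workers" stand in for cloud instances
        encoded = list(pool.map(encode_chunk, chunks))
    stitch(encoded)
```

In the real architecture, each chunk would be dispatched to its own cloud instance rather than a local worker, and the fixed cost of splitting, scheduling, and stitching is exactly the overhead that weighs more heavily on short clips.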

In April at NAB 2023, we started sharing some of the early results of this effort, demoing 20-30 second turnaround times for a full 720p adaptive bitrate (ABR) ladder of a 30 second clip using the H.264 codec. This was a nice step forward, but with more GPU instances becoming available in the cloud, we saw potential for more significant leaps by focusing our attention on the speed this newly accessible hardware could provide.

GPU acceleration testing and evaluation

The first step in our GPU experimentation was selecting which GPU to use for the initial evaluation. After surveying what was available in the major public clouds and doing some estimations, we settled on the Amazon EC2 G4dn instances, which are powered by NVIDIA T4 GPUs. They can deliver up to 40x better low-latency throughput than CPUs and up to 2x the video transcoding capability of the previous G3 instances. Those performance gains, combined with NVIDIA’s reputation and the cost-effectiveness of the G4 instances, made them the ideal choice for our trial.
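
For anyone who wants to experiment with the same hardware class, a T4-backed G4dn instance can be launched directly from the AWS API. The boto3 sketch below is purely illustrative and unrelated to Bitmovin’s internal provisioning; the AMI ID, key pair name, and region are placeholders you would replace with values from your own account.

```python
import boto3

# Illustrative only: launch a single NVIDIA T4-backed G4dn instance for testing.
# The AMI ID, key pair, and region are placeholders, not real values.
ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-xxxxxxxxxxxxxxxxx",   # e.g. an AMI with NVIDIA drivers preinstalled
    InstanceType="g4dn.xlarge",        # smallest G4dn size, one T4 GPU
    MinCount=1,
    MaxCount=1,
    KeyName="my-key-pair",
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [{"Key": "purpose", "Value": "gpu-encoding-trial"}],
    }],
)

print(response["Instances"][0]["InstanceId"])
```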

After completing some successful manual tests, the next step was to integrate the GPU instances into the Bitmovin scheduling logic so we could use them like any of the other cloud CPU instances that encoding jobs normally run on. This would allow true apples-to-apples comparisons of total turnaround times for our workflows, including queueing, analysis, and encoding times. Integration with our Encoding API would also allow us to begin a closed Beta trial with customers who had expressed interest in our conversations at NAB.
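
Purely as a way to visualize what “using GPU instances like any other instance type” means, here is a minimal, hypothetical routing sketch; the class, pool names, and thresholds are all invented for illustration and are not Bitmovin’s scheduler.

```python
from dataclasses import dataclass

# Hypothetical sketch of routing an encoding job to a GPU or CPU instance pool.
# Names, fields, and thresholds are invented; this is not Bitmovin's scheduling logic.

@dataclass
class EncodingJob:
    codec: str             # e.g. "h264" or "h265"
    duration_s: float      # source duration in seconds
    fast_turnaround: bool  # customer requested the quickest possible turnaround

GPU_CODECS = {"h264", "h265"}  # codecs the GPU pool can hardware-encode

def pick_instance_pool(job: EncodingJob) -> str:
    """Treat GPU instances like any other pool and select one per job."""
    if job.fast_turnaround and job.codec in GPU_CODECS and job.duration_s <= 300:
        return "gpu-t4"        # NVIDIA T4-backed instances
    return "cpu-general"       # default CPU instances

print(pick_instance_pool(EncodingJob("h265", 120.0, True)))    # -> gpu-t4
print(pick_instance_pool(EncodingJob("h264", 5400.0, False)))  # -> cpu-general
```

Whichever pool a job lands on, the same queueing, analysis, and encoding steps apply, which is what makes the CPU vs GPU turnaround comparison apples-to-apples.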

The initial, unoptimized GPU workflow already provided some improvement in encoding time over CPU, and we were able to turn around a full 1080p ABR ladder in less time than our 720p NAB demo. Still, we saw opportunities to shave off even more time and had some teams dig deeper during an internal hackathon over the summer. Through their discoveries and ongoing work, we’ve been able to accelerate scheduling and instance retrieval and can now achieve turnaround times as low as 15 seconds for customer-provided videos up to 5 minutes long! Working with the NVIDIA GPU also allowed us to add fast turnaround support for the H.265 (HEVC) codec in addition to H.264, which would not have been possible with CPUs alone.
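
Bitmovin’s GPU pipeline isn’t something you drive by hand, but the kind of hardware encoding the T4 makes possible can be sketched with ffmpeg’s NVENC encoders. The sketch below assumes an ffmpeg build with NVENC support and NVIDIA drivers installed; the bitrate and file names are arbitrary.

```python
import subprocess

# Minimal sketch of NVENC hardware encoding on an NVIDIA GPU (e.g. a T4).
# Assumes an ffmpeg build with NVENC support; bitrate and file names are arbitrary.

def nvenc_encode(source: str, codec: str, out: str, bitrate: str = "5M") -> None:
    """Encode with the GPU's hardware encoder instead of a CPU codec."""
    subprocess.run(
        ["ffmpeg", "-y",
         "-hwaccel", "cuda",   # decode on the GPU where possible
         "-i", source,
         "-c:v", codec,        # h264_nvenc or hevc_nvenc
         "-b:v", bitrate,
         "-c:a", "copy",
         out],
        check=True,
    )

# Same source, both codecs supported by the hardware encoder:
nvenc_encode("clip.mp4", "h264_nvenc", "clip_h264.mp4")
nvenc_encode("clip.mp4", "hevc_nvenc", "clip_hevc.mp4")
```

The same NVENC hardware block on the T4 handles both H.264 and HEVC, which is why the GPU path could pick up fast HEVC turnaround alongside H.264.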

As the volume of online videos continues to grow exponentially, demand for solutions to efficiently search and gain insights from video continues to grow as well. T4 delivers extraordinary performance for AI video applications, with dedicated hardware transcoding engines that bring twice the decoding performance of prior-generation GPUs. T4 can decode up to 38 full-HD video streams, making it easy to integrate scalable deep learning into video pipelines to deliver innovative, smart video services.

Source: https://www.nvidia.com/en-us/data-center/tesla-t4/

Value add for key use cases

So what’s the end result of all of this work and optimization? We hope faster processing and turnaround times will lead to revenue growth for our customers by enabling higher ad fill rates and reducing the time between contracts being signed and ads being served to viewers. The flexibility of our cloud-native encoding solution is also key for absorbing spikes in demand and freeing customers from investing in resources that may sit idle for long periods of time.

Giving our media customers the ability to be first to the consumer with news and sports clips boosts their chance to go viral while also growing brand awareness. Being first delivers short-term benefits like more ad impressions and revenue, along with longer-term benefits like improved SEO and growing organic traffic that can justify higher CPMs and larger sponsorship packages in the future.

Social and consumer apps will be able to improve the UX of their customer workflows, making video sharing seamless and interruption-free with faster processing than ever before. Bitmovin’s customers will also get this improvement seamlessly, with GPU acceleration for VOD encoding being as easy to use as their existing CPU workflows.

What’s next for Bitmovin’s GPU acceleration?

Bitmovin has always been committed to continuous improvement, and that still applies to our explorations into cloud GPU encoding. One area where we’ve done some preliminary testing is providing more Bitmovin API gateways in strategic geographic locations. This may help shave a few more seconds off total turnaround time for some customers, especially for more complex encoding jobs with multiple API calls.

We pride ourselves on being the cloud-agnostic encoding solution, so pending customer feedback from our beta trial period, we will look at expanding GPU encoding capabilities to more regions and to other public clouds beyond AWS. Other NVIDIA GPU instance types are also available in the cloud, so those may be worth exploring and benchmarking against once our initial workflows move into production.

Bitmovin and the University of Klagenfurt are currently collaborating on a two-year R&D project called GAIA that aims to make video streaming more sustainable and help the industry reduce its carbon footprint. One of the areas currently being researched is the relative efficiency and carbon footprint of CPU vs GPU acceleration for video encoding. We hope to share data and results from those experiments soon, but in the meantime, you can check out our recently published progress report here.

Interested in cloud GPU encoding?

Bitmovin customers and free trial users can learn more about our short-form video processing in the Bitmovin dashboard. We also have a short-form content datasheet available for download here. Our beta trial period will be coming to an end soon, but if you’re interested in an early preview, get in touch with your Bitmovin representative or let us know in the comments below.

Let us know if you’re interested in learning more about GPU acceleration for video encoding with Bitmovin!