Best practices for optimizing video startup time (VST)

Hey Community, I am back with another topic.

Video startup time (VST) is the time from the user initiating playback (by click or autoplay) until the first frame is displayed to the user. It is a very important metric for any OTT streaming service.
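To make the definition concrete, here is a minimal sketch of how VST could be measured in an app. The `VstTimer` class and its method names are hypothetical (not part of any Bitmovin SDK); the idea is simply to take one timestamp when playback is requested and another when the first frame shows up.

```typescript
// Hypothetical helper, not a Bitmovin API: measures VST as the time between
// the playback request and the first rendered frame.
class VstTimer {
  private requestedAtMs: number | null = null;
  private firstFrameAtMs: number | null = null;

  // Call when the user clicks play or autoplay triggers playback.
  markPlaybackRequested(nowMs: number): void {
    this.requestedAtMs = nowMs;
    this.firstFrameAtMs = null;
  }

  // Call when the first frame is displayed; repeated calls are ignored.
  markFirstFrame(nowMs: number): void {
    if (this.requestedAtMs !== null && this.firstFrameAtMs === null) {
      this.firstFrameAtMs = nowMs;
    }
  }

  // VST in milliseconds, or null if it has not been measured yet.
  getVstMs(): number | null {
    if (this.requestedAtMs === null || this.firstFrameAtMs === null) {
      return null;
    }
    return this.firstFrameAtMs - this.requestedAtMs;
  }
}
```

In a real integration you would pass a monotonic clock value such as `performance.now()` and call `markFirstFrame` from whichever event your player emits once playback has actually started.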

Studies have shown that after 10 seconds of startup delay, more than half of your audience has usually left, and only 8% of users will return to your website within 24 hours after experiencing a video failure. Specifically, viewers start to leave once a video takes longer than 2 seconds to load. After 5 seconds, more than 20% of your users have abandoned, and each additional second of delay loses another 6%, most of whom will never come back. So optimizing VST is a very important exercise that should be undertaken at the time of player integration.

What factors contribute to VST

  • The major contributor to VST is network latency. Before it can render the first video/audio frame, a player needs to request multiple resources over the internet. Taking HLS and DASH as the de facto OTT streaming standards, below is a high-level list of the network requests that must complete before a device can render the first frames. The higher the network latency, the higher the VST, so using a CDN to deliver content from a location near the user is very important.

    • DASH MPD file or HLS Multivariant playlist
    • HLS variant playlists for audio and video
    • Audio and video init segments (in case of MP4/fMP4 segments)
    • At least one audio segment and one video segment
    • If the content is DRM protected, the following additional requests are required
      • DRM certificate request (required for some platforms like Chrome/Firefox/Safari browsers)
      • DRM license request
  • Because an OTT service has no control over a user’s internet connection and speed, delivering content with a good ABR ladder is very important. It lets the player start playback quickly by beginning with a lower rendition.

  • The amount (duration) of media the player buffers before starting playback also affects VST. Buffering ahead, or pre-loading, ensures there is enough media in the buffers when playback starts to avoid re-buffering afterwards.

  • Player setup time. The player stack also needs time to set up before it can start downloading and processing media. This is generally small compared to the other factors, but on some platforms, such as low-end TV devices, it can still be considerable.

  • In general, DASH/HLS playlists and media segments are delivered via a CDN to minimize latency, but DRM licenses and certificates are delivered by servers run by DRM service vendors, and these responses cannot be cached. So if your content is DRM protected, make sure your DRM service provider can handle DRM requests at scale and with minimal latency.
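To see how the factors above add up, here is a rough back-of-the-envelope model. It is only a sketch under simplifying assumptions: requests are treated as strictly sequential (real players fetch audio and video in parallel), connection setup and decoding time are ignored, and all field names are made up for illustration.

```typescript
// Hypothetical inputs for a rough VST estimate; none of these names come
// from a Bitmovin SDK.
interface StartupInputs {
  playerSetupMs: number;         // time to initialize the player stack
  roundTripMs: number;           // network latency per CDN request
  sequentialCdnRequests: number; // e.g. manifest, variant playlist, init segment, media segment
  drmRequestsMs: number[];       // duration of each DRM request (certificate, license); empty if unprotected
  videoBitrateBps: number;       // bitrate of the startup video rendition
  audioBitrateBps: number;       // bitrate of the audio rendition
  startupBufferSeconds: number;  // media duration buffered before playback starts
  bandwidthBps: number;          // available download bandwidth
}

// Sums setup time, request latency, DRM time and media download time.
function estimateVstMs(i: StartupInputs): number {
  const latencyMs = i.roundTripMs * i.sequentialCdnRequests;
  const drmMs = i.drmRequestsMs.reduce((sum, ms) => sum + ms, 0);
  const mediaBits =
    (i.videoBitrateBps + i.audioBitrateBps) * i.startupBufferSeconds;
  const downloadMs = (mediaBits / i.bandwidthBps) * 1000;
  return Math.round(i.playerSetupMs + latencyMs + drmMs + downloadMs);
}

// Illustrative (made-up) numbers: 4 s of a 2.4 Mbps rendition on a 10 Mbps link.
const exampleVstMs = estimateVstMs({
  playerSetupMs: 50,
  roundTripMs: 80,
  sequentialCdnRequests: 4,
  drmRequestsMs: [100, 200],
  videoBitrateBps: 2_400_000,
  audioBitrateBps: 128_000,
  startupBufferSeconds: 4,
  bandwidthBps: 10_000_000,
});
```

Even a crude model like this shows which knob matters most for your setup: the startup rendition's bitrate, the buffered duration, the number of round trips, or the DRM requests.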

Best practices for achieving good VST with Bitmovin player SDKs

  • Practice 1 : Configure the player to avoid starting playback with the highest-bitrate renditions. The higher the startup rendition’s bitrate, the longer it takes to download the video segments, which adds to the VST. Of all the resources downloaded before playback starts, the video segment is usually the largest. So selecting a reasonable rendition (for example, the middle rendition of your ABR ladder) helps reduce VST. Please also consider your target device’s screen size: a lower-bitrate rendition (say 360p) may be fine on a mobile device but will result in a poor user experience on TV devices. Bitmovin player SDKs provide the following APIs to configure this.

    • Web SDK (Desktop/Mobile browsers, Samsung Tizen, LG webOS, Xbox, PS5) : One of the following two configurations can be used to either set the startup bitrate or limit the maximum startup bitrate.

    • Android/AndroidTV/FireTV SDK : AdaptationConfig.InitialBandwidthEstimateOverride. Internally, the player uses this as the initial bandwidth estimate when deciding which video rendition to pick at startup. Use a calculation like the one below to derive initialBandwidthEstimate for starting with a video rendition of bitrate bitrate_video.

      initialBandwidthEstimate = (bitrate_video + bitrate_audio) * 1.6

    • iOS/tvOS SDK : On iOS/tvOS the startup rendition is the default HLS variant, which is the first variant listed in the HLS Multivariant playlist. So choosing the first listed variant wisely during HLS packaging can help improve VST on iOS and tvOS devices.
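As a sketch of Practice 1, the snippet below picks the middle rendition of an ABR ladder and derives per-platform values from it. The Android helper applies the formula from this post. For the Web SDK, `startupBitrate` and `maxStartupBitrate` are my assumption of the two adaptation settings meant above, and the `"<n>bps"` string format is also an assumption, so please check both against the Web SDK API reference before using them. All function and type names are hypothetical.

```typescript
// One entry of an ABR ladder (video renditions only).
interface Rendition {
  height: number;
  bitrateBps: number;
}

// Picks the middle rendition by bitrate (the lower middle for even-sized ladders).
function pickStartupRendition(ladder: Rendition[]): Rendition {
  if (ladder.length === 0) {
    throw new Error("ABR ladder is empty");
  }
  const sorted = ladder.slice().sort((a, b) => a.bitrateBps - b.bitrateBps);
  return sorted[Math.floor((sorted.length - 1) / 2)];
}

// Formula from the post: (bitrate_video + bitrate_audio) * 1.6
function androidInitialBandwidthEstimate(
  videoBitrateBps: number,
  audioBitrateBps: number
): number {
  return Math.round((videoBitrateBps + audioBitrateBps) * 1.6);
}

// ASSUMPTION: property names and value format of the Web SDK adaptation
// settings; verify against the official API reference.
type WebStartupSetting =
  | { startupBitrate: string }
  | { maxStartupBitrate: string };

// "exact" sets the startup bitrate, "cap" only limits the maximum startup bitrate.
function webStartupSetting(
  startup: Rendition,
  mode: "exact" | "cap"
): WebStartupSetting {
  const value = `${startup.bitrateBps}bps`;
  return mode === "exact"
    ? { startupBitrate: value }
    : { maxStartupBitrate: value };
}
```

For TV apps you may want to bias the pick towards a higher rendition than the plain middle one, for the screen-size reasons mentioned in Practice 1.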

  • Practice 2 : Configure the player to start playback without buffering a large duration of media. The more data is buffered when playback starts, the lower the chance of re-buffering but the higher the VST, so this is a trade-off. Normally one segment’s worth of buffered data at startup should be enough, but this again depends on the segment length of your content. I would recommend keeping these values at their defaults unless there is a specific reason to set custom values.

  • Practice 3 : Player stack setup time is negligible on most platforms/devices, running only into tens of milliseconds, but some devices, such as low-end TVs, can take substantially longer, from a few hundred milliseconds up to 1-2 seconds, just to initialize the player stack. So for platforms like Samsung Tizen, LG webOS, Xbox, PS5, Hisense, Panasonic, Vizio etc. (please see the list of all supported and compatible devices here), you should use the modular approach and integrate only the required Bitmovin Web player SDK modules instead of the whole Bitmovin player SDK. The modular player approach can improve VST by 10-20%. Please refer to the Bitmovin sample apps for Tizen and WebOS and the generic Bitmovin web sample app to learn more about it.

  • Practice 4 : On devices that support multiple DRM systems, VST can vary depending on the DRM system used for playback. So it is recommended to test the DRM systems that Bitmovin player supports on the given device/platform to check which one gives the better VST for your streams. For example, we have learnt from our customers that using PlayReady instead of Widevine on Samsung Tizen and LG webOS gives better VST. Please see Bitmovin DRM support for the list of DRM systems supported by Bitmovin player SDKs.
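For the DRM comparison in Practice 4, a simple way to decide is to collect several VST measurements per DRM system on the target device and compare medians, since single runs can be noisy. The helpers below are a hypothetical sketch; the names are made up and you supply your own measurements.

```typescript
// Median of a list of numbers; NaN for an empty list.
function median(values: number[]): number {
  if (values.length === 0) {
    return NaN;
  }
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Returns the DRM system with the lowest median VST, or null if there are
// no samples. Keys are whatever labels you use, e.g. "widevine", "playready".
function fastestDrmSystem(vstSamplesMs: Record<string, number[]>): string | null {
  let best: string | null = null;
  let bestMedian = Infinity;
  for (const system of Object.keys(vstSamplesMs)) {
    const samples = vstSamplesMs[system];
    if (samples.length === 0) {
      continue; // nothing measured for this system
    }
    const m = median(samples);
    if (m < bestMedian) {
      bestMedian = m;
      best = system;
    }
  }
  return best;
}
```

Run the measurements with the same stream, rendition and network conditions for each DRM system, otherwise the comparison mostly reflects those differences rather than the DRM system itself.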

I hope the best practices above help you optimize VST when integrating Bitmovin player SDKs and deliver the best experience to your users. Please share your feedback, experiences and questions about VST in the comments.