This article is the second of three articles diving into the world of video streaming and the challenges of providing a user experience consistent with the very nature of live content. You can read the first article right here.
In our previous article we saw how we are transitioning from legacy RTMP streaming to HTTP-based streaming. HTTP-based delivery brings obvious pros in terms of quality of experience, but it also introduces larger overhead that needs to be mitigated if we want to keep latency low. So let’s dig deeper into the way streaming video flows…
The workflow obviously starts with the recording: you want a device that minimizes processing delay. A GoPro usually has a 3-frame delay, which translates into roughly 3 × 33 ms ≈ 100 ms at 30 fps, whereas broadcast-grade cameras can take less than a video field (half a frame). This delay can increase further depending on the codec, format, and resolution produced by your camera.
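The frame-delay arithmetic above generalizes easily. Here is a minimal sketch (the 3-frame GoPro figure comes from the text; other cameras will differ):

```python
def frame_delay_ms(frames: int, fps: float) -> float:
    """Delay in milliseconds introduced by buffering `frames` frames at `fps`."""
    return frames * 1000.0 / fps

# A 3-frame pipeline at 30 fps:
print(frame_delay_ms(3, 30))  # → 100.0 (ms)
```

The same function shows why higher frame rates help: at 60 fps the same 3-frame pipeline costs only 50 ms.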
Of course, this is not where the bulk of the latency happens. When specifically producing adaptive bitrate streaming (allowing the users to play different qualities of the stream depending on their network conditions), there are multiple factors that go into determining latency, the most important of which are:
- Video encoder buffer duration
- Segment/fragment duration
- CDN delivery latency
- Player buffer duration
Video encoding introduces latency that varies with input resolution, encoder parameters, the first frame accessed, and so on. All this adds up, and transcoders are not equal in the latency they add. The choice of codec(s) also has strong implications: the most bandwidth-efficient codecs often spend more complexity in the encoding process, and therefore add more latency.
The packaged video is then hosted on an origin server, which serves the data to end users, often through a CDN, for multiple purposes: wide geographical distribution, protecting video from theft, access control, and delivery performance. The bulk of the latency actually gets introduced in the origin-to-player part of the path.
Adaptive bitrate streaming formats mitigate the client-side rebuffering issues we faced in the older days of streaming by using chunks (i.e. fragments) of video that are downloaded independently, ensuring that your stream can play back seamlessly.
But for many of the same reasons these formats are great, they also fall short on latency: they require the user’s player to build up a buffer of chunks before playback can start, and default buffer sizes demand a certain number of chunks before playback is deemed safe to begin.
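To make the chunked model concrete, here is a minimal, hypothetical HLS media playlist (segment names and the 6-second duration are invented for illustration). A player typically must download several of these segments before starting playback, which is exactly where the buffering latency comes from:

```
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:120
#EXTINF:6.000,
segment_120.ts
#EXTINF:6.000,
segment_121.ts
#EXTINF:6.000,
segment_122.ts
```

For a live stream, the server keeps appending new segments to this playlist and the player keeps re-fetching it, so any buffer the player builds is measured in whole segments, not milliseconds.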
A streaming server will likely buffer 2 fragments on its side (i.e. between 12 and 20 seconds, depending on the fragment’s default size). The CDN delivery path will likely add at least a few seconds just to propagate fragments through its network for the first time. Finally, the player will buffer however much data it deems necessary to provide smooth playback resistant to network jitter (let’s assume 10 seconds, as for HLS). Add that all up and the typical glass-to-glass latency is around 40 seconds, though with some tuning, as we will see, it can be reduced to 10–20 seconds.
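The back-of-the-envelope sum above can be sketched as follows (the values are the illustrative figures from this article, not measurements):

```python
fragment_s = 10       # fragment duration (6-10 s typical; take the upper end)
server_fragments = 2  # fragments buffered by the streaming server
cdn_s = 3             # first-time propagation through the CDN ("a few seconds")
player_s = 10         # player buffer, as assumed for HLS above

total = server_fragments * fragment_s + cdn_s + player_s
print(total)  # → 33
```

Encoder latency and first-mile contribution delays come on top of this, which is how real deployments drift toward the ~40-second typical figure.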
In the real world it is not uncommon to see live events experience a latency of over 1 minute, though sometimes that’s by choice (e.g. a customer choosing a larger buffer and more stability over low latency).
The streaming server above hosts every chunk of the video file in 3 different bitrates. The client’s player below needs to download and buffer a number of chunks before it starts displaying the video.
For instance, our MMD Live product today supports chunk sizes down to 2 seconds. Given our 3-segment manifest, that means 6 seconds of latency, plus 1 second for traffic to reach the platform and 1 second of CDN exit delay, for a total of 8 seconds. Obviously, last-mile quality matters for this to work well. That’s the best low-latency adaptive bitrate performance available today.
So, while these formats are widely used for video streaming today, they show their limits in the use cases we previously discussed, where low latency is key. That’s why our third article will explore new methods and protocols (think WebRTC, low-latency DASH…) to make the OTT streaming experience similar to broadcast. So stay tuned!