Live streaming at world-record scale with Ashutosh Agrawal
In May 2023, a live streaming world record was set, with 32 million concurrent viewers watching the finale of the Indian Premier League cricket tournament. A chat with the architect behind this system.
Stream the Latest Episode
Available now on YouTube, Apple and Spotify. See the episode transcript at the top of this page, and a summary at the bottom.

Brought to You By
• WorkOS: The modern identity platform for B2B SaaS
• CodeRabbit: Cut code review time and bugs in half
• Augment Code: AI coding assistant that pro engineering teams love

In This Episode
How do you architect a live streaming system to deal with more load than any similar system has dealt with before? Today, we hear from an architect of such a system: Ashutosh Agrawal, formerly Chief Architect of JioCinema (and currently Staff Software Engineer at Google DeepMind). In May 2023, JioCinema set the live-streaming world record, serving 32 million concurrent viewers tuning in to the finale of the Indian Premier League (a cricket tournament).

We take a deep dive into video streaming architecture, tackling the complexities of live streaming at scale (tens of millions of parallel streams) and the challenges engineers face in delivering seamless experiences. We talk about the following topics:
• How large-scale live streaming architectures are designed
• Tradeoffs in optimizing performance
• Early warning signs of streaming failures and how to detect them
• Why capacity planning for streaming is SO difficult
• The technical hurdles of streaming in APAC regions
• Why Ashutosh hates APMs (Application Performance Management systems)
• Ashutosh’s advice for those looking to improve their systems design expertise
• And much more!

Takeaways
My biggest takeaways from this episode:
1. The architecture behind live streaming systems is surprisingly logical. In the episode, Ashutosh explains how the live streaming system works, starting from the physical cameras on-site, through the production control room (PCR), to streams being sliced-and-diced, and the HLS protocol (HTTP Live Streaming) used.
2. There are a LOT of tradeoffs you can play with when live streaming!
Balancing server load, latency, and server resources against client caching means making hard decisions. Want to reduce the server load? Serve longer chunks to clients, resulting in fewer requests per minute, per client… at the expense of clients potentially lagging further behind the live action. This is just one of many possible decisions to make.
3. At massive video streaming scale, capacity planning can start a year ahead! It was surprising to hear how Ashutosh had to convince telecoms and data centers to invest more in their server infrastructure, so they could handle the load when peak viewership arrived months later. This kind of challenge will be nonexistent for most of us engineers. Still, it’s interesting to consider that when you are serving a scale that’s not been done before, you need to worry about the underlying infra!
4. “Game day” is such a neat load testing concept. The team at Jio would simulate “game day” load months before the event. They told teams when the load test would start, but did not share anything else! Preparing for a “game day” test is a lot of work, but it can pay off by finding the parts of the system that shudder under extreme load.
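The chunk-length tradeoff from takeaway 2 can be put into rough numbers. The sketch below is a back-of-the-envelope model, not how JioCinema sized its system: it assumes every viewer fetches exactly one segment per segment duration (ignoring playlist refreshes and retries) and that a player stays a fixed number of segments behind the live edge. The function names and the default of three buffered segments are hypothetical, chosen only for illustration; the 32 million viewer count is the record figure mentioned above.

```python
def segment_requests_per_minute(segment_seconds: float, viewers: int) -> float:
    """Approximate segment requests per minute across all viewers.

    Assumption: each viewer requests one segment per segment duration;
    playlist requests, retries and CDN caching are ignored.
    """
    return viewers * 60.0 / segment_seconds


def approx_live_delay_seconds(segment_seconds: float, buffered_segments: int = 3) -> float:
    """Approximate how far behind live a player sits.

    Assumption: the player buffers `buffered_segments` whole segments
    before playing (3 is an illustrative default, not a sourced value).
    """
    return segment_seconds * buffered_segments


if __name__ == "__main__":
    viewers = 32_000_000  # peak concurrency cited in the episode
    for seconds in (2, 4, 6):
        load = segment_requests_per_minute(seconds, viewers)
        delay = approx_live_delay_seconds(seconds)
        print(f"{seconds}s segments: {load:,.0f} requests/min, ~{delay:.0f}s behind live")
```

Under these assumptions, tripling the segment length from 2 to 6 seconds cuts request volume to a third, while the viewer falls roughly three times further behind the live action.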
The Pragmatic Engineer deep dives relevant for this episode
• Software architect archetypes
• Engineering leadership skill set overlaps
• Software architecture with Grady Booch

Timestamps
(00:00) Intro
(01:28) The world record-breaking live stream and how support works with live events
(05:57) An overview of streaming architecture
(21:48) The differences between internet streaming and traditional television
(22:26) How adaptive bitrate streaming works
(25:30) How throttling works on the mobile tower side
(27:46) Leading indicators of streaming problems and the data visualization needed
(31:03) How metrics are set
(33:38) Best practices for capacity planning
(35:50) Which resources are planned for in capacity planning
(37:10) How streaming services plan for future live events with vendors
(41:01) APAC specific challenges
(44:48) Horizontal scaling vs. vertical scaling
(46:10) Why auto-scaling doesn’t work
(47:30) Concurrency: the golden metric to scale against
(48:17) User journeys that cause problems
(49:59) Recommendations for learning more about video streaming
(51:11) How Ashutosh learned on the job
(55:21) Advice for engineers who would like to get better at systems
(1:00:10) Rapid fire round

A summary of the conversation
The Live Streaming Pipeline
Content Delivery
Monitoring, Metrics, and Scaling
APAC-specific live streaming challenges
Advice for engineers to become better architects
Resources & Mentions
Where to find Ashutosh Agrawal:
• X: https://x.com/theprogrammerin
• LinkedIn: https://www.linkedin.com/in/theprogrammerin/
• Medium: https://medium.com/@theprogrammerin

Mentions during the episode:
• Disney+ Hotstar: https://www.hotstar.com/in
• What is a CDN: https://aws.amazon.com/what-is/cdn/
• Adaptive bitrate streaming: https://en.wikipedia.org/wiki/Adaptive_bitrate_streaming
• Skype: https://www.skype.com/en/
• Millions Scale Simulations: https://blog.hotstar.com/millons-scale-simulations-1602befe1ce5
• Black Friday: https://en.wikipedia.org/wiki/Black_Friday_(shopping)
• Asia-Pacific (APAC): https://en.wikipedia.org/wiki/Asia%E2%80%93Pacific
• Distributed architecture concepts I learned while building a large payments system: https://blog.pragmaticengineer.com/distributed-architecture-concepts-i-have-learned-while-building-payments-systems/
• Concurrency: https://web.mit.edu/6.005/www/fa14/classes/17-concurrency/
• Video streaming resources on Github: https://github.com/leandromoreira/digital_video_introduction
• Murphy’s Law: https://en.wikipedia.org/wiki/Murphy%27s_Law_(disambiguation)
• Java: https://www.java.com/
• Ruby: https://www.ruby-lang.org/en/
• Ruby on Rails: https://rubyonrails.org/
• Hacker News: https://news.ycombinator.com/

Production and marketing by Pen Name. For inquiries about sponsoring the podcast, email podcast@pragmaticengineer.com.