Telecoms.com periodically invites expert third parties to share their views on the industry’s most pressing issues. In this piece telecoms engineer Natalia Molinero Mingorance looks at the environmental implications of the video streaming revolution.
Video streaming is widely used in many areas such as entertainment, social media, and online education. The consumption of video has increased dramatically during the last decade, lately accelerated by the deployment of 5G, which allows us to stay connected at any time and in any location. In fact, Internet Service Providers have reported that about a third of their network traffic is Netflix. The platform has more than 200 million subscribers worldwide in over 200 countries.
In this article, we are going to analyze what happens when we watch movies and TV shows on the popular platform. The goal is to understand how it works so we can be aware of the carbon footprint of our online behavior. How? By analyzing the computational complexity of processing video. Available literature (including publications made by Netflix itself) shows that the video encoding and encryption techniques are the most energy demanding tasks.
Netflix behind the scenes
What happens when we watch an episode of our favorite show on Netflix?
First, that video file has to be properly stored somewhere. Content providers (e.g., video production companies) upload a copy of the movie to a server; this is the Origin Storage or Central Place in the Netflix Content Delivery Network (CDN). The original video in a high-definition format (with no compression) can have a size in terabytes. To get an idea of the storage requirements, imagine that:
- The average number of videos played by a user per day is 5
- The average size of a movie is 1 GB
- The average number of films uploaded per day by content creators to this backend is 1,000
Therefore,
- The total upload storage required per day is 1,000 * 1 GB = 1 TB (approximately)
- The total storage required in the Central Place of the CDN for 5 years is 1 TB * 5 * 365 = 1.8 PB
These are quite conservative numbers, as High-Definition movies without compression can be much larger than 1 GB.
Netflix uses Amazon Web Services (AWS) for almost everything except streaming. That includes online storage, the recommendation engine, video transcoding, databases, and analytics.
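The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, using the article's illustrative figures and decimal units (1 TB = 1,000 GB); the function name is made up for the example.

```python
GB_PER_TB = 1_000  # decimal units, as in the estimate above
TB_PER_PB = 1_000


def central_storage_tb(films_per_day, avg_film_gb, years):
    """Storage accumulated in the Central Place, in TB (illustrative only)."""
    daily_tb = films_per_day * avg_film_gb / GB_PER_TB
    return daily_tb * 365 * years


total_tb = central_storage_tb(films_per_day=1_000, avg_film_gb=1, years=5)
print(f"{total_tb:.0f} TB = {total_tb / TB_PER_PB:.3f} PB")  # 1825 TB = 1.825 PB
```

Changing `avg_film_gb` to a more realistic size for uncompressed material shows how quickly the total grows.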
Processing the video
Netflix supports more than 2,000 devices and each requires different resolutions and formats. After adding some metadata to control content management operations (name of the movie, category, licensing…), the video is encoded at different qualities, so it can be played over different bandwidths, depending on the user device (PC, TV, smartphone) or the network requirements. To achieve this, Netflix breaks the original file into smaller chunks and, using parallel workers in AWS, converts these chunks into different formats (like mp4, 3gp, etc.) across different resolutions (like 4k, 1080p, and more).
A configuration file (“manifest file”) is produced to tell the encoder the parameters that should be applied for encoding.
These different technical features on the movies are achieved using audio and video codecs.
As an example, what Netflix typically stores in one of those nodes is the set of film representations corresponding to the combination of:
- 10 resolutions to allow for bitrate adaptation to the terminal
- 4 codecs, depending on the terminal the video is played on (VC1, H.264/AVC Baseline, H.264/AVC Main and HEVC)
- 3 ABR technologies (HLS, Smooth Streaming and DASH), in order, also, to address different terminals.
This leads to 120 representations of the same movie with the goal of streaming it to more than 900 different devices.
After these operations are run on the AWS servers, the content is DRM-protected, i.e., encrypted, so that only users who have paid for it can watch it.
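The figure of 120 is simply the product of the three lists above (10 × 4 × 3). A minimal sketch of that enumeration follows; the resolution labels are placeholders, since the article does not list the ten resolutions.

```python
from itertools import product

# Placeholder labels: the ten actual resolutions are not given in the text.
resolutions = [f"resolution_{i}" for i in range(1, 11)]
codecs = ["VC1", "H.264/AVC Baseline", "H.264/AVC Main", "HEVC"]
abr_technologies = ["HLS", "Smooth Streaming", "DASH"]

# One representation per (resolution, codec, ABR technology) combination.
representations = list(product(resolutions, codecs, abr_technologies))
print(len(representations))  # 120
```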
Figure 1. CDN architecture, by Rafael Mompó
Copying the video in different locations
Next, all versions of the content (one for each type of device and for each image quality) are uploaded to a hard disk (regional origin server or Control Place at the CDN) to be distributed. Through Secure FTP, the files are received at the CDN Entry Point or Central Place. Therefore, movies are usually ingested in the Control Place. The Central Place stores a “master” copy of them and keeps information about those that are stored in all regional nodes and how busy the nodes are (occasionally, there is traffic overflow from one region to another). Open Connect is the Netflix CDN, a network of distributed servers spread across different geographical locations, responsible for streaming.
As we’ve seen, all movies are stored in the Central Place of the CDN, but a copy of the movie is stored in a Node closer to the users when they request it. These nodes store the films in the different formats according to the device or type of service (PC web browser, Smart TV, Android app…). The more users requesting a video, the more nodes there should be in a given region. After a certain time, if users don’t request it, it’s deleted. These nodes are in charge of:
- Download management: checks if the video is stored locally. Otherwise, it downloads it from the Control Place, from the Central Place or from a closer neighboring node.
- Neighborhood management: it maintains a list of all contents that are available on the nodes that are closer
- Streaming server: controls that the assigned bandwidth to a certain user is within the configured limits
- Storage management: it deletes movies when the disk is nearly full, but before deletion, it checks that no user is playing that content
- Geo-blocker: checks if the user IP corresponds to a country that is allowed to watch the video, since content licensing is given by country
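To make three of these responsibilities concrete (download management, storage management and geo-blocking), here is a toy model of a node. It is a sketch under stated assumptions, not Netflix's implementation: the class, its method names, the least-recently-used eviction order and the title-count capacity are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


class EdgeNode:
    """Toy CDN node; every name and policy here is hypothetical."""

    def __init__(self, capacity, allowed_countries, fetch_upstream):
        self.capacity = capacity                    # max number of titles on disk
        self.allowed_countries = allowed_countries  # title -> set of country codes
        self.fetch_upstream = fetch_upstream        # callable(title) -> bytes
        self.store = OrderedDict()                  # title -> data, least recently used first
        self.playing = {}                           # title -> number of active viewers

    def request(self, title, country):
        # Geo-blocker: refuse countries without a licence for this title.
        if country not in self.allowed_countries.get(title, set()):
            raise PermissionError(f"{title} is not licensed in {country}")
        # Download management: serve locally if present, else fetch upstream.
        if title in self.store:
            self.store.move_to_end(title)
        else:
            self._make_room()
            self.store[title] = self.fetch_upstream(title)
        self.playing[title] = self.playing.get(title, 0) + 1
        return self.store[title]

    def stop(self, title):
        self.playing[title] -= 1

    def _make_room(self):
        # Storage management: evict old titles, but never one being watched.
        while len(self.store) >= self.capacity:
            victim = next((t for t in self.store if self.playing.get(t, 0) == 0), None)
            if victim is None:
                break  # everything is in use; tolerate a temporary overflow
            del self.store[victim]


fetched = []


def fake_upstream(title):
    """Stands in for the Control Place, Central Place or a neighboring node."""
    fetched.append(title)
    return f"<{title}>".encode()


licences = {"A": {"ES"}, "B": {"ES"}, "C": {"ES", "UK"}}
node = EdgeNode(capacity=2, allowed_countries=licences, fetch_upstream=fake_upstream)
node.request("A", "ES")
node.request("B", "ES")
node.stop("B")           # nobody is watching B any more
node.request("C", "ES")  # disk is full: B is evicted, A is kept because it is playing
print(list(node.store))  # ['A', 'C']
```

Neighborhood management and bandwidth control are left out to keep the sketch short.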
To have movies/shows in nearer nodes, proxies can be placed between nodes and the regional origin server.
A quick summary up to this point: when you hit the play button, Netflix analyzes the network speed and connection stability. Depending on the device and screen size, the right video format is streamed to the user’s device. The movie will be served to the user from the nearest Open Connect server, which leads to a faster and better experience. This also increases the scalability of the whole system.
Storage Capacity
Netflix uses Cassandra for its scalability and lack of single points of failure and for cross-regional deployments. A single global Cassandra cluster can simultaneously service applications and asynchronously replicate data across multiple geographic locations.
The Cassandra deployment at Netflix is built with the following elements:
- More than 50 Cassandra clusters
- More than 500 nodes
- 30TB daily backups
- 250k Writes/s at each node
The cost of watching online video: Netflix side
As you can imagine, Netflix needs data centers spanning several kilometers to house those servers. Powering those servers 24/7 for video processing and storage is expensive and, even if some of them use green electricity, the rest will have tons of associated greenhouse gas (GHG) emissions. In fact, Netflix has reported being responsible for more than 6 billion grams of CO2 last year.
The following image shows the functional scheme of the elements that are in the Netflix side:
Figure 2. Netflix CDN architecture
The video headend is where the original file is uploaded and prepared before sending it to the servers in the CDN. The encoding and decoding tasks are the most energy demanding. To save some energy, the transcoding is done as close as possible to the user, so it’s performed on an on-demand basis instead of having multiple versions of the same file already processed and stored. The aim is to save energy both in computing time and in powering the storage servers. Let’s take a look at the most expensive operations:
As we can see in the picture above, the video frames are delivered to us thanks to the streaming protocols. These protocols can sit in the three upper layers of the OSI model: Application, Presentation and Session. Some examples are Adobe Real-Time Messaging Protocol (RTMP), MPEG-DASH, and Apple HTTP Live Streaming (HLS), alongside the widely used QUIC, which is based on UDP and operates at the transport layer. Each of these protocols is suitable for certain types of video containers, e.g., MP4, which is broadly used because it is compatible with a wide range of devices. The protocols are also defined for specific video codecs, with H.264, H.265, and VP9 being the most common.
Compression is essential before storing or transmitting the video. We have checked the encoding used by Chrome while watching a show on Netflix:
Figure 3: Chrome video players for Netflix
Figure 4: Video transcoding techniques on Chrome while playing Netflix
Therefore, it is VP9, a license-free compression standard mainly defined by Google.
These streams go over a TLS/SSL connection to protect the communications with encryption, typically using different AES-based methods. AES is a block cipher that was first published in 1998 (as Rijndael) and standardized by NIST in 2001. The blocks have a fixed size of 128 bits and the key can be 128, 192 or 256 bits. This is an extensively accepted cipher because its specification is public and the hardware requirements in terms of storage and processing time are relatively low. In addition, it is easy to implement. As mentioned, QUIC is a widely used transport protocol and it includes encryption. Another streaming protocol, RTMPS, relies on a TLS/SSL connection, which also protects the stream via encryption. We have seen these protocols while capturing the Netflix traffic with Wireshark:
Figure 5. Video protocols for Netflix
These protocols are used by the platforms under study, and all are based on AES. In addition, we have analyzed the internet traffic while watching the show to double-check the encryption scheme they use: AES-128 GCM, as shown in Figure 6.
Figure 6. Encryption scheme used by Netflix
Encrypting video data provides a higher level of security, but it also adds computational complexity and takes more time.
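To give a rough sense of scale, the sketch below counts how many 128-bit AES blocks a 1 GB movie (the illustrative size used earlier) is split into, and how many cipher rounds that implies for a 128-bit key. It is back-of-the-envelope arithmetic only: it ignores GCM authentication, TLS framing and hardware acceleration, and the function name is made up for the example.

```python
import math

AES_BLOCK_BYTES = 16                       # fixed 128-bit block size
AES_ROUNDS = {128: 10, 192: 12, 256: 14}   # rounds per block for each key size (bits)


def aes_blocks(payload_bytes):
    """Number of 128-bit blocks needed to cover the payload."""
    return math.ceil(payload_bytes / AES_BLOCK_BYTES)


movie_bytes = 1_000_000_000                # 1 GB, the illustrative size used earlier
blocks = aes_blocks(movie_bytes)
rounds_128 = blocks * AES_ROUNDS[128]
print(f"{blocks:,} blocks, {rounds_128:,} rounds with a 128-bit key")
```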
While compressing and encrypting a text file is relatively simple, video compression and encryption are more complex, as computations need to be performed on lots of megabytes of data that are continuously stored, retrieved and transferred from data centers and end-user devices. This computational cost results in energy consumption. For this reason, how to get a good balance between compression, security and time-energy efficiency is an important objective to achieve a sustainable way of enjoying online video.
The cost of watching online video: energy consumption at home
We have measured the power consumed by a laptop and a TV while playing Netflix. The laptop is an Acer Aspire F 15 and the TV is a 32-inch Samsung LED Smart TV.
All the power values reported here are the average power measured during 5 minutes of video. The TV consumption when off was 13.7 W, while the laptop didn’t consume energy when turned off. The price of electricity in Spain on the day of the tests was 0.20341 €/kWh. Tables 1 and 2 show the measured power values and the cost of playing those 5 minutes of our favorite show on the laptop and on the TV.
State | Laptop (W) | TV (W) |
Idle | 7.2 | 25.5 |
Watching video | 15.6 | 45.13 |
Table 1. Power consumption of a laptop and a TV when idle and when watching Netflix.
State | Laptop (€) | TV (€) |
Idle | 0.00012 | 0.00043 |
Watching video | 0.00026 | 0.00077 |
Table 2. Economic cost of 5 minutes with a laptop and a TV idle and watching Netflix.
As you can see, streaming 5 minutes of Netflix on a TV (the more expensive of the two devices tested) is relatively cheap. Over a year, if you watch 2 hours of content every day, you should multiply the monthly subscription price by 12 and add 6.7 € for the electricity consumption.
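The costs follow from energy = power × time and cost = energy × price. The sketch below redoes that arithmetic with the measured values and the 0.20341 €/kWh price quoted above; the function names are made up for the example. The results agree with Table 2 to within the last digit shown, and the yearly figure for the TV comes out at about 6.70 €.

```python
PRICE_EUR_PER_KWH = 0.20341  # price in Spain on the day of the tests


def energy_kwh(power_w, hours):
    """Energy in kWh for a constant power draw over a period of time."""
    return power_w * hours / 1000


def cost_eur(power_w, hours):
    return energy_kwh(power_w, hours) * PRICE_EUR_PER_KWH


FIVE_MINUTES_H = 5 / 60
measurements_w = {
    ("laptop", "idle"): 7.2,
    ("laptop", "video"): 15.6,
    ("tv", "idle"): 25.5,
    ("tv", "video"): 45.13,
}

# Cost of 5 minutes in each state (compare with Table 2).
for (device, state), watts in measurements_w.items():
    print(f"{device:6} {state:5} {cost_eur(watts, FIVE_MINUTES_H):.6f} EUR")

# 2 hours of video per day on the TV, for a year.
tv_yearly_kwh = energy_kwh(45.13, 2 * 365)
tv_yearly_eur = cost_eur(45.13, 2 * 365)
print(f"TV per year: {tv_yearly_kwh:.2f} kWh, {tv_yearly_eur:.2f} EUR")
```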
Associated carbon footprint
Maybe the economic cost is low, but let’s take a look at this: according to carbonfund.org, the generation of 1 kWh produces 0.371 kg of CO2. If the TV draws 45.13 W while playing video and we watch 2 hours every day, that adds up to 33 kWh per year, which is 12.2 kg of CO2 produced by each of us just for streaming online content.
In Spain, there are approximately 19,000,000 family homes. If Netflix is watched for an average of 2 hours a day in each home, the yearly emissions generated by playing Netflix on TV, considering that 40% of the energy in Spain is generated from renewable sources, amount to 139.1 kt of CO2. Thus, maybe we don’t even notice the electricity price due to Netflix in a year, but the planet certainly does.
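The footprint estimate can be checked in the same way. The sketch below assumes, as the calculation above implies, that the renewable 40% of the energy mix emits no CO2 and that the remaining 60% emits 0.371 kg per kWh. Without intermediate rounding the national total comes out at roughly 139.3 kt; the 139.1 kt quoted above results from rounding the per-household value to 12.2 kg first.

```python
KG_CO2_PER_KWH = 0.371       # carbonfund.org figure quoted above
TV_POWER_W = 45.13           # measured while playing video
HOURS_PER_DAY = 2
HOUSEHOLDS_SPAIN = 19_000_000
NON_RENEWABLE_SHARE = 0.60   # assumption: the renewable 40% emits nothing

kwh_per_year = TV_POWER_W * HOURS_PER_DAY * 365 / 1000
kg_co2_per_household = kwh_per_year * KG_CO2_PER_KWH

# 1 kt = 1,000,000 kg
kt_co2_spain = kg_co2_per_household * HOUSEHOLDS_SPAIN * NON_RENEWABLE_SHARE / 1_000_000

print(f"{kwh_per_year:.1f} kWh and {kg_co2_per_household:.1f} kg CO2 per household per year")
print(f"{kt_co2_spain:.1f} kt CO2 per year across Spain")
```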
Solutions
While more and more companies like Netflix are investing in renewable energy for their data centers and CDNs, this is not a complete solution: green energy is still expensive and difficult to access in some countries, and not every process between the creation of the video and its arrival on our screens uses these renewable sources. In fact, most users have traditional (for instance, gas-generated) energy at home, producing dozens of kg of GHG emissions every day, as in the case of Spain.
As we have seen, on the server side, the Netflix CDNs and data centers consist of very powerful servers storing and processing large volumes of video. Similarly, the peak in energy consumption that we observed on our laptop and TV is due to the video coding algorithms, which require a heavy computational load to run, especially the compression algorithms (much more than encryption).
While video compression techniques are required to reduce the capacity and the number of servers, at present, there is a trade-off between the compression rates and the computational power required to achieve a substantial reduction: the bigger the compression, the more resources are needed. Therefore, there is a need to design lighter algorithms with the aim of:
- Obtaining high-quality videos
- Using minimal computing resources
- Reducing the energy consumed
Harmony Valley is a research project dedicated to designing these algorithms for the new and much-needed paradigm of a sustainable digital world, so everyone can enjoy the most advanced technologies without hurting the planet.
I’m Natalia Molinero Mingorance, a Telecommunications engineer with an MSc in Wireless Communications Systems. I worked at the University of Sheffield (U.K.) as a researcher, and I also worked in industry, providing services to the main European telecom carriers. In April 2022, I was awarded the first prize in a competition run by the European University of Madrid (Spain), the “Singular Alumni”, for my project proposal “New Methods for Energy Efficiency Optimization: Information, generation, storage, transmission and reception”, which is the first stage of Harmony Valley. Harmony Valley is a research project (expected to become an R&D company) whose main goal is to contribute to making digital communications sustainable. In this first stage, the research is focused on finding new video compression algorithms to save computing power. The long-term goal is to provide new methods and solutions to the different standards bodies (radiocommunications, network protocols, etc.) and private multimedia companies, so that energy consumption becomes a parameter as important as quality when designing a new technology. I created a social media profile to make the research and the topics it addresses accessible to everyone (https://www.instagram.com/harmonyvalley_official/)