Codec conundrum: Television needs a new unified coding framework that can also cut complexity & cost

By Dr Amal Punchihewa
When discussing the future of television, it is important to recognise that one of the key technologies shaping both television distribution and consumption is video coding. Broadcast and media services rely on innovations in communication technology and compression technology. Communication technology expands the transmission bandwidth, while compression technology reduces the data size to fit within the available bandwidth.
Communication and compression technologies complement each other, but they sometimes also compete. For example, when communication speeds increase through fibre and 5G, the relative importance of compression technology decreases.
However, video codec standardisation must be well aligned with communication technologies.
As an IEEE-BTS Distinguished Lecturer and a member of the Media Technology Network of the Institution of Engineering and Technology (IET), I recently gave a two-hour lecture to undergraduate students at Peradeniya University in Sri Lanka who are studying image and video coding. The key topics covered included what drives video coding, the evolution of codecs, quality evaluation and, most importantly, understanding the problem a codec is trying to solve.
The students themselves asked several pertinent questions:
- “How do GPU and CPU manufacturers adapt to evolving video codec standards, especially as codecs become more complex?”
- “Since new processors have dedicated hardware for encoding and decoding, how do they handle these changes over time, especially in mobile devices?”
- “What are the key challenges in maintaining performance, efficiency, and compatibility?”
This article will provide the answers to the questions above as well as discuss and analyse the views expressed by some speakers at a recent ITU workshop on the future of video coding.
In the 1980s, I started my career as a computer engineer and had the opportunity to work with computers offering advanced graphics capabilities, including Commodore’s Amiga. Nicolai Otto, Project Management Director at MainConcept and a speaker at the ITU workshop, mentioned how one of the first products offered by his company was a motion JPEG decoder for Commodore’s Amiga in 1993.
As a signal processing expert with a focus on image and video coding, I have followed with interest how codecs have been deployed and implemented through their various life cycles, and how they have evolved over the past four decades.
Samsung, for example, says it makes consumer devices for every conceivable purpose, from mobile phones to large-screen TVs. If a codec is used in the market, Samsung needs to support it across all of those devices; hence, the company has been contributing to and investing in codec standardisation.
As the screen sizes of television displays continue to increase, more pixels are needed, and more efficient codecs are required to manage the ever-increasing data that comes with higher spatial definition. Each new generation of codec therefore typically achieves a bit rate reduction of 30% to 50% over its predecessor.
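To put those percentages in perspective, the short sketch below compounds an assumed 40% saving per generation, starting from a hypothetical 16 Mbps H.264/AVC UHD-1 stream; the figures are illustrative rather than benchmark results.

```python
# Illustrative only: compounding an assumed 40% per-generation saving
# (within the 30-50% range quoted above). The 16 Mbps starting point
# for H.264/AVC at UHD-1 is a hypothetical figure, not a measurement.
bitrate_mbps = 16.0                 # assumed H.264/AVC bit rate for a UHD-1 stream
reduction_per_generation = 0.40     # assumed saving per codec generation

for codec in ("H.265/HEVC", "H.266/VVC", "H.267 (next generation)"):
    bitrate_mbps *= (1.0 - reduction_per_generation)
    print(f"{codec}: ~{bitrate_mbps:.1f} Mbps")
```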
However, as of 2025, the market for Ultra High Definition (UHD) television has not fully materialised. UHD-1, often referred to as 4K, has seen limited market adoption, and the rollout of UHD-2, or 8K, is progressing more slowly than expected.
Along with the increase in spatial definition, factors such as an extended colour gamut (Wide Colour Gamut, or WCG), high dynamic range (HDR) and high frame rate (HFR) also contribute to a higher data rate. Additionally, film grain, a common challenge in both legacy and modern content, requires a substantial number of bits to reproduce accurately.
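A rough, back-of-the-envelope calculation shows why these factors matter. The sketch below estimates uncompressed data rates under assumed conditions (4:2:0 chroma subsampling, 10-bit samples); the exact figures vary with chroma format, bit depth and frame rate, but the order of magnitude explains why efficient compression remains essential.

```python
# Rough estimate of uncompressed video data rates, assuming 4:2:0 chroma
# subsampling (1.5 samples per pixel on average) and 10-bit samples for
# HDR/WCG content; metadata and interface overheads are ignored.
def raw_rate_gbps(width, height, fps, bits_per_sample=10, samples_per_pixel=1.5):
    return width * height * fps * bits_per_sample * samples_per_pixel / 1e9

print(f"HD 1080p50:      {raw_rate_gbps(1920, 1080, 50):.1f} Gbit/s")
print(f"UHD-1 (4K) p50:  {raw_rate_gbps(3840, 2160, 50):.1f} Gbit/s")
print(f"UHD-2 (8K) p100: {raw_rate_gbps(7680, 4320, 100):.1f} Gbit/s")
```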
Today, content is consumed on both large television displays and portable devices such as mobile phones, tablets and laptops. These portable devices do not require 8K or 16K spatial resolution, as, for their screen sizes and typical viewing distances, the eye cannot resolve such fine detail.
Recently, video services have expanded through over-the-top (OTT) platforms, with one of the biggest challenges being the growth of Free Ad-Supported Television (FAST) services, which offer free video content in return for the viewing of advertisements.
FAST services have been growing, but one distinctive aspect is that they typically use low resolutions, such as 720p, and older codecs, such as H.264, because they are designed to operate at very low cost.
Video codecs such as MPEG-2, H.264 and H.265 have already been widely implemented. Some codec implementers are now working on H.266 (Versatile Video Coding) and exploring the viability of the next-generation video codec, H.267.
Nicolai, during his presentation at the ITU workshop, segmented the codec life cycle into four phases: Standardisation, Implementation, Adoption, and Application.
The first phase involves the initial period with standardisation organisations like ITU and ISO, lasting about three to five years, including the research phase.
Next, the document is handed over to companies for implementation, marking the implementation phase. Developing a fully optimised encoder can take significantly longer than two to three years, but within that time the standard specification typically evolves into a minimum viable product that manufacturers can begin shipping to interested customers.
Customers then enter the adoption phase alongside manufacturers. During this phase, manufacturers promote the codec, assess its value, evaluate what it can deliver, and determine its potential for their anticipated applications.
After a few more years, the codec hopefully reaches the market and transitions into the application phase. This is when users truly figure out how to utilise the codec, and it can last for a long time, as we have seen with Advanced Video Coding (AVC) or H.264.
Recently, the industry has seen a decline in adoption, rising costs of patent pools and royalties, as well as increased latency and complexity in codecs. It is crucial to evaluate whether new codecs provide sufficient coding efficiency to offset these negative trends.
Considering the total time required for a codec to be adopted and reach the application phase, Nicolai suggests parallelising the codec specification process with reference implementations to reduce time to market and improve predictability.
Another suggestion made during the ITU coding workshop was to adopt a unified framework. While creating a unified design is challenging, it is considered valuable by the standards community. Exploring a unified framework to integrate various coding tools proposed by different proponents into a single design could reduce complexity and cost. A historical example of this is the removal of complex, low-value techniques during the H.264 standardisation process, which highlights the benefits of unifying tools.
With several video standards emerging, conducting subjective video quality assessments, particularly when comparing codecs against one another, becomes challenging. As a result, there has been a shift towards using objective video quality assessment techniques.
Peak Signal-to-Noise Ratio (PSNR) has traditionally been used as a metric for image and video quality assessment. However, as a purely objective measure, it does not correlate strongly with human perception or subjective evaluation. The video and media industry therefore needs quality metrics that place greater emphasis on what viewers actually perceive.
Because PSNR correlates poorly with perceived quality, better proxies for subjective video quality evaluation are required. As a result, perception-based metrics such as Video Multimethod Assessment Fusion (VMAF) are becoming more important than PSNR.
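Part of PSNR’s longevity is that it is trivial to compute. The minimal sketch below applies its standard definition to two 8-bit frames (synthetic placeholders here); perception-based metrics such as VMAF instead combine several features through a trained model and are normally computed with dedicated tools such as libvmaf.

```python
import numpy as np

def psnr(reference: np.ndarray, distorted: np.ndarray, max_value: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB between two frames of the same shape."""
    mse = np.mean((reference.astype(np.float64) - distorted.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10((max_value ** 2) / mse)

# Hypothetical 8-bit luma frames, for demonstration only.
ref = np.random.randint(0, 256, (1080, 1920), dtype=np.uint8)
noise = np.random.randint(-3, 4, ref.shape)
dist = np.clip(ref.astype(np.int16) + noise, 0, 255).astype(np.uint8)
print(f"PSNR: {psnr(ref, dist):.2f} dB")
```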
When codecs are used for live sports events, they are just one component of the overall process. The distribution chain for video, especially for live sports events using adaptive bit rate streaming, presents its own set of challenges that need to be solved concurrently.
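As a simplified illustration of the adaptive bit rate principle, the sketch below picks the highest rendition from a hypothetical encoding ladder that fits the currently measured throughput with a safety margin; real players combine throughput estimates with buffer occupancy and many other signals, and ladders are tuned per title and per service.

```python
# Hypothetical ABR encoding ladder and a naive rendition-selection rule;
# the labels and bit rates below are illustrative assumptions.
LADDER = [           # (label, bit rate in kbit/s), ordered high to low
    ("1080p", 6000),
    ("720p", 3500),
    ("540p", 2000),
    ("360p", 1000),
]

def pick_rendition(measured_throughput_kbps: float, safety_margin: float = 0.8):
    """Return the highest-bit-rate rendition that fits the available bandwidth."""
    budget = measured_throughput_kbps * safety_margin
    for label, bitrate in LADDER:
        if bitrate <= budget:
            return label, bitrate
    return LADDER[-1]               # fall back to the lowest rendition

print(pick_rendition(4800.0))       # -> ('720p', 3500)
```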
Modern video technologies designed for widespread mass-market deployment require solutions that are practical, robust, low-power and low-cost, while also offering state-of-the-art compression capabilities. The entire processing chain, encompassing pre-processing, encoding, storage, transmission, decoding, post-processing, analysis and repurposing of video content, must be considered, ensuring it can support high resolution, high frame rates and high dynamic range.
While more compression is valuable to save bandwidth and storage, it should not come at the cost of an overly complex implementation. Additional coding tools can increase the complexity of the coding process at a rate that exceeds the gains in coding efficiency. These diminishing returns lead to a more complex and expensive codec, primarily because of the royalties tied to its patent pools, which in turn negatively impacts adoption.
Decoding should be CPU friendly, avoid unnecessary complexity, and should not excessively drain battery life on mobile and portable devices.
Ongoing research is focused on enhancing video compression using AI tools. However, as AI may introduce additional computations and complexities, it is important to ensure that any techniques incorporated are justifiable. The development of next-generation video codecs should prioritise affordability, perceived quality, and CPU efficiency.
Bitrate saving should not be the primary motivation for developing a new codec standard, as current networks are already capable of delivering UHD-1 (4K) and UHD-2 (8K) content.
The primary focus for next-generation video codec development should be on perceived quality and CPU efficiency. Additionally, further research is needed on low latency scenarios, such as live sports, gaming, and video calls, to address new requirements for future video coding standards.
Other priorities should include reducing licensing costs and lowering complexity, while maintaining previous-level coding gains. Achieving significant coding gains in commercial implementations presents challenges, so focusing on more attainable coding improvements could be a more realistic target.