BizTechReports Q&A: Alan Stein Makes Case for MPEG-H Audio in the ATSC 3.0 Standard

As the ATSC deliberates over the direction that sound in broadcasting will take under ATSC 3.0, I had a chance to catch up with Alan Stein, Vice President of Technology at Technicolor and a Technicolor Fellow, to get his take on what the MPEG-H Audio Alliance proposal has to offer. The MPEG-H Audio Alliance is an initiative led by Technicolor, Fraunhofer and Qualcomm.

We ended up exploring the role of open standards in this process and their relationship to innovation, mobility and the long-term total cost of ownership of providing next-generation audio experiences to consumers. Here is what he had to say:

Q: As you make your case for MPEG-H Audio, one thing I have heard stressed is that it is based on open standards. I understand that MPEG-H Audio is a collection of patents, but what does it mean to say that they are part of an open standard?

Alan: The MPEG-H Audio standard was developed in MPEG with several companies participating, each bringing what they believed to be the best next generation technology to be evaluated in the technical community. Each proponent of a technology had to propose particular test scenarios for the technology they championed. Eventually MPEG agreed to test scenarios and the best ones were selected and specified in the MPEG-H Audio standard.

In my opinion, it’s very different from how some others do it, where a single company may create a pseudo-standard internally without the benefit of peer review or testing against other systems, and then submit that specification document to ETSI for review and approval. Ultimately, ETSI may review it to say “this looks like a standard and we don’t see anything wrong with it.” But that’s not the same as taking technology through a more rigorous competitive approach.

In MPEG, there’s reference software, so in addition to writing documents and submitting these documents explaining what the technology is supposed to do, you also have to submit the software embodiment of that technology to a peer review process. This allows for independent verification that the technology does in fact work in the way that is described in the specification.  This is all done in a publicly-available manner to verify what’s in the standard.

I believe this is a huge distinction vs more proprietary technologies.  Technologies based on open standards allow vendors who want to create their own implementations — for example companies that design their own chipsets or an independent software implementation of that standard — to do so in a straightforward manner. By contrast – in what I’ll call “the pseudo-standard approach” – the community may be able to get a software development kit (SDK) from the single company that developed a specification.  But there is no way to know whether that software actually matches the standard document that is the specification. For example, imagine a large television manufacturer creates their own software stack. They have to test it against the single vendor’s implementation. But if something is wrong, how can you determine who’s at fault, and how can you fix it?

This scenario could lead to substantial delays in rolling out new products.

With an open standard and reference software always available, many people may develop their own implementations and cross-test and cross-verify the system. In so doing, the ecosystem that supports the open standards based technology grows.

This is what happens in video. The broadcast industry, the cable industry, the satellite TV industry, and even the over-the-top video industry all use open video standards. They are all able to get encoders, decoders, splicers and other equipment from multiple vendors who all adhere to the standard. This is how it works in video, and we believe this is the best approach for audio as well. It is a key attribute of the MPEG-H Audio Alliance proposal.
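The cross-testing Alan describes, in which an independent decoder is checked against the output of the publicly available reference software, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name `conforms`, the per-sample comparison, and the tolerance value are hypothetical and are not taken from the MPEG-H conformance specification, which defines its own criteria.

```python
from typing import Sequence


def conforms(reference: Sequence[float],
             candidate: Sequence[float],
             tolerance: float = 1e-4) -> bool:
    """Return True if the candidate decoder output matches the reference.

    Hypothetical check: both outputs must have the same number of PCM
    samples, and every sample must agree within `tolerance`. The tolerance
    is an illustrative assumption, not a value from any standard.
    """
    if len(reference) != len(candidate):
        return False
    return all(abs(r - c) <= tolerance for r, c in zip(reference, candidate))


# Made-up sample values standing in for decoded audio from the same bitstream.
reference_output = [0.0, 0.5, -0.25]
vendor_output = [0.0, 0.50001, -0.25]
print(conforms(reference_output, vendor_output))
```

Because the reference output is available to everyone, any vendor can run this kind of comparison without depending on a single supplier's SDK.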

Q: It occurs to me that one of the interesting discussions that has been covered as the ATSC moves forward, is that the decision may be about much more than broadcast. I just saw surveys come out from different sources that streaming programming — and specifically streaming programming on mobile devices — has now surpassed at-home, traditional, programmed television experiences. To what extent do those standards play a role in ensuring a consistent experience as you go from the living room, to a tablet to a smartphone or other mobile device?

Alan: That’s a great point. The mobile experience is central to ATSC 3.0’s ambition. The broadcasters do not just want to broadcast video over the air; they want to augment that with over-the-top. For example, they want the ability to have a second audio language streamed over broadband, as opposed to being only broadcast over the air. ATSC 3.0 is being built for your living room television and for your mobile device. The physical layer, which is about to be ratified as a candidate standard specification, is based on technology that enables mobile reception, as we have twice demonstrated during our tests with Sinclair in Baltimore.

MPEG-H Audio Alliance members have deployed the most successful codecs for the mobile space. It is estimated that there are more than 8 billion implementations of MP3 and AAC in devices around the world. We have been leaders in audio technology for mobile for more than a decade. We think that experience and expertise will be important to the ATSC 3.0 standard.

Q: When we talk about standards, the case is often made that they lower the total cost of ownership and have a positive economic impact over the long haul. Is that the case with MPEG-H Audio, and does it matter?

Alan: Yes, I think it matters a lot. It matters to consumer electronics manufacturers who want to keep the cost of their devices down. It matters to independent broadcasters and the big networks, who have experienced operational issues and costs due to proprietary technology entering their workflows and causing quite a bit of pain over the years. We believe MPEG-H Audio is very competitively priced, but beyond that, we believe that the ecosystem that is created by the open standard actually causes price competition at every stage of the value chain.

We think that’s the way the world should work. And that lowers total cost of ownership. If you have a core technology provider imposing themselves in every step of the value chain and instituting service costs to everyone, obviously the total cost of ownership goes up.

Q: Does it lower the cost of ongoing innovation? So as the technology evolves—as it does rapidly in today’s environment—does the fact that you have an open standard lay the foundation for interoperability testing and the absorption of new technology, in a more rapid and fluid, integrated manner?

Alan: Yes, the open standard is key to adoption. If you think about what has to happen in order to ensure that you’ve created an interoperable system, you need some sort of test. But should those tests all come from the core technology provider? Not necessarily.

Do we have a set of test bitstreams for MPEG-H Audio? Sure we do, but note that there’s nothing preventing someone else from creating other test bitstreams and selling them in the marketplace. Again, this is how it works in video. There are multiple vendors for test equipment, there are multiple vendors for test software, and having multiple participants in the market obviously creates competition and lowers the total cost of ownership.

Q: As we wrap up, I wanted to ask you what issues you believe are at stake in this decision? What do you think should be on the minds of the folks that will be casting their votes and making a decision on the future of next-generation audio technology?

Alan: Start with codec performance, just the basic bitrates at which one can run audio services. With MPEG-H, MPEG audio codecs remain the best-performing audio codecs on the planet in terms of high quality at low bitrates. Beyond that, there are interactive features such as audio objects that allow dialog from announcers at a sporting event to be sent separately from ambient audio. So you will be able to interactively mix the crowd and announcer volumes to your liking. You might want Spanish-language announcers or to select different announcers for home and away teams. Those sorts of interactive features are very important.

Beyond that, future-proofing the system with next-generation technology such as scene-based audio (Higher Order Ambisonics, or HOA as we call it in the technical community) is very important. HOA is an exciting new technology that has been proven in MPEG, and will revolutionize the audio industry. And it’s already built into the MPEG-H Audio standard. We believe that having that path to future technologies is another of the many key advantages of MPEG-H Audio.
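The interactive mixing Alan mentions, where announcer dialog and crowd ambience arrive as separate audio objects and the listener sets their relative levels, can be sketched roughly as follows. This is only an illustration under stated assumptions: the function `mix_objects`, the object names, and the plain per-sample gain model are hypothetical and do not represent the MPEG-H renderer or its API.

```python
from typing import Dict, List


def mix_objects(objects: Dict[str, List[float]],
                gains: Dict[str, float]) -> List[float]:
    """Sum separately delivered audio objects with listener-chosen gains.

    Hypothetical model: each object is a list of mono samples, and an
    object without an entry in `gains` is mixed at unity gain.
    """
    length = max((len(samples) for samples in objects.values()), default=0)
    mixed = [0.0] * length
    for name, samples in objects.items():
        gain = gains.get(name, 1.0)
        for i, sample in enumerate(samples):
            mixed[i] += gain * sample
    return mixed


# Made-up sample values: the listener turns the announcer down and the crowd up.
objects = {"announcer": [1.0, 0.5], "crowd": [0.2, 0.2]}
listener_gains = {"announcer": 0.5, "crowd": 2.0}
print(mix_objects(objects, listener_gains))
```

Selecting a Spanish-language or home-team announcer would, in this simplified picture, amount to swapping which dialog object is passed into the mix.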

About Lane Cooper

Lane Cooper has over 20 years of experience as a researcher, reporter and editor analyzing the business and technology industry. On average, Lane meets with 600 CIOs and senior enterprise executives every year to understand the impact of evolving technological developments on organizations of all sizes across all industries. He has organized and moderated many live and online events and works with a variety of high-tech organizations to ensure that information is presented in a context that is useful to an audience of sophisticated technology buyers and implementers. He is a Contributing Content Partner to CIO Magazine, Network World and Computerworld. Other news services and magazines that have carried his by-line include: Voice Report, Network World, Byte Magazine, TechWeb, Optimize, Information Week, Telephony, Communications Week, ComWeek International, and Enterprise Systems Journal. He lives in Washington DC, where he is the Editorial Director of BizTechReports, an independent reporting agency that analyzes user trends in business technology.