MPAI: a new organization for AI-based data coding standards

The most fascinating aspect of media coding, considering how deeply media communication pervades life today, is its social and cultural impact. Think of mp3, the audio file format that we are all familiar with, and that many of us use daily to listen to music. The mp3 format enabled a wide variety of new applications, because sending, exchanging, and embedding audio now required a few megabytes instead of 50 megabytes or more, even for a relatively short track. Developing a new file format is a technical challenge, but its consequences are far-reaching.
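To make the size comparison concrete, here is a back-of-the-envelope sketch in Python. The figures are assumptions chosen for illustration, not values from the article: uncompressed CD-quality audio (44.1 kHz, 16 bits, stereo), a constant mp3 bitrate of 128 kbit/s, and a five-minute track.

```python
# Assumed parameters (illustrative, not taken from the article):
SAMPLE_RATE_HZ = 44_100     # CD-quality sampling rate
BITS_PER_SAMPLE = 16        # CD-quality sample depth
CHANNELS = 2                # stereo
MP3_BITRATE_BPS = 128_000   # a common constant mp3 bitrate
DURATION_S = 5 * 60         # a five-minute track


def size_megabytes(bitrate_bps: float, seconds: float) -> float:
    """Size in megabytes (10**6 bytes) of a constant-bitrate stream."""
    return bitrate_bps * seconds / 8 / 1_000_000


# Bitrate of the uncompressed signal: samples/s * bits/sample * channels.
pcm_bitrate_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS

pcm_mb = size_megabytes(pcm_bitrate_bps, DURATION_S)
mp3_mb = size_megabytes(MP3_BITRATE_BPS, DURATION_S)

print(f"Uncompressed: {pcm_mb:.1f} MB")       # Uncompressed: 52.9 MB
print(f"mp3 at 128 kbit/s: {mp3_mb:.1f} MB")  # mp3 at 128 kbit/s: 4.8 MB
print(f"Ratio: {pcm_mb / mp3_mb:.1f}x")       # Ratio: 11.0x
```

Under these assumptions the same track shrinks from roughly 53 MB to under 5 MB, which matches the order of magnitude described above. Other bitrates or durations would change the exact numbers.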

However, even mp3 was not born in a void, and it did not simply happen to establish itself as a standard. Behind its development there was a vision, and an organization supporting that vision. Founded in 1988, the organization was called the Moving Picture Experts Group, or MPEG, and it gave its name to its standards for audio and video compression and delivery, from MPEG-1 to MPEG-21 and beyond.

Standardization matters because file formats are meant to enable the exchange of media content. If the sender and the receiver, and their technology, do not share the same format, what is transmitted is a stream of meaningless bits. For technology to support a format, the format must be adopted by every layer of the infrastructure that makes the communication possible. This requires much more than solving the technical challenge of developing a data compression algorithm: it is slow, strategic technical work, with occasional streaks of politics.

We owe the vision behind MPEG to Leonardo Chiariglione, a man who clearly understands the ultimate goal of this technical and political work. Chiariglione is an engineer, an innovator, and a leader, but most importantly he is someone who has “devoted his life to enable people to communicate through technology without obstacles”, where the obstacles are technological and often driven by politics and economics. Hence the importance of an organization like MPEG which, by introducing new standards for audio and video coding and delivery based on technological excellence, has changed communication as we knew it and has had an incalculable cultural impact over the past thirty years.

Artificial intelligence meets media coding

Today, Chiariglione is involved with a new organization called Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI). Two important keywords appear in the name: Artificial Intelligence. What’s new? Chiariglione explains that most work inside MPEG was based on calculating the statistics of the signal, which were then fed to the machines to perform their tasks. The information about the signal was known a priori, and the machines only executed what humans had programmed them to do. With AI, it is now possible and practical to feed the machines the relevant data directly. The machines “learn” from thousands and thousands of examples and extract the relevant information themselves. The more they learn, the better they execute their task.

“The match of media coding and AI was made in heaven,” says Chiariglione. Coding is still an important keyword in MPAI, but it now extends to understanding the signal besides compressing it. Applications include adding an expressive intention to a speaking voice, personalized automatic speech translation, a server-based predictive multiplayer gaming service, and processing a company’s data to predict its future performance. How are these applications achieved?

MPAI is not a research center and does not develop technology per se. Its core mission is to establish new standards and encourage their adoption. MPAI publishes “calls for technology” open to anyone across the world, in both academia and industry. It then examines the submitted technologies, selects those that perform best, and integrates them into standards. Currently MPAI is the only standards organization focusing on data coding with AI as the core technology.

“Artificial intelligence is here to stay,” predicts Chiariglione, who has proven to have an infallible sixth sense for the future of technology and society. Artificial intelligence will affect ever more aspects of our lives, and it is crucial that our future is designed by someone who understands that the ultimate goal is to connect people through technology “without obstacles”.

Federica Bressan is a researcher and science communicator. She holds two MDs in Music and Musicology and a PhD in Computer Science. The vision underlying her work concerns the co-evolution of technology and culture. As a Marie Curie and Fulbright researcher, she has published 30+ peer-reviewed articles, chaired international events, and guest edited a special issue of the Journal of New Music Research. As a communicator, she conducts video interviews, hosts the podcast Technoculture, and writes about science and society. Visit Federica's podcast at: