Nokia is leading the way towards immersive media in MPEG standardization
The digital world is quickly evolving towards more immersive experiences, such as virtual reality (VR), mixed reality (MR), and the metaverse. Virtual meetings in 3D, our favorite artists performing in our own living rooms as lifelike holograms, and immersive virtual tours of our planned holiday destinations are all exciting examples of how the advancement of digitization will soon impact our lives.
However, it is important to remember that none of this would be possible without standardized formats, which are the result of decades of intensive research and vast collaboration among inventors and developers. Without these open standards, devices and systems from different vendors would not work together, and much of the value that modern technology has brought to the world would not have been created.
The Moving Picture Experts Group (MPEG) has been renowned since the 1980s for developing numerous audio, video, and media systems standards. In recent years, MPEG has developed the MPEG-I family of standards for immersive media. This blog post gives an overview of MPEG-I and of how Nokia has driven its standardization.
Immersive experiences combine the physical and virtual
Immersive media is an umbrella term used to describe technologies that combine virtual and physical environments, enabling interaction between humans and machines. It involves a spectrum of dimensions and nuances, but a coarse categorization can be made into mixed reality (MR) and virtual reality (VR).
MR fuses the physical and virtual worlds in a seamless manner. For example, people who are far away can join a physical gathering as hologram-like characters and interact as if everyone were physically present. Such an experience requires very low end-to-end latency, which can be achieved by 5G-Advanced mobile networking technology.
VR immerses the user in a world that has been captured or synthetically created in advance and enables services such as virtual tourism. VR experiences may range from three-degrees-of-freedom omnidirectional media, where the user can look around in a virtual environment from a single viewing position, to six-degrees-of-freedom volumetric media, where the user can also freely select their viewing position. Latency requirements of VR are often less stringent than those of MR.
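The difference between three and six degrees of freedom can be made concrete with a small data structure. The sketch below is purely illustrative: the class and field names (`ViewerPose`, `yaw_deg`, `position_m`, and so on) are assumptions made for this example and are not taken from any MPEG-I specification.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ViewerPose:
    """Hypothetical viewer pose; names are illustrative, not from MPEG-I."""

    # Orientation: the three rotational degrees of freedom.
    yaw_deg: float
    pitch_deg: float
    roll_deg: float
    # Position: the three translational degrees of freedom.
    # None means the viewing position is fixed by the content (3DoF).
    position_m: Optional[Tuple[float, float, float]] = None

    @property
    def degrees_of_freedom(self) -> int:
        return 3 if self.position_m is None else 6


# Omnidirectional media: the user can only turn their head.
omnidirectional = ViewerPose(yaw_deg=45.0, pitch_deg=-10.0, roll_deg=0.0)

# Volumetric media: the user can also move to a new viewing position.
volumetric = ViewerPose(30.0, 0.0, 0.0, position_m=(1.2, 0.0, -0.5))
```

In this sketch, a player for omnidirectional content would only ever read the three orientation fields, while a player for volumetric content would use all six values.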
MPEG immersive media standards
The ISO/IEC 23090 standard family, also known as MPEG-I, contains media compression and systems standards for the representation and carriage of immersive media. The MPEG-I standards facilitate a wide variety of interoperable VR and MR services. As of today, the following ten International Standards of this family have been published or technically finalized, i.e., have reached the Final Draft status:
- Part 2: Omnidirectional media format (OMAF)
- Part 3: Versatile video coding (VVC)
- Part 5: Visual volumetric video-based coding (V3C) and video-based point cloud compression (V-PCC)
- Part 6: Immersive media metrics
- Part 7: Immersive media metadata
- Part 8: Network based media processing (NBMP)
- Part 9: Geometry-based point cloud compression (G-PCC)
- Part 10: Carriage of visual volumetric video-based coding data
- Part 12: MPEG immersive video (MIV)
- Part 14: Scene description
OMAF specifies the media format for coding, storage, delivery, and rendering of omnidirectional media, including video, audio, images, and timed text. It is mainly targeted at facilitating interoperability of VR content, devices, and services. Among other things, OMAF specifies metadata for 360° image and video files so that all entities from content authoring to playback can use the same VR metadata and are interoperable with each other. MPEG-I Part 7 collects some OMAF metadata for potential usage beyond file storage and streaming and is planned to be extended towards MR.
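To illustrate why shared 360° metadata matters, consider the equirectangular layout, a common way of storing a full sphere in a rectangular picture. Every entity in the chain must agree on how a viewing direction maps to a pixel. The following is a minimal sketch of such a mapping; the function name and the sign convention (azimuth increasing to the left, elevation increasing upwards, picture centre at direction (0, 0)) are assumptions made for this example, and the normative definitions are those in the OMAF specification itself.

```python
def direction_to_erp_pixel(azimuth_deg, elevation_deg, width, height):
    """Map a viewing direction to (column, row) in an equirectangular picture.

    Assumed convention for this sketch: azimuth in [-180, 180] increases to
    the left, elevation in [-90, 90] increases upwards, and direction (0, 0)
    lands on the picture centre.
    """
    if not -180.0 <= azimuth_deg <= 180.0:
        raise ValueError("azimuth must be within [-180, 180] degrees")
    if not -90.0 <= elevation_deg <= 90.0:
        raise ValueError("elevation must be within [-90, 90] degrees")
    column = (0.5 - azimuth_deg / 360.0) * width
    row = (0.5 - elevation_deg / 180.0) * height
    return column, row


# Looking straight ahead hits the centre of a 3840x1920 picture.
centre = direction_to_erp_pixel(0.0, 0.0, 3840, 1920)

# Looking 90 degrees to the left and straight up.
left_up = direction_to_erp_pixel(90.0, 90.0, 3840, 1920)
```

If the authoring tool and the player assumed different conventions, the same file would be rendered rotated or mirrored, which is exactly the kind of mismatch that standardized metadata avoids.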
VVC is the latest video coding standard, with superior compression performance. While VVC suits any type of video content, it was developed with a specific focus on 360° video. Moreover, VVC can achieve very low end-to-end delay, since all VVC decoders support a feature known as Gradual Decoding Refresh (GDR). I introduced VVC in my previous blog and subsequently also discussed its benefits over earlier video coding formats.
V3C is the base format for patch-based volumetric video representations. Any video codec, such as VVC, can be used as the underlying compression technology for V3C. The V-PCC format builds on V3C and is used for compression of three-dimensional (3D) moving point clouds, such as a lifelike 3D capture of a person. More information on V3C and V-PCC is available in the blog by Sebastian Schwarz. Another format using V3C as its basis is the MIV standard, which enables six-degrees-of-freedom omnidirectional video. To complement the V3C, V-PCC, and MIV compression formats, MPEG-I Part 10 specifies their file storage and streaming carriage. While V-PCC can use any video codec as its underlying compression technology, G-PCC specifies a compression format dedicated to point clouds.
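The reason a video codec can compress 3D content at all is that the 3D data is first turned into 2D pictures. The toy sketch below shows that idea in its simplest form: points are projected onto a single plane, producing an occupancy map (where a point landed) and a depth map (how far away it was). It is a greatly simplified illustration and not the V3C or V-PCC algorithm, which works with multiple patches and projection directions; the function name and data layout are assumptions made for this example.

```python
def project_points_to_maps(points, width, height):
    """Orthographically project integer 3D points onto the XY plane.

    Returns (occupancy, depth): occupancy[y][x] is 1 where a point landed,
    and depth[y][x] holds the smallest z (the nearest point) at that pixel.
    """
    occupancy = [[0] * width for _ in range(height)]
    depth = [[0] * width for _ in range(height)]
    for x, y, z in points:
        if not (0 <= x < width and 0 <= y < height):
            continue  # point falls outside the projection window
        if occupancy[y][x] == 0 or z < depth[y][x]:
            occupancy[y][x] = 1
            depth[y][x] = z
    return occupancy, depth


# Four points; two of them share the pixel (1, 0), so the nearer one wins.
points = [(0, 0, 5), (1, 0, 7), (1, 0, 3), (2, 1, 9)]
occupancy, depth = project_points_to_maps(points, width=3, height=2)
```

The resulting 2D maps are ordinary pictures, so a sequence of them can be fed to a video encoder frame by frame.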
Parts 6 and 8 of MPEG-I help create network-enabled immersive services. NBMP provides interfaces to set up media processing pipelines in the network. MPEG-I Part 6 specifies quality metrics for immersive media, which may be used for measuring and optimizing a service delivered over a network.
While MPEG leads the development of standardized audio-visual compression technologies, there are other organizations developing different aspects of VR and MR. The Khronos Group has published the glTF format for representing 3D scenes and models. MPEG-I Part 14 specifies how MPEG media codecs and MPEG container formats are integrated into a glTF model.
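Because glTF is a JSON-based format, the integration can be pictured as additional, prefixed extension entries in an otherwise ordinary glTF document. The sketch below lists the MPEG-prefixed extensions that a document declares in its `extensionsUsed` array, which is a standard glTF property. The sample document is invented for this example, and the extension name `MPEG_media` is used here only as an illustrative assumption; the exact names and their contents are defined in MPEG-I Part 14.

```python
import json

# Invented sample document; "MPEG_media" is an illustrative assumption.
gltf_text = """
{
  "asset": {"version": "2.0"},
  "extensionsUsed": ["MPEG_media", "KHR_materials_unlit"],
  "scenes": [{"nodes": []}]
}
"""


def mpeg_extensions(gltf_document):
    """Return the sorted MPEG-prefixed extension names a glTF document uses."""
    declared = gltf_document.get("extensionsUsed", [])
    return sorted(name for name in declared if name.startswith("MPEG_"))


document = json.loads(gltf_text)
found = mpeg_extensions(document)
```

A player that does not recognize the MPEG-prefixed entries can detect that up front from this list, rather than failing partway through loading the scene.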
Nokia is the leader in MPEG immersive media standards
Nokia is committed to licensing its patents essential to the implementation of MPEG-I standards under reasonable and non-discriminatory (RAND) terms. We believe in a fair licensing approach that strikes a balance between the needs of those who develop and contribute technologies to standards and those who implement them.
According to the ISO patent declaration database, 36 companies or organizations have provided patent statement and licensing declarations against the published or technically finalized MPEG immersive media standards.
Nokia has provided patent statement and licensing declarations against 8 out of the 10 published or technically finalized MPEG-I standards. This exceeds the number of standards against which any other company or organization has declared by a clear margin.
Nokia’s patent statements against the MPEG-I standards are a clear indication that our research and standardization efforts on immersive media have the broadest coverage among all participating companies and organizations. Furthermore, Nokia's innovations have been widely incorporated in the MPEG immersive media standards.
But the work does not stop. The MPEG-I standard family keeps on evolving. For example, the Immersive Audio standard project has recently entered its collaborative standardization phase, where Nokia will once again play a leading role. We continue our extensive research, technology development, and standardization efforts on immersive media to help create a more sustainable, more efficient, and more enjoyable world.