A new dimension of sound: OZO Audio interview
We recently announced that Nokia 8 is the world’s first smartphone with OZO Audio. But what exactly is OZO Audio, and why is it so critical for content creators and end users? We sat down with Kai Havukainen, OZO Audio Product Manager, to talk about spatial audio and why it’s so important for achieving a true feeling of immersion.
What is unique about the capture technology for OZO Audio?
With OZO Audio, the recording algorithm analyzes the signals from all of the device microphones and generates a parametric representation of the audio environment. Since our experts have tuned the algorithm based on the device’s acoustic characteristics, it is able to act as your ears at that position, meaning it captures all of the sounds around the device just like your head would at that very same spot. The listening experience is very natural, without the noticeable artificial enhancements that some other spatial audio capture solutions use. Sound captured with OZO Audio is then stored in a highly data-efficient format that can be played back by any video player, even one that doesn’t support OZO Audio.
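To make the idea of a "parametric representation" more concrete, here is a minimal sketch of one textbook approach: estimating the direction a sound arrives from by comparing when it reaches two microphones. This is not the OZO Audio algorithm, whose details are not described in the interview. The function names, the two-microphone setup, and the output fields (`angle_deg`, `level_rms`) are all assumptions made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second, at roughly room temperature


def best_lag(mic_a, mic_b, max_lag):
    """Return the delay (in samples) of mic_b relative to mic_a.

    A positive result means the sound reached mic_a first. Lags are
    tried in order of increasing magnitude so that ties (for example,
    a silent frame) resolve to zero delay.
    """
    n = len(mic_a)
    best, best_score = 0, float("-inf")
    for lag in sorted(range(-max_lag, max_lag + 1), key=abs):
        # Cross-correlation of the two signals at this lag.
        score = sum(mic_a[i] * mic_b[i + lag]
                    for i in range(n) if 0 <= i + lag < n)
        if score > best_score:
            best, best_score = lag, score
    return best


def estimate_direction(mic_a, mic_b, spacing_m, sample_rate):
    """Describe one frame with two parameters: arrival angle and level.

    The angle is measured from the line perpendicular to the microphone
    pair; positive angles point toward the mic_a side.
    """
    # The largest physically possible delay is spacing / speed of sound.
    max_lag = math.ceil(spacing_m / SPEED_OF_SOUND * sample_rate)
    lag = best_lag(mic_a, mic_b, max_lag)
    # Path-length difference divided by spacing equals sin(angle).
    ratio = lag / sample_rate * SPEED_OF_SOUND / spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding overshoot
    rms = math.sqrt(sum(s * s for s in mic_a) / len(mic_a))
    return {"angle_deg": math.degrees(math.asin(ratio)), "level_rms": rms}
```

A real capture pipeline would work per frequency band, use more than two microphones, and account for the acoustic shadow of the device body, which is the kind of per-device tuning mentioned above.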
Why is 360 audio key for user generated content on mobile devices and for VR?
We want to enable any device to capture audio in its most natural form. Spatial audio is particularly important to achieve a true feeling of immersion in VR (some even say that it’s half the experience!). By enabling high quality spatial audio capture in consumer devices, we give mobile phone users the ability to capture audio that helps convey a feeling of immersion similar to what the best VR cameras provide.
How/when did OZO Audio come to be? Why was it created?
The OZO Audio spatial audio capture technology was initially created for a variety of use cases across various Nokia products. The core technology has actually been developed over the last 20 years, and today we are able to apply it across devices of any size, shape, and number (two or more) of integrated microphones.
As far as recording, how is OZO Audio different from “normal” audio recording methods?
OZO Audio is a spatial audio technology. In layman’s terms, it enables capture and playback of audio so that you can hear all the sounds around you coming naturally from the direction where they really were when recorded. For content creators, capturing OZO Audio is like any other audio recording technique. You press record and our algorithms take care of the rest!
For those using it on a smartphone or camera, OZO Audio technology also lets you focus the audio recording area, for example to the front or back side of the device. We call this “Audio Focus.” Depending on the particular device, a user can simply tap a button to change the recording mode between normal spatial audio and Audio Focus.
What does all of this mean for content creators?
Many devices on the market today still have relatively poor audio capture capabilities or mono sound. Whether you are capturing a quick video from a family vacation to share on social media or something more post-produced, we believe that truly immersive audio always takes the watching and listening experience to the next level.
Think about recording a scene outdoors with a VR or 360 camera where something unexpected happens behind the viewer, opposite the direction they are looking. Without spatial audio, the user doesn’t really know where the sound is coming from and therefore where to turn their head (or even if they should turn). OZO Audio also natively supports head-tracking if your video content is 360 video.
What does this mean for consumers of VR content? How is OZO Audio superior to normal audio?
With OZO Audio, users get a more natural listening experience. Put on headphones and a head-mounted display and you will really feel like you are in the scene. The spatial, or surround sound, experience in OZO Audio is created by binaurally processing the playback to your headphones. This means that you don’t hear sounds inside your head like with typical sound recordings, but coming from outside the head like they do in real life. Head-tracking for audio is naturally supported for VR content as well.