
UNSW’s 3DXLab team are exploring the potential of immersive AI-enabled 3D video.
Think back to a favourite memory – perhaps a time you visited a special place with a family member. Maybe you have a photo from that day. Now imagine being able to step into that memory in 3D and physically move through the action as it occurs. That possibility may be on the horizon thanks to a revolution in volumetric video capture supported by AI processing.
The volumetric capture arena at UNSW’s 3DXLab is one of the few facilities in Australia exploring the potential of this technology.
“We’ve built a one-of-a-kind stage in Sydney to capture volumetric video,” says Professor John McGhee, who leads the lab at UNSW’s School of Art & Design.
It’s part of a rapidly developing field, with companies across the world racing to expand its potential applications.
Multi-camera 3D capture brought to life with AI
Volumetric video uses multiple cameras to simultaneously capture motion in 3D space. The volumetric capture arena at UNSW’s Paddington campus uses an array of cameras to record motion, which is then processed using AI to create a 3D rendering of the action.
It differs from traditional 3D filming by capturing the full 3D shape and depth of objects – including people – as they move in real time through space. The recorded object or person can then be placed in a virtual environment with the viewer and appear naturally 3D. The viewer can move around the captured object, seeing it and its movements from all angles with depth. In one example from the 3DXLab, visitors can walk around a 3D recreation of a dancer as she performs and watch her from any angle.
AI-enabled naturalistic 3D video
Until recently, volumetric videos were created without AI using existing photogrammetry techniques. The resulting 3D renders were often imperfect. By using an AI-enabled process called ‘Gaussian splatting’, the team can now create much more realistic surfaces, including semi-transparent materials, improving the experience for the end user as they interact with the captured object.
“Until recently, the technology has been clunky to convert the video captured in a volumetric space. With the advent of Gaussian splatting, which is an AI-enhanced process, everything has changed. It’s all pivoting on AI, because you can’t do Gaussian splatting without AI,” John says.
Where traditional photogrammetry builds meshes based on matching points found in multiple images, a Gaussian splat functions more like a fog that uses machine learning to build the object from millions of blended ellipsoidal shapes as it moves through space and time. The technique is particularly effective at creating natural-looking 3D images because of its capacity to render shiny objects and very fine details, such as violin strings and flyaway hairs.
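The compositing idea behind those blended shapes can be sketched in a few lines. This is a toy illustration, not the lab's pipeline: real Gaussian splatting optimises millions of anisotropic 3D Gaussians with machine learning and projects their full covariances into screen space, whereas this sketch uses isotropic 2D falloffs and a single pixel to show how depth-sorted, semi-transparent splats blend into a final colour.

```python
import numpy as np

def gaussian_weight(pixel_xy, center_xy, sigma):
    """2D Gaussian falloff of a splat around its projected centre."""
    d2 = np.sum((pixel_xy - center_xy) ** 2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def composite_pixel(pixel_xy, splats):
    """Front-to-back alpha blending of depth-sorted splats.

    Each splat contributes colour weighted by its opacity, its Gaussian
    falloff, and the transmittance left by splats in front of it - this
    is what lets semi-transparent materials render naturally.
    """
    color = np.zeros(3)
    transmittance = 1.0
    for s in sorted(splats, key=lambda s: s["depth"]):
        alpha = s["opacity"] * gaussian_weight(pixel_xy, s["center"], s["sigma"])
        color += transmittance * alpha * s["color"]
        transmittance *= 1.0 - alpha
    return color

# Two overlapping splats: a red one in front of a green one.
splats = [
    {"center": np.array([0.0, 0.0]), "sigma": 1.0, "depth": 1.0,
     "opacity": 0.8, "color": np.array([1.0, 0.0, 0.0])},
    {"center": np.array([0.2, 0.0]), "sigma": 1.0, "depth": 2.0,
     "opacity": 0.8, "color": np.array([0.0, 1.0, 0.0])},
]
print(composite_pixel(np.array([0.0, 0.0]), splats))
```

Because each splat is a soft, partially transparent blob rather than a hard mesh face, fine structures like hair and strings emerge from many overlapping contributions instead of a single surface.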
“With photogrammetry, the resulting image isn’t great. In the Gaussian splat output, the quality is much better.
“The technology opens up so many opportunities and allows us to iterate quickly. AI does the heavy lifting in processing and rendering.”
John says this application of AI is good news for creatives, as it optimises and amplifies existing work and frees up time.
The use of AI to process volumetric video is developing quickly.
“We think by the end of the year, we’ll have another generation of quality, improving the speed of movement, the skin tone.
“We’re likely to see rapid acceleration over the next two to three years. Companies in China and the US are trying to grab the market. For now, to our knowledge we’re the only people in Australia who are doing this at scale.”
From lab to market
The 3DXLab team’s work primarily draws on open-source algorithms to produce the Gaussian splats from its volumetric video. It currently takes several hours using AI to process the data captured in the studio.
“The data is enormous. We’re capturing up to 90 frames per second at full 4K for each camera. You’ve got to crunch that data. That’s why AI has become so useful, because it can rapidly process the data into something that can be used. It makes it more portable, not requiring a lot of infrastructure.”
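A back-of-envelope calculation shows just how enormous that data is. Only the 90 fps and 4K figures come from the quote above; the bit depth and camera count below are illustrative assumptions, not lab specifications.

```python
# Rough raw-capture bandwidth estimate from the quoted figures
# (90 fps at 4K per camera). Bit depth and rig size are assumed.

WIDTH, HEIGHT = 3840, 2160   # 4K UHD resolution
FPS = 90                     # frames per second, as quoted
BYTES_PER_PIXEL = 3          # assumed uncompressed 8-bit RGB
NUM_CAMERAS = 50             # hypothetical rig size

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
per_camera_gb_s = frame_bytes * FPS / 1e9
rig_gb_s = per_camera_gb_s * NUM_CAMERAS

print(f"Per camera: {per_camera_gb_s:.2f} GB/s")  # about 2.24 GB/s
print(f"Whole rig:  {rig_gb_s:.1f} GB/s")
```

Even one uncompressed camera stream runs to gigabytes per second, which is why automated AI processing, rather than manual handling, is what makes the output portable.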
The lab’s work is now being commercialised through a spinout company, Radiant Jelly, which is developing a cost-effective camera for use in a multi-camera volumetric rig and accompanying software that leverages AI.
Immersive live events and revisiting the past
The depth of the market for AI-enabled volumetric video is yet to be tested. With the development of XR glasses by major tech companies in collaboration with eyewear brands, new markets for volumetric video content are likely to emerge.
Possible future applications for the technology include capturing live events such as concerts to create immersive experiences. It can also be used for simulations for education and training, including for emergency services, and for instruction in complex physical activities such as dance.
The technology also shows promise for archiving, particularly for the recording of historically significant personal stories. For consumers, it’s possible that new iterations of AI-enabled volumetric video will allow us to capture and revisit personal memories of loved ones.
“I think the authenticity of capturing someone as they actually are is really important to humans,” John says.
“I don’t think we want an avatar when we talk to our grandparents. If we capture a memory, we want to look back on how we actually looked and that experience.”
Main image: A violin performance being captured in the UNSW volumetric video arena.