Full article

https://forum.ircam.fr/article/detail/automated-3d-audio-control-and-remixing-system-1/

Authors

Seungryeol Paik

Presented in

IRCAM Forum Workshop Seoul 2024

Date of Forum

6-8 November 2024

Publisher

IRCAM Forum

Abstract

The growing demand for spatial audio in immersive media, such as virtual reality (VR) and augmented reality (AR), highlights the need for advanced tools that allow flexible manipulation of complex sound fields. However, remixing and editing spatial audio with existing techniques remains technically challenging, especially in high-resolution formats such as 5th-order Higher-Order Ambisonics (HOA).

This ongoing research proposes a comprehensive system that uses deep learning (DL) and machine learning (ML) for source separation, trajectory tracking, and reverberation estimation in 5th-order ambisonic audio. The system aims to provide seamless manipulation of individual sound sources (from mono up to 5th-order ambisonics), their spatial trajectories, and environmental reverberation, so that specific audio elements can be flexibly exchanged, removed, or added across different spatial mixes.
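To give a sense of the scale involved in "mono up to 5th-order ambisonics", the following minimal sketch computes how many channels a full-sphere ambisonic signal carries at each order, using the standard relation (N + 1)^2. The function names are hypothetical, and the ACN channel-numbering helper assumes the common ACN convention, which the abstract does not specify; none of this is taken from the author's system.

```python
def ambisonic_channel_count(order: int) -> int:
    """Channels in a full-sphere (3D) ambisonic signal of the given order: (N + 1)^2."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def acn_index(degree: int, index: int) -> int:
    """ACN channel number for spherical-harmonic degree l and index m.

    Assumption: ACN ordering (l^2 + l + m); the abstract does not state
    which channel ordering the proposed system uses.
    """
    if degree < 0 or abs(index) > degree:
        raise ValueError("require degree >= 0 and -degree <= index <= degree")
    return degree * degree + degree + index


if __name__ == "__main__":
    # Order 0 corresponds to a single omnidirectional (mono) channel.
    for order in range(6):
        print(f"order {order}: {ambisonic_channel_count(order)} channels")
```

Under these assumptions, a 5th-order signal has 36 channels, so any separation or reverberation model operating on it has to handle 36 correlated channels rather than one or two.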

By offering detailed control over sound sources and their movements within a 3D soundfield, this system opens up new possibilities for spatial audio remixing. The ultimate goal is to develop a system that is both automated and adaptable, capable of addressing the complex audio needs of VR, AR, and other immersive media applications.