Toward Wave-based Sound Synthesis for Computer Animation


Advances in computer-generated imagery have brought vivid, realistic animations to life, but the sounds associated with what we see simulated on screen, such as two objects colliding, are often recordings. Now researchers at Stanford University have developed a system that automatically renders accurate sounds for a wide variety of animated phenomena.

“There’s been a Holy Grail in computing of being able to simulate reality for humans. We can animate scenes and render them visually with physics and computer graphics, but, as for sounds, they are usually made up,” said Doug James, professor of computer science at Stanford University. “Currently there exists no way to generate realistic synchronized sounds for complex animated content, such as splashing water or colliding objects, automatically. This fills that void.”

We explore an integrated approach to sound generation that supports a wide variety of physics-based simulation models and computer-animated phenomena. Targeting high-quality offline sound synthesis, we seek to resolve animation-driven sound radiation with near-field scattering and diffraction effects. The core of our approach is a sharp-interface finite-difference time-domain (FDTD) wavesolver, with a series of supporting algorithms to handle rapidly deforming and vibrating embedded interfaces arising in physics-based animation sound. Once the solver rasterizes these interfaces, it must evaluate acceleration boundary conditions (BCs) that involve model and phenomena-specific computations. We introduce acoustic shaders as a mechanism to abstract away these complexities, and describe a variety of implementations for computer animation: near-rigid objects with ringing and acceleration noise, deformable (finite element) models such as thin shells, bubble-based water, and virtual characters. Since time-domain wave synthesis is expensive, we only simulate pressure waves in a small region about each sound source, then estimate a far-field pressure signal. To further improve scalability beyond multi-threading, we propose a fully time-parallel sound synthesis method that is demonstrated on commodity cloud computing resources. In addition to presenting results for multiple animation phenomena (water, rigid, shells, kinematic deformers, etc.) we also propose 3D automatic dialogue replacement (3DADR) for virtual characters so that pre-recorded dialogue can include character movement, and near-field shadowing and scattering sound effects.


Interesting - will have the read up more.

The is a decent amount of tooling to drive animation from audio (within the animation and vfx industry) but not the other way around.


Interesting; I like the continued movement towards procedural audio, though I imagine the best solution will be a combination of manipulated samples and procedural synthesis (kind of like how with a lot of film scores the important instruments are recorded and the rest sampled/synthesized). Also super curious about procedurally synthesizing unlikely/impossible scenarios, and how this plays out in 3D spatialized scenarios like ambisonics or wave field systems.


This is really cool, and the future! I mean at least for VR this will be a requirement sooner or later…

It would be interesting to have some sort of crude realtime wave simulation, enabling infinite ways of creating instruments… :wink: