Making Surround Sound & Quadraphonic Music

Is anyone here producing music in surround sound?

I’ve got the hardware (quad monitors, a 5.1 hi-fi system, binaural mics, etc.) and the software (Pure Data and Reaper). However, I’m stumped on what format to master to, and how to play it back and distribute it.

Any tips, advice, ideas etc. appreciated.

2 Likes

I work in surround a good deal. Depending on the circumstances, I sometimes use Pro Tools (I have an older version with a production package that allowed for surround) or Logic, or a kind of cheat system in Ableton - though if I learned to use https://mntn.rocks/ it could allow more functional surround from Ableton.
Since you’re using Reaper, this might be a useful tutorial:
https://www.youtube.com/watch?v=4EUm18qHZa4
Generally for format, 24-bit/48 kHz.
For playback, if it’s an installation, I’ll either burn a DVD-Audio disc (a very obsolete format, but it allows six channels of high-resolution playback), or a DVD with Dolby, or play from a Mac Mini and a multi-channel interface.
Another playback option is a Tsunami from Sparkfun. Or, from Germany, there is a thing called a WavPlayer8
http://www.memsolution.com/
The thing I still struggle with is creating good sub tracks.
I hope something there is useful…

7 Likes

Thanks stephen, very useful.

DVD with Dolby encoding would be a good option; I’ve already got a burner and a player. I suppose for internet distribution a video format with Dolby embedded would work too. That mntn software looks interesting.

(Sorry if you know this already.) With Dolby, I believe Left and Right will be 16-bit, while C, LS, and RS are all compressed.
Is there a way to stream multi-channel online? A post from Vimeo from a few months ago says no. It would be good if there were a way.

i would distribute it primarily in binaural, as everyone has headphones, and (comparatively) almost no one has a proper 5.1 installation.
It seems that Radio France have developed a player streaming 5.1 (example here). (i remember someone from their team saying their next move would be a player applying dynamic compression to binaural streams, instead of just raising the volume, so that mobile users in noisy environments can hear details without needing insane headphone levels - but it seems not to have happened yet.)

1 Like

Back when I got started in surround - approximately a decade ago - commercial DTS surround discs were distributed on CD, which can be played on any home theatre system with digital output from the CD player to the surround input (coaxial or fibre optical). So, I licensed a copy of the DTS encoding software for PowerPC Mac and put together a handful of surround pieces.

Locust released an album, Wrong, as two CDs intended to be played simultaneously on two stereos - you press Play on both at the same time. It’s a unique surround composition because the “rear” stereo channels are designed so that they do not need to be precisely time-aligned with the main “front” stereo channels, which can also be listened to on their own as normal stereo tracks.

As an authoring experiment, I ripped the two discs and converted them to the format needed by the DTS encoder. I now have a single CD that will play in quad on my home theatre without all the hassle.

These days, I assume that the DTS music CD format has disappeared. I’m not really sure what is preferred for distribution. The surround mastering facilities have 8-channel tape formats with standard channel assignments for 5.1 audio, but that is not something consumers can handle. It still seems to be very much the realm of experts. I can’t help but think that home theatre standards are the way to go, especially with Blu-ray and its continued support for DTS. I don’t know where you’d find a modern DTS encoder, but I haven’t been looking, either.

As for the creative and production side of the process, I’ve worked exclusively with artists performing their own work. I then produce the surround version by either simply producing a quad arrangement with stage sound in the front and room ambience in the rear, or when I am allowed to record in multitrack format I will mix to 5.1 in a fairly static panning.

I’ve considered the extremes of surround styles. At one extreme would be the LSD-influenced live panning of synthetic sounds across the entire space. That’s certainly valid, but I’m not so interested because it sounds more like surround as a gimmick. The more mundane approach is simply attempting to recreate the surround experience of the venue where a live performance has occurred. In that case, you’re not really aware of the surround as a gimmick, but you are still enveloped.

When consulting with artists like Portable, who put on a surround performance here in Seattle more than a decade ago, I have been inspired by techniques used in early stereo recordings. In those days, pan pots did not exist on mixers. Thus, early Beatles records had instruments panned either hard left, hard right, or dead center (particularly important vocals). I’ve noticed that modern artists like Stereolab will use these simple panning techniques, with one organ in the left and a different one in the right, and the same with individual guitars. My approach with surround is to take stereo tracks and place them either in the front, the rear, or both at equal volumes. This is fairly easy to pull off in Ableton, where you can assign tracks to outputs, including the ability to mix a third output pair into both front and rear pairs. This offers enough variety to fill a surround space without requiring too many joysticks for two hands.
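That front/rear/both placement could be sketched as a simple routing function (a hypothetical Python helper, not anything from Ableton; the quad frame ordering FL, FR, RL, RR is an assumption):

```python
def place_stereo(track, position):
    """Route a stereo pair (L, R) into a quad frame (FL, FR, RL, RR)
    using the front/rear/both placement described above.
    'both' splits the signal equally between front and rear; an
    equal-power split would use 1/sqrt(2) instead of 0.5."""
    left, right = track
    if position == "front":
        return (left, right, 0.0, 0.0)
    if position == "rear":
        return (0.0, 0.0, left, right)
    if position == "both":
        return (left * 0.5, right * 0.5, left * 0.5, right * 0.5)
    raise ValueError("position must be 'front', 'rear', or 'both'")
```

With three choices per stereo track, even a handful of tracks fills the space without any joystick work.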

Brian

3 Likes

Hi,

Wondering if you’d checked out Ambisonics? Most of the software is free these days and works really well with e.g. Reaper. The advantage of making an Ambisonic mix is that it doesn’t dictate the equipment or setup that the listener needs to use. It’s relatively straightforward to listen to it on a binaural, normal stereo, or 5.1 system (or an arbitrary number of loudspeakers in any geometry). The files are usually mixed / delivered / shared as a 4-channel “B-format” file. There are some streaming / compressed formats too. It was for a long time the preserve of sound nerds like myself, but it’s really taken off now as a component of VR production because of its ability to do 3D sound with head tracking.
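To give a feel for why a B-format mix is layout-agnostic, here’s a minimal Python sketch (a hypothetical helper, not any toolkit’s actual API) of the simplest “projection” decode of horizontal B-format to an arbitrary speaker ring. Real decoders add psychoacoustic shelf filters and energy normalisation on top of this:

```python
import math

def decode_to_ring(w, x, y, speaker_azimuths):
    """Project horizontal first-order B-format (W, X, Y) onto a ring
    of loudspeakers: each feed is a weighted sum of the channels,
    weighted by the speaker's direction. Assumes the traditional
    Furse-Malham W scaling (W carries the signal / sqrt(2))."""
    n = len(speaker_azimuths)
    feeds = []
    for az in speaker_azimuths:
        g = (math.sqrt(2.0) * w + x * math.cos(az) + y * math.sin(az)) / n
        feeds.append(g)
    return feeds
```

The same W/X/Y file decodes to quad, hex, or octagon just by changing the azimuth list, which is exactly the portability argument above.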

check out, for instance:

http://www.ambisonic.net

http://www.ambisonictoolkit.net

x gus

1 Like

Not really, but I’ve attended a few workshops on the subject and I would be all over the IRCAM SPAT external for Max if I really wanted to get into it.

That or just quad outputs from the eurorack :wink:

1 Like

I really like ambisonics, conceptually, but I have always had a certain amount of doubt and even recently heard confirmation of its limitations.

As a basic concept, the idea of converting between Left/Right pairs and Mid/Side pairs makes sense. One can use an omnidirectional microphone and a figure eight to capture the “same” signal as a left-right pair, and some “stereo” microphones are actually physically made from a mid-side pair with active electronics on phantom power to handle the conversion. It’s also popular to convert stereo pair tracks to mid-side for processing in a way that shouldn’t disturb the stereo image as much (by operating only on the mid signal).
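That sum/difference equivalence is easy to sketch in code (a minimal illustration, assuming the common M = (L+R)/2, S = (L−R)/2 scaling; other conventions scale differently):

```python
def lr_to_ms(left, right):
    """Convert a left/right pair to mid/side (sum/difference form)."""
    mid = (left + right) / 2.0
    side = (left - right) / 2.0
    return mid, side

def ms_to_lr(mid, side):
    """Invert the conversion: L = M + S, R = M - S."""
    return mid + side, mid - side
```

A round trip through both functions recovers the original pair exactly, which is why mid/side processing (e.g. compressing only the mid) can leave the stereo image largely intact.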

Once you understand the two-channel version, extrapolating to the typical 4-channel ambisonic WXYZ signal set makes sense.
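As a rough illustration of that extrapolation (a hypothetical helper, not any particular library’s API), a first-order encode of a mono source into WXYZ might look like this, assuming the traditional Furse-Malham weighting:

```python
import math

def encode_bformat(sample, azimuth, elevation=0.0):
    """Encode a mono sample to first-order B-format (W, X, Y, Z).
    W is the omni component (scaled by 1/sqrt(2) in the traditional
    Furse-Malham convention; SN3D/ACN conventions differ), and
    X, Y, Z are the three figure-eight components."""
    w = sample / math.sqrt(2.0)
    x = sample * math.cos(azimuth) * math.cos(elevation)
    y = sample * math.sin(azimuth) * math.cos(elevation)
    z = sample * math.sin(elevation)
    return w, x, y, z
```

Note this is literally the omni-plus-figure-eights idea from the mid/side discussion, extended to three axes.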

While it makes sense that processing can pull 5.1 or 7.1 or even 10.2 out of an ambisonic quad, I have always had my doubts about how much information can really be in 4 channels, no matter what the arrangement of those channels or the subsequent processing. How much information can be in those 6, 8, or 12 channels if they originated as only 4 encoded channels, and are the larger channel sets rich enough to be considered full?

In a recent conversation where ambisonics came up, it was pointed out to me that the biggest limitation of ambisonics is that it represents only a single point recording. While it does encode a great deal of directional information, it is still limited to that one point. Recreating a complex, three-dimensional sound field at multiple points requires something beyond ambisonics.

Going back to the example of a mid-side pair, a serious limitation there is that interaural delays are lost when using a single omni plus a figure eight. Yes, you can get a stereo image after conversion, but it’s less accurate than a binaural recording with spaced, directional microphones. The same issues are present in an ambisonic recording: we listen with two ears that we can turn to face selected sound sources, while the ambisonic recording gear captures only a single point, without the time-delay information that would result from two or more points relative to multiple sources.

If anyone can point to papers on these specific concerns, I’d be very interested in updating my knowledge of ambisonics.

“How much information can be in those 6, 8, or 12 channels if they originated as only 4 encoded channels, and are the larger channel sets rich enough to be considered full?”

Obviously there is a limit to spatial resolution. In my experience, 6 speakers work very well in a smaller space for a 2D surround rig; 8 speakers in a big space. If you want 3D, then 8 or 12 seem to work well. The resolution problem is what has led to the development of 2nd- and 3rd-order ambisonics - sampling the “sphere” of sound at a higher resolution, using more channels. Many current versions of plug-ins can deal with higher orders.

“The biggest limitation of ambisonics is that it represents only a single point recording.”

Sure - but your head is a kind of point too, no? But seriously, you’re right: if you want to localise a sound accurately and absolutely at one point, it needs to come from a single loudspeaker.

“Going back to the example of a mid-side pair, a serious limitation there is that interaural delays are lost when using a single omni plus a figure eight. Yes, you can get a stereo image after conversion, but it’s less accurate than a binaural recording with spaced, directional microphones.”

I don’t agree. Binaural doesn’t work well for some people, and binaural without head-tracking can be unstable. Our brains use any and all of the information they can get to decode space and location - whether that’s time differences or level differences. Also, the interaural delays are not “lost”, in fact they are partly re-created by the loudspeakers - if the sound comes louder out of the right loudspeaker then that sound reaches BOTH ears, but hits the right ear first.
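The level-difference half of that argument is what standard pan laws implement. A minimal sketch of constant-power panning (a hypothetical helper; the sin/cos law itself is the standard one):

```python
import math

def constant_power_pan(sample, position):
    """Constant-power (sin/cos) pan law. position runs from 0.0
    (hard left) to 1.0 (hard right); at the centre each channel is
    about -3 dB, so perceived loudness stays roughly constant as
    the source moves across the image."""
    angle = position * math.pi / 2.0
    return sample * math.cos(angle), sample * math.sin(angle)
```

The level difference between the two speakers is the cue the brain then combines with the per-speaker arrival times described above.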

If anyone wants some fun looking at stereo techniques, you should check out Alan Blumlein’s patents from the 1930s. This is relevant, for instance:

http://www.pspatialaudio.com/blumlein_delta.htm

xx gus

not sure what the current status of the project is, but i played with it a little bit when it first came out:

http://spatium.ruipenha.pt/

using my sound card and 4 amps in a quad config i was able to achieve some cool panning.

///

if anyone hasn’t experienced the amazing cornelius dvd “five point one” i highly recommend it. a masterwork of spatial music and the videos are some of my favorites.

5 Likes

[quote=“gusp, post:10, topic:8260”]Also, the interaural delays are not “lost”, in fact they are partly re-created by the loudspeakers - if the sound comes louder out of the right loudspeaker then that sound reaches BOTH ears, but hits the right ear first.
[/quote]
You have some good points in your reply, but I have to correct these mistakes. Louder sounds do not travel faster, so they do not recreate the original time delays. The fact that the right speaker’s sound hits your right ear first helps you localize the speaker itself, but not the direction of the original sound.

[quote=“gusp, post:10, topic:8260”]sure. But your head is a kind of point too, no? - but seriously - you’re right
[/quote]
You imply that you’re not being serious, which may explain why I did not understand what you’re saying here. No, your head is not a kind of point. I was very specifically referring to the fact that the human hearing system is a two-point measurement system, where time, amplitude, and HRTF differences help us localize an unlimited number of sound sources that may surround us. A single-point recording system cannot capture all of the information that we use for localization, although I admit that it can get quite close, and perhaps that’s good enough for many situations.

2 Likes

[quote=“rsdio, post:12, topic:8260”]
but I have to correct these mistakes. Louder sounds do not travel faster, so they do not recreate the original time delays. The fact that the right speaker’s sound hits your right ear first helps you localize the speaker itself, but not the direction of the original sound.
[/quote]

Ahh!! Why didn’t I think of that! Of course, it’s the IAD of the speaker that you hear. I read that in something that Blumlein said… Luckily our hearing mechanism is robust enough to use any clues we get.

Don’t get me wrong: I love binaural. For headphones it’s fantastic. It’s just that, in my opinion, it’s not as applicable / portable to different kinds of sound systems as coincident mic’ing techniques. And amplitude panning is of course equivalent to coincident mic’ing.

Anyway, back to the thread: maybe an interesting approach for electronic modular music would be to apply the control voltages you’re already using for musical gestures / changes to move the sounds around. You could use VCAs for this, or even (to go the binaural route :wink:) use a voltage-controlled delay line to shift the sounds in space.
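As a toy offline illustration of the delay-line idea (a hypothetical sketch, integer delays only; a real voltage-controlled delay would interpolate fractionally and be modulated continuously):

```python
def delay_pan(signal, delay_samples, gain_near=1.0, gain_far=0.7):
    """Pan a mono signal between two channels using both a VCA-style
    level difference and a small interchannel delay: the 'far'
    channel gets the same signal attenuated and delayed by a few
    samples, mimicking an interaural time difference
    (precedence effect)."""
    near = [s * gain_near for s in signal]
    far = [0.0] * delay_samples + [s * gain_far for s in signal]
    near += [0.0] * delay_samples  # pad so both channels match length
    return near, far
```

Sweeping `delay_samples` and the gains under CV control is the modular equivalent of what a binaural panner plug-in does.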

x gus

1 Like

[quote=“gusp, post:13, topic:8260”]Luckily our hearing mechanism is robust enough to use any clues we get.
[/quote]
Not to be too pedantic, I hope, but the various clues we get are not all applicable in all cases.

Interaural delay works for low frequencies, but as frequency increases this clue becomes less useful. Eventually, more than one period of the waveform occurs in the distance between the ears, and our brains are powerless to calculate the delay.

Amplitude differences work for high frequencies, which are directional, but lower frequencies bend around obstacles like our head without much, if any, SPL loss. Thus, our brain gets no clue from low frequency amplitude changes, at least not for natural sounds.
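A back-of-envelope calculation shows roughly where the ITD cue runs out (the 0.21 m around-the-head path length and 343 m/s speed of sound are approximate assumptions):

```python
# Once a full period of the waveform fits within the ear-to-ear
# path, the interaural phase comparison becomes ambiguous.
SPEED_OF_SOUND = 343.0  # m/s, approximate at room temperature
EAR_PATH = 0.21         # m, approximate around-the-head path length

max_itd = EAR_PATH / SPEED_OF_SOUND          # maximum delay, ~0.6 ms
ambiguity_freq = SPEED_OF_SOUND / EAR_PATH   # ambiguity onset, ~1.6 kHz
```

This is consistent with the usual rule of thumb that timing cues dominate below roughly 1.5 kHz and level cues dominate above it.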

The height cues are extremely individual, which is why so much work is being done with HRTF measurements. Although the clues are there, it’s difficult to take advantage of them in listener-independent ways.

So, yeah, our brains are fantastic computers that utilize many clues, but those clues aren’t always in play for every sound.

It is true that binaural is challenging, and doesn’t work everywhere. My point was not that binaural is easy, but that our hearing is working in binaural mode all the time, whether that’s easy to fool into surround perception or not.

[quote=“gusp, post:13, topic:8260”]Maybe an interesting approach to electronic modular music could be to apply control voltages that you are using anyway for musical gestures / changes to move the sounds around. You could use VCA’s for this, or even (to go the binaural route :wink: use a voltage-controlled delay line to shift the sounds in space.
[/quote]
Great ideas.

Apple’s CoreAudio offers a 3DMixer that calculates the delay, amplitude, and maybe even some height cueing into a mix. You’d have to work in the digital domain, at least partially, but it would even be possible with a multichannel audio interface to input a number of analog sound sources and pan them around within CoreAudio before outputting them onto a stereo or quad output. CoreAudio allows the user to edit the position of the available speaker channels and will convert the surround stream to fit what’s available. I have not explored these options fully, yet.

When I worked with Alan Abrahams in quad, he found some Ableton effects that could be synchronized to different phases of sine or triangle LFOs, and this produced an automated swirling effect in surround that was quite effective. Now that Ableton has Max/MSP integrated, it might be possible to patch various control signals that have precise phase relationships across the various surround channels.
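The phase-offset LFO trick can be sketched numerically: four raised-cosine gains a quarter cycle apart sweep a source around a quad rig (a hypothetical helper, not Ableton’s or Max’s actual implementation):

```python
import math

def quad_swirl_gains(t, rate_hz=0.25):
    """Gains for four speakers (FL, FR, RR, RL) that rotate a source
    around a quad rig: one raised-cosine LFO per channel, each offset
    by 90 degrees, so the sound sweeps the circle at rate_hz.
    The four gains always sum to a constant, so overall level holds."""
    gains = []
    for k in range(4):
        phase = 2.0 * math.pi * (rate_hz * t - k / 4.0)
        gains.append(0.5 * (1.0 + math.cos(phase)))
    return gains
```

Driving four channel VCAs (or Max/MSP gain objects) from these phase-locked signals gives the automated swirling effect described above.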

1 Like

I do quite a lot of surround mixing for audio. We encode to a Dolby format, TrueHD, which can be played back directly from a USB stick on various domestic receiver amps.

I always mix multi-format in Pro Tools, but there are various great Max for Live patches if Ableton is your bag. I use them quite successfully for surround live performances.

1 Like

What is the authoring software used to do Dolby TrueHD? Sorry if it’s described in that post and I’m not understanding. In the past, I would export 5.1 from Pro Tools and then process AC-3 through Compressor, but I’d be glad to move on from there. Thanks

Here’s a link to a SoundCloud playlist with two recordings of an experiment I did last year with a quadraphonic patch on a Nord Modular G2. One recording is a dummy-head recording of the patch being played over 4 speakers (square arrangement); the other is the same audio recorded from the line outs through a splitter and then mixed using Logic’s Binaural panner. Headphones required.

If you can spare the time, please let me know if any of you experienced anything significantly beyond stereo.

2 Likes

Dolby Media Producer, I believe. I get a mastering house, Super Audio Mastering, to do the conversion. I thought Compressor died a few OS X versions ago? And it only did heavily data-compressed Dolby, as I remember.

OK thanks! And, yeah, I only do the Compressor version when absolutely necessary.

A thread specifically about binaural recordings, playback, virtual or real…

I make a lot of use of binaural recordings in projects. I make my recordings with a pair of DIY binaural mics I built myself. I forget what kind of capsules I used, but they sound really good. And with my Sony D100 I can make recordings with a lot of headroom.
There is one thing I encounter a lot: sounds that were recorded in front of the recordist (usually me) are often perceived by the listener to be behind them.
I wonder if other people experience the same thing with binaural, or whether it’s just my mics, or maybe the shape of the ears used for recording.