Making Surround Sound & Quadraphonic Music

I’d really appreciate if you could explain the notation you’re using in those formulas, or maybe add a brief verbal description of what each is doing. (I’m sure the notation is pretty standard, but this is a little outside my area and I can’t quite tell what some of it is trying to say.)

Particularly, what’s “[φ_0]”, “[bandpass]”, and “[dolby B+]”, and how do these apply to the term preceding them?

I’m also curious what the 5:2 encoder looks like. (I imagine Ls is encoded the same as 4:2 and Rs is encoded with the two [φ_N] terms swapped? But I’m totally guessing.)

sorry if that is a bit cryptic for others, for me it is not. this is the secret 110 pseudo code from the readme files of my quadrophony plug-ins from 2003, which is more verbose than a simple math formula to give you (no, pardon: me) a better picture of what happens and why.

phi is the “relative phase” and phi 0 means that you only take the real output from the transform. and as you can see i also prefer to think of 270° instead of the more common “-90°”

if you implemented that in max or java, “L phi 0 plus R phi 180” would simply be “L minus R”, because a phase shift of 180 degrees can be reached by inversion, and adding -x is the same as subtracting x…
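A minimal sketch of that shortcut, using plain Python lists as stand-ins for audio buffers (the helper names are made up for illustration). It compares adding an explicitly 180° shifted copy of R with simply subtracting R:

```python
import math

def sine(cycles, n, phase_deg=0.0):
    """n samples of a sine with `cycles` full periods and a phase offset in degrees."""
    return [math.sin(2 * math.pi * cycles * t / n + math.radians(phase_deg))
            for t in range(n)]

def mix_phi0_plus_phi180(left, right):
    """'L phi 0 plus R phi 180': a 180 degree shift is a polarity flip, so this is L - R."""
    return [l - r for l, r in zip(left, right)]

n = 64
left = sine(3, n)
right = sine(5, n)
right_shifted_180 = sine(5, n, phase_deg=180.0)  # explicitly phase-shifted copy of R

by_subtraction = mix_phi0_plus_phi180(left, right)
by_phase_shift = [l + r for l, r in zip(left, right_shifted_180)]

# Largest sample-wise difference between the two approaches (rounding noise only).
max_diff = max(abs(a - b) for a, b in zip(by_subtraction, by_phase_shift))
```

Because inversion shifts every frequency by 180° at once, the shortcut holds for any signal, not just for a single sine.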

dolby b+ stands for “modified dolby b”, and you apply it to the formula by adding it wherever you want, i.e. you can put it before or after the weighting “√2 times 0.5” (= 0.707107), which basically means “minus 3 decibels”.
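The relation between the 0.707107 weighting and “minus 3 decibels” can be checked with a few lines. This is only a sketch; `gain_to_db` is a hypothetical helper name, and it uses the usual amplitude convention of 20·log10(gain):

```python
import math

def gain_to_db(gain):
    """Convert a linear amplitude gain to decibels (20 * log10)."""
    return 20.0 * math.log10(gain)

weight = 0.5 * math.sqrt(2)      # the "√2 times 0.5" weighting from the formula
weight_db = gain_to_db(weight)   # slightly more than 3 dB of attenuation
```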

the “modified dolby B” was usually in the analog decoder and consists of a special form of compressor and noisegate, corrected for yet another -3 dB unless you were using weaker speakers. you mainly needed it for analog sound in cinemas.

while we are at it, you can also add a haas delay for the surround channel in the decoder. not in the studio, where you sit in the hotspot, but at home, where you sit with your back to the wall.
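A rough sketch of such a surround delay, assuming a plain sample delay is all that is wanted. The function name and the 15 ms value are illustrative assumptions; the post does not give a delay time:

```python
def haas_delay(samples, delay_ms, sample_rate):
    """Delay a buffer by delay_ms, padding with silence and keeping the length."""
    n = int(round(delay_ms * sample_rate / 1000.0))
    if n <= 0:
        return list(samples)
    return ([0.0] * n + list(samples))[:len(samples)]

# An impulse on the decoded surround channel, delayed by a hypothetical 15 ms.
surround = [1.0] + [0.0] * 999
delayed = haas_delay(surround, 15.0, 48000)
impulse_position = delayed.index(1.0)  # number of samples the impulse moved
```

A real-time decoder would use a ring buffer rather than rebuilding a list, but the arithmetic for the delay length is the same.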

and there is another thing which i left out and that is the actual “logic” of the pro logic system, which is kind of an auto-exaggeration of the L vs R panning contained in the audio signal. it does not make too much sense for music, imho.


 Lt = L[φ_0] + C*0.5*√2[φ_0] + S*0.5*√2[bandpass][dolby B+][φ_270] 

means that after you recorded or mixed a 4 channel LCRS track you convert it to the left stereo channel for your dolby surround CD by taking the left channel as is, 0.707 times the center channel and 0.707 times the surround channel, and each of these channels has to pass a hilbert transform to get those 4 different phase angles before you sum them.

putting the bandpass filters (100-150 Hz / 7000 Hz) in the encoder already is only a proposal of mine, and you can also just leave them out for digitally mixed content and when you later use a digital decoder. in 1984 this was required because of cheap shit dolby filter components and to make speech better.
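The Lt formula can be sketched offline as follows, leaving out the optional [bandpass] and [dolby B+] stages. Several things here are assumptions and not taken from the plug-ins: the phase rotation is done with a naive whole-buffer DFT instead of a real-time hilbert filter, “phi 270” is treated as a phase lead of 270°, and the names `phase_shift` and `encode_lt` are made up. The [φ_0] terms are passed through unchanged.

```python
import cmath
import math

def phase_shift(x, degrees):
    """Rotate the phase of all frequency components of x (naive O(N^2) DFT)."""
    n = len(x)
    rot = cmath.exp(1j * math.radians(degrees))
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]
    for k in range(1, n):                     # DC (k = 0) is left untouched
        if 2 * k < n:
            spectrum[k] *= rot                # positive frequencies
        elif 2 * k > n:
            spectrum[k] *= rot.conjugate()    # negative frequencies (keeps output real)
        # the Nyquist bin (2 * k == n) is also left untouched
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

GAIN = 0.5 * math.sqrt(2)  # the 0.707 weighting

def encode_lt(left, center, surround):
    """Lt = L[phi 0] + C*0.707[phi 0] + S*0.707[phi 270], without bandpass / dolby B+."""
    s270 = phase_shift(surround, 270)
    return [l + GAIN * c + GAIN * s for l, c, s in zip(left, center, s270)]

# Surround-only test signal: two full sine cycles in a 32-sample buffer.
n = 32
silence = [0.0] * n
surround = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
lt = encode_lt(silence, silence, surround)
```

With this convention a surround-only sine comes out of Lt as a negative cosine scaled by 0.707, i.e. the same tone moved by 270°.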

damn, i just notice that the times sign in my post vanished and led to italic style.^^ let me see if i can fix this then it might make more sense…

haha, now that i fixed it in code tags the phi φ looks ugly.

these should be the most common quads. there are 11 or 12, but some of them totally suck, or only 5 recordings were ever made with them. :)

SQ is absolutely incompatible with most others.

Sansui Quadrophonic Sound, Regular Matrix


Lt = L*(cos(pi/8))[φ_0] + R*(sin(pi/8))[φ_0] + Lb*(cos(pi/8))[φ_90] + Rb*(sin(pi/8))[φ_90]
Rt = L*(sin(pi/8))[φ_0] + R*(cos(pi/8))[φ_0] + Lb*(sin(pi/8))[φ_270] + Rb*(cos(pi/8))[φ_270]


L = Lt*(cos(pi/8))[φ_0] + Rt*(sin(pi/8))[φ_0]
R = Lt*(sin(pi/8))[φ_0] + Rt*(cos(pi/8))[φ_0]
Lb = Lt*(cos(pi/8))[φ_270] + Rt*(sin(pi/8))[φ_90]
Rb = Lt*(sin(pi/8))[φ_90] + Rt*(cos(pi/8))[φ_270]
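One way to see what the QS matrix above does without building a hilbert transformer is to treat each channel as a complex phasor at a single frequency, so that [φ_N] becomes a multiplication by e^(jN). This is only an analysis sketch; the function names and the sign convention of the phase are assumptions:

```python
import cmath
import math

C8 = math.cos(math.pi / 8)
S8 = math.sin(math.pi / 8)

def phi(deg):
    """Phasor factor for a relative phase of `deg` degrees (sign convention assumed)."""
    return cmath.exp(1j * math.radians(deg))

def qs_encode(L, R, Lb, Rb):
    """4:2 encode, term by term as in the formulas above."""
    Lt = L * C8 * phi(0) + R * S8 * phi(0) + Lb * C8 * phi(90) + Rb * S8 * phi(90)
    Rt = L * S8 * phi(0) + R * C8 * phi(0) + Lb * S8 * phi(270) + Rb * C8 * phi(270)
    return Lt, Rt

def qs_decode(Lt, Rt):
    """2:4 decode, term by term as in the formulas above."""
    L = Lt * C8 * phi(0) + Rt * S8 * phi(0)
    R = Lt * S8 * phi(0) + Rt * C8 * phi(0)
    Lb = Lt * C8 * phi(270) + Rt * S8 * phi(90)
    Rb = Lt * S8 * phi(90) + Rt * C8 * phi(270)
    return L, R, Lb, Rb

# Feed a single input channel and look at the magnitudes in all four outputs.
front_left_only = qs_decode(*qs_encode(1, 0, 0, 0))
back_left_only = qs_decode(*qs_encode(0, 0, 1, 0))
front_left_levels = [abs(v) for v in front_left_only]
back_left_levels = [abs(v) for v in back_left_only]
```

In both cases the fed channel comes back at unity gain, two neighbours sit about 3 dB lower, and the diagonally opposite channel cancels.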

Electro Voice EV-4, Stereo-4


Lt = L*(1.0)[φ_0] + R*(0.3)[φ_0] + Lb*(1.0)[φ_0] + Rb*(0.5)[φ_180]
Rt = L*(0.3)[φ_0] + R*(1.0)[φ_0] + Lb*(0.5)[φ_180] + Rb*(1.0)[φ_0]


L = Lt*(1.0)[φ_0] + Rt*(0.2)[φ_0]
R = Lt*(0.2)[φ_0] + Rt*(1.0)[φ_0]
Lb = Lt*(1.0)[φ_0] + Rt*(0.8)[φ_180]
Rb = Lt*(0.8)[φ_180] + Rt*(1.0)[φ_0]
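Since EV-4 only uses phi 0 and phi 180, the whole matrix reduces to additions and subtractions per sample. A small sketch with the coefficients copied from the formulas above (function names are made up):

```python
def ev4_encode(L, R, Lb, Rb):
    """EV-4 4:2 encode for one sample; [phi 180] terms become subtractions."""
    Lt = 1.0 * L + 0.3 * R + 1.0 * Lb - 0.5 * Rb
    Rt = 0.3 * L + 1.0 * R - 0.5 * Lb + 1.0 * Rb
    return Lt, Rt

def ev4_decode(Lt, Rt):
    """EV-4 2:4 decode for one sample."""
    L = 1.0 * Lt + 0.2 * Rt
    R = 0.2 * Lt + 1.0 * Rt
    Lb = 1.0 * Lt - 0.8 * Rt
    Rb = -0.8 * Lt + 1.0 * Rt
    return L, R, Lb, Rb

# What a front-left-only sample looks like after a full encode/decode round trip.
encoded = ev4_encode(1.0, 0.0, 0.0, 0.0)
decoded = ev4_decode(*encoded)
```

Running other single-channel inputs through the same round trip is a quick way to compare how much each channel leaks into the others.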

Dynaco Dynaquad


Lt = L[φ_0] + Lb*(cos(1)/sin(1))[φ_0] + Rb*(1-(cos(1)/sin(1)))[φ_180]
Rt = R[φ_0] + Lb*(1-(cos(1)/sin(1)))[φ_180] + Rb*(cos(1)/sin(1))[φ_0]


L = Lt[φ_0]
R = Rt[φ_0]
Lb = Lt[φ_0] + Rt[φ_180]
Rb = Lt[φ_180] + Rt[φ_0]
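The Dynaquad coefficients are easier to judge as numbers. The sketch below assumes that the argument of cos(1)/sin(1) is in radians, which gives about 0.642 (in degrees the value would be far above 1); the function names are hypothetical:

```python
import math

A = math.cos(1.0) / math.sin(1.0)  # assumed radians, roughly 0.642
B = 1.0 - A                        # roughly 0.358

def dynaquad_encode(L, R, Lb, Rb):
    """4:2 encode for one sample; [phi 180] terms become subtractions."""
    Lt = L + A * Lb - B * Rb
    Rt = R - B * Lb + A * Rb
    return Lt, Rt

def dynaquad_decode(Lt, Rt):
    """2:4 decode: fronts pass through, backs are the two difference signals."""
    return Lt, Rt, Lt - Rt, Rt - Lt

# Round trip of a back-left-only sample.
decoded = dynaquad_decode(*dynaquad_encode(0.0, 0.0, 1.0, 0.0))
```

Because A + B = 1, a back-left-only sample returns at unity gain in Lb, and it also appears inverted at the same level in Rb, since the two back outputs are the same difference signal with opposite polarity.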

Thanks for the excellent explanation!

I need a little advice for an upcoming show. My live set is built for a 2.1 system at the moment, but this venue will have a multichannel setup, so I want to adapt the show to it (the channel count and layout have not been designed yet).

I’m currently using an ES-9 interface, and since monitoring and hardware sends go through it too, I could have a total of 4 outputs available, or 6 by adding an inexpensive S/PDIF interface. I’m pretty sure that this won’t be enough.

So in order to add more, and since the ES-9 is not currently replaceable due to its format, I’m considering two options:

  • Add another USB interface with a large output count as an aggregate device
  • Add a network AVB interface with a large output count

Latency is quite important since there’s also light control involved, so I’m reluctant to add more outputs via USB to an already high I/O setup on the same computer (14 in, 10 out). Network audio may be a more reliable solution in this regard, and AVB is native on macOS. Recommendations for output-focused interfaces are welcome.

A third option is that their system has digital audio inputs, but I have no experience of whether that can be expected. Any ideas?

Why do you think 6 outputs is not enough?
I think you can do a lot with six channels. And how are you going to control the surround sound mixing?
I’ve done multichannel live setups, and if I want to have real live control over them, the amount of controllers I need often explodes.
I wouldn’t go for aggregated devices for a live setup especially if latency is an issue.

Thanks for the advice! This will be my first multichannel gig, so I’m not sure what to expect. Knowing the festival, they will put a lot of resources into the sound system of that venue, so I’m wildly guessing that I’d need more outs to get the most out of it.

The mix will be done the same way as the current stereo mix: most of it will be automated, including probabilistic movements and reactions to other sounds or even to the light animations, and in the parts where I see fit I take hands-on control. The set is quite complex (a timeline with looped segments), so I need to simplify certain aspects I don’t want to pay attention to during the concert. I happen to have a few controllers in my rack (then cv to midi), like a joystick with built-in LFOs (so it’s not just panning around). I use it to control some light animation parameters, but it would be quite fun to map it to the surround image in some parts too. I also carry a Sensel Morph that is 90% unused, so I can use that too; no need to add more.

I have yet to try how aggregating USB and network interfaces works, but unless it behaves 1:1 like my setup does now, I’ll use the 6 outputs as best I can.

Hi, I work with a large multichannel sound system, and performing and spatialising are kind of mutually exclusive to some degree. Has the festival given you an idea of the size and type of system? 4ch, 8ch, or a large ambisonic array? This information will be important in deciding how to address it and what tools/controllers might be appropriate.

To start, there is great value in running stereo over a multichannel surround system: all the left/right movement in the music will seem to wrap around the audience, and vertical placement may come from the psychoacoustics of the material itself. That said, some low frequency drum material does not naturally pan well, so deciding what you move and what remains fixed would be useful. Stemming some of the material would be a good place to start to figure out what moves well and what doesn’t. I would say that each sound has an inherent range of spatial motion that sounds natural (of course you can subvert that). Working out which parts of the material need to be moved may mean you don’t need to ‘control’ everything.

I think Quad is a great place to start as it is scalable - composing in quad gives you a great sense of surround and transfers well to 8 channel, but of course it misses the vertical.

There are a great number of Ambisonic tools that allow you to compose in binaural and encode it to, in theory at least, any sound system. However, this takes some setting up, preparation and a super methodical workflow that you may need to manage alongside the performance. Also, you will need a multi-output audio interface. But I would find out from the venue how many channels they are set up to receive and in what format: AVB, Dante, ADAT, analogue?

That said, a lot will depend on how much time you have in the space and how much time you will have to configure your output; building an ambisonic decoder for a venue you have not played in is not ideal. If it is an ambisonic gig then usually they will provide you with something…

Once you have more info and have decided how you want to approach it, you can get to the nuts and bolts of what tech you need.

I agree with the former replies too.

Hope that helps.


Thank you so much for the thoughtful answer, there’s a lot of useful information in there!

The live set is already structured in parts and stemmed by hw ins, returns and a few samples in Ableton; Ableton acts mostly as an automated mixer and as an envelope follower that sends CCs for light reactivity. The side content is very prominent during the whole set, so it’s nice to know it will translate well to a multichannel system in case I don’t make it.

Not sure what you mean here by super methodical workflow. Can you please elaborate a bit?

If you (and the rest) don’t mind I’ll update once I get more information about the system specs. Thanks again!

That quote was referring to the move from channel-based to object-based panning in ambisonics, which can get confusing, but it depends on what you are doing.

Some useful things in this video for Ambisonic stuff.

As I think this through there is a lack of ‘how to’ material in this area.


It might not help you a lot, but here goes:

Do you think of multiple channels because you want a complex, layered, surrounding space, or because you want complex, precise, surrounding movements?

I always feel like the space part is easy: if it is in your stereo, send that to different speakers and it works itself out (almost). The different frequency responses and different room placements tend to enhance different parts of your mix, and when the mix changes the sound seems to move by itself in the room. It is more pronounced with heavily layered mixes than with minimalistic mixes.

The movement part is harder, because it means automation (played or written beforehand), and it absolutely requires dedicated soundcheck time: once you are in the concert space, everything goes wrong as far as pre-written movement is concerned. So you’ll have to try and adapt: find which speakers the sound you want to move works best in, and check that it does not get lost at any point of the movement. Something that kind of worked for me at times was using a 0dB pan law with a specific speaker pair, fed by a dedicated bus in the DAW that was used only for the movements that had to separate themselves from the static(ish) mass of the rest of the music. The pan law decision is room dependent because of speaker placement etc., hence the soundcheck time.
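For readers unfamiliar with pan laws, here is a small sketch comparing a 0 dB law with the more common -3 dB (constant power) law for one speaker pair. The exact curve shapes differ between DAWs, so both functions are just one plausible interpretation, and the names are made up:

```python
import math

def pan_0db(position):
    """0 dB pan law: no attenuation at centre; a side only fades once past centre.
    position runs from 0.0 (first speaker) to 1.0 (second speaker)."""
    first = min(1.0, 2.0 * (1.0 - position))
    second = min(1.0, 2.0 * position)
    return first, second

def pan_minus3db(position):
    """Constant-power pan law: each speaker is about 3 dB down at centre."""
    angle = position * math.pi / 2
    return math.cos(angle), math.sin(angle)

centre_0db = pan_0db(0.5)            # both speakers at full level
centre_minus3db = pan_minus3db(0.5)  # both speakers at about 0.707
```

With the 0 dB law the sound gets louder as it passes the centre of the pair, which may be part of why a moving source stays audible against a dense static mix.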

One thing that I found useful was concentrating on a few moments where “multichannel” was obvious, and the rest of the time I concentrated on relative levels of different frequency contents to achieve a complex space, which in my opinion helps focus on what is musical (levels) and not just an afterthought effect (placement).