Digital Allpass Filters

I’m studying various reverb and modeling algorithms and I keep coming across references to different varieties of allpass filters. Everything I’ve found by Googling doesn’t really explain how to design them or parameterize them.

Can anyone point me to any references that explain how to design and then program efficient allpass filters in C, preferably “from scratch”?

I’m also interested in how to get delay times (if I understand correctly, there’s a delay time as a key parameter of an allpass filter, right?) that are fractions of a sample.


I’m also very interested by the topic! I’ve been playing around with reverb designs lately and I’ve had a hard time getting allpass filters to sound interesting.

Julius Smith provides a good intro breaking down an allpass filter as two combs, and he shows the efficient way to implement an allpass with a single multiply here:

A more application-oriented presentation from Jon Dattorro is here:

The phase response of a simple allpass delay is not very flat, but it’s flattest between around 0.6 and 1.6 samples of delay. So to get a delay of an arbitrary number of samples you put an integral delay line in series with your 0.6 to 1.6 samples of allpass such that the total delay is what you want.

The allpass implementation in my own DSP library is here: I think it’s readable code. This is in C++ but you could easily translate it to straight C.


at risk of stating the obvious, you can also use linear or cubic interpolation for this.

for fractional delay D

di = floor(D)    // integer part
df = D - di      // fractional part in [0,1)

linear:

x(n-D) = interp_lin(x(n-di), x(n-(di+1)), df)

interp_lin(x0, x1, f) {
   return x0 + (x1-x0)*f
}

cubic:

x(n-D) = interp_cubic(x(n-(di-1)), x(n-di), x(n-(di+1)), x(n-(di+2)), df)

interp_cubic(x0, x1, x2, x3, f) {
   // x-form hermite, interpolates between x1 and x2
   return (((0.5*(x3 - x0) + 1.5*(x1 - x2)) * f + (x0 - 2.5*x1 + 2*x2 - 0.5*x3)) * f + 0.5*(x2 - x0)) * f + x1
}

an allpass interpolator has one major benefit: it introduces no gain distortion. therefore it is basically the only choice for tuning lossless feedback loops.

major downside is that the delay time is frequency-dependent.

so for approximate delay time D, the allpass difference equation:

y(n) = c*x(n) + x(n−1) - c*y(n−1)

gives y(n) ~= x(n−D), where D ~= (1-c) / (1+c)


  • this relationship is only exact at DC (more phase delay for higher freqs)
  • tricky to use if you want to modulate the integer part of the delay time
  • doesn’t really work if you want random access (only sequential)

I suppose I should have clarified: I was talking about allpass filter delay times, not ordinary delays! I’m interested in allpass filters for reverb, and those use fractional inter-sample delay times - but they are not simply inter-sample delay lines, since those introduce comb filtering. Thus the question…

Also, even a cubic algorithm does not accurately estimate inter-sample peaks, though it’s better than linear. Convolution with a sinc() would be necessary for perfect reconstruction, but that’s rather expensive cpu-wise.

But yes, by definition an all-pass filter would have a different phase response per frequency - to my knowledge this is the entire point of using such a filter, no? To disperse the phases of the frequencies without introducing a comb-filter-like frequency response?


Are you sure? Schroeder’s reverb (see Julius Smith’s CCRMA page), uses a bunch of allpass filters with integer length delay lines…


Yea this is one case of what I mean, u can kind of factor the delays and decorrelations out separately

Griesinger’s design, as reported by Tom Erbe in his slide deck, predelayed with non-integer values. I recall seeing (but don’t have handy) non-integer values inside either Erbe’s own design or a design I saw built off it, too.

@equipoise i feel like i must have also been insufficiently clear. i tried to answer the original questions. but on re-reading them i think i failed. JOS is an excellent theorist, and Tom Erbe is a great person, developer, and mentor; and all the reference materials linked so far are right on point. but sometimes it’s nice to start from the beginning to make sure everyone is on the same page.

(and look, i know this is basic, but to be clear to anyone following who doesn’t want to click through JOS)

first off, what is a digital allpass filter?

well, functionally it is a filter with a flat magnitude response. which is, eh, quite vague. algorithmically, it usually means a filter graph with an equal number of poles (feedback terms) and zeros (feedforward terms), where the zero coefficients are the pole coefficients in reversed order (and conjugated). but since we’re talking about audio, these are real numbers.

it’s very common to encounter first-order allpasses (or let’s say APs), and from here on we’ll assume that’s what we’re talking about: first-order, meaning one pole and one zero.

the graph looks like this (sorry for bad ascii art):

x(n) ---+---[ z^-1 ]---+
        |              v
        +----[ a ]--->(+)----+---> y(n)
                       ^     |
                       |     v
                     [ -b ]<--[ z^-1 ]
and difference equation:

y(n) = a*x(n) + x(n-1) - b*y(n-1)

and transfer function:

H(z) = (a + z^-1) / (1 + b*z^-1)

a is the zero (feedforward) coefficient and b is the pole (feedback) coefficient. for the allpass condition, a = b. so there is only one coefficient, which i’ll just call c. super easy!

I’m also interested in how to get delay times (if I understand correctly, there’s a delay time as a key parameter of an allpass filter, right?) that are fractions of a sample.

if you think of an allpass as a delay element, then the delay time is always less than one sample for a first-order AP, or i think it’s fair to say less than N samples for an Nth-order AP.

so in this context, think of an allpass as a “tuning” element in series with an integral delay line.

Can anyone point me to any references that explain how to design and then program efficient allpass filters in C, preferably “from scratch”?

this was done a bit already, but to be really specific and without a lot of baggage:

D = (1-c) / (1+c), where D is the delay time in samples, in [0,1].

it follows that c = (1-D) / (1+D).

and that is literally the only design parameter you get with 1st-order allpass.

with higher order allpasses, you can get a much wider range of effects. for example, if you combine 2x 1st order APs with opposite pole locations, you get a notch filter, and this is a common way to implement phaser-style effects in the digital domain.


OK, but you are mainly interested in these things as building blocks in reverbs, and particularly in classic Feedback Delay Network - style reverbs. here, the allpass has two nice qualities:

  • if used as a delay tuning component, since it has flat magnitude response, you can accurately compute the loss function as the eigenvalue of the feedback matrix. this means you can have a fully-lossless reverb network (“freeze” effect) without blowing up, as you might if you tried to use a different kind of sub-sample delay tuning method (e.g. polynomial interpolation, which as you’ve observed has a mag response resembling a comb filter - in fact an aliased sinc function.)

  • as a phase-decorrelation element. (if i recall correctly,) Schroeder’s reverb uses some matrix algebra to refactor a full NxN tuned delay network into an integral delay network, a summing term, and a bank of parallel APs for phase decorrelation. once you do this, you can actually choose different stuff for the phase decorrelation part, including filters with non-flat mag response (for coloration) and even nonlinear elements…

(TODO: demonstrate Schroeder reverb and variants in Faust, using different decorrelation and coloring filters / saturators!)

resonators: karplus-strong, scattering junctions

similarly, the flat response of the AP makes it useful for other algorithms where lossless tuned feedback is a potential configuration. there are tradeoffs: it’s not as straightforward to modulate the delay time without weird artifacts. and since the “delay time” is frequency-dependent, it doesn’t preserve high-bandwidth transients and stuff in the excitation signal.

(TODO demo!)


Between you and @randy, I’m deeply grateful for this wealth of highly understandable information! This is really excellent - having had an education which went into the deep end of coding schemes and IIR/FIR filter transfer functions, but hardly explained at all how to use them, code them, or what to really do with them, this is amazingly useful stuff. Major thanks to both of you.

As a note: I suppose I’m interested in FDN reverbs because that’s all I’m aware of currently. I love the sound of vintage Eventide and Lexicon 'verbs (and some of the Alesis stuff too) so I’ve got a strong bias towards especially the lush-but-unnatural 'verb sounds from highly modulated delay networks going crazy. One of the things I’m still trying to wrap my head around are the reverbs such as Blackhole with resonators (and some black matrix magic) instead of delays at the core, but that’s for another post and another conversation. And the term “eigenvalue” - I still have no real comprehension of what the heck one is or why they’re useful despite having read countless explanations and definitions for it. It seems to be “a number that tells you some various qualities of a matrix, if you know how to interpret it” and that’s about all I get! :smile:

So, to move slightly forward from what you’ve just generously shared, when you discuss “the feedback matrix” - I know you’re speaking of the specific algorithm’s matrix (as in the discrete time implementation diagram you show), but is there a way to do this with matrix math (especially if I’m writing a lot of APs) and is it common / faster? Are there tricks to chaining multiples of these things together that can take advantage of this? The ARM cores I’m working with don’t have NEON, but I’m trying to structure my algorithms so that if they did run on a core that could use SIMD type instructions they could take advantage of it, and matrix manipulation is not terribly difficult to accelerate with SIMD.

As in one has A for the pole and one has -A? Or do we need to go into complex math at this point?


Eigenthingies of a transformation are just the thingies that, when put through the transformation, only change by a constant factor (possibly 1). So exponentials exp(k*t) are eigenfunctions of the derivative operator d/dt, and eigenvectors of a matrix are those vectors that only have their magnitude changed, not their angle, when multiplied by the matrix. An eigenvalue for a particular eigenvector is just the corresponding scalar that the eigenvector gets scaled by when transformed.

I like the eigenfunction examples because they make it clearer to me why eigenvectors are important: the exponential map is deeply related to the derivative operator, it characterizes the behavior of the derivative in many ways. Eigenvalues of a linear transformation are kind of like roots of a polynomial (well, they also are literally roots of a special polynomial), they are invariants that characterize the map, they describe its “shape” in sort of a fundamental way that uniquely describe the map (up to some isomorphism we don’t really care about, like a constant factor).

This gives them lots and lots of applications depending on what the linear transformation you’re dealing with “means”. In control theory they are a useful way of determining stability of a feedback system, namely, eigenvalues with positive real parts indicate instability (the more positive the real part, the faster it blows up; the more negative, the faster the system converges). This sounds kind of related to determining how fast your reverb decays, but my DSP is somehow even rustier than my linear control theory. edit: oh, okay yeah. This mentions that a state-space formulation is a special case of feedback delay network.


i think here we are saying approximately the same thing. but this discussion is definitely making me want to play around with some ideas in that direction.

we do. by ‘location’ i mean the angle of the pole in the z-plane. by ‘opposite’ i mean reflected through the imaginary axis.

another way to put it: a delay line is some number N of single sample delays in series. with feedforward around the whole thing you get a comb filter. if you modulate the length of the delay line you get a flanger.

now if you replace each single-sample delay element by an allpass element (which is like a single-sample frequency-dependent delay) then whenever subsequent phase shifts sum to 180° for a given freq, you get a notch at that freq. when you modulate individual nodes in this AP chain you get a phaser. (pole location for notch freq, pole radius for notch depth.)

[aha! the usual suspects have written this up nicely: ]

allpass chains like this can actually be used in all kinds of crazy ways to warp the frequency/phase response of linear systems (including the discrete fourier transform…!)

ehh, i dunno… surely there are. but honestly i hope to never hand-optimize SIMD again. these days it’s possible to tell a code generator to factor a bunch of matrix maths, and a compiler to vectorize the resulting code. sometimes this beats hand-vectorizing and sometimes not. profile first…?

(actually i would love to know / learn any particularly good optimization tricks for AP chains. they come up a lot.)

oh hey, here’s that interpolation cheat-sheet i mentioned


Since we are touching on feedback matrices, I thought I’d share these questions / ideas I’ve had around the topic. While rebuilding the Erbe-Verb from Tom’s paper, I’ve stumbled upon feedback matrices, and I have been a bit puzzled by them. I couldn’t really find the explanations I was looking for about their design and behaviour online, so I played around with math until I got an answer that seems coherent to me. I’m not a DSP guru, so please let me know if this is nonsense.

From what I understand, a NxN feedback matrix has unity gain if the energy of the input N-dimensional signal is equal to the energy of the output N-dimensional signal — in geometric terms, the squared length of the N-dimensional input signal vector is equal to the squared length of the N-dimensional output signal vector. A matrix that has that property of preserving the length of any vector it is multiplied with must scale space in a uniform way, meaning the new basis vectors (rows of the matrix) are orthogonal and have equal length, and that scaling must have a magnitude of one, meaning the length of each new basis vector is also one. Essentially it is a matrix that only rotates space.

There are infinitely many matrices that have that property. Two extremes cases seem to be the Hadamard matrix and the Identity matrix (which you would not usually want to use in a reverb), where one completely intertwines the parallel signal paths of the reverb without having a tendency to concentrate the energy in a particular signal path, and the other completely isolates the signal paths from each other. What I’ve been wondering is whether it would be interesting to continuously morph through matrices that are somewhere between intertwined and parallel. I guess this could be done with some N-dimensional generalization of quaternions for interpolation. This may be interesting for longer delays that are tempo synced.

i’m no guru either really but that sounds like a good and simple description

and yeah, this

sounds cool!

but this

sounds sticky. i think the “N-dimensional generalization of quaternions” is called a rotor, but i’d have to read up from there, and see if such things preserve eigenvalues in the way you’d want for an FDN.

Good to know it makes sense to you!

The only thing I know about rotors is that they exist and are supposed to be good at interpolating between rotations, so I still have some homework to do. Hopefully I can dig into the topic during the holidays and maybe get a prototype working. If I get anything interesting out of it I’ll share it!

What I discovered in my tests is that the choice of number of samples for the delay times (mutual primes for the dispersal you’ve mentioned) - that’s hard to nail for a general case. What sounds good on one material won’t sound good on another. Also, there are Reaktor ensembles that implement these basic designs (Schroeder etc), and the reason the ones that sound good sound good seems to be that they use more nodes, not just the 3-4 suggested by the documents. The choice of topology is almost entirely up to you. The results I myself liked best were designs “by feel” mixing nodes in series and in parallel, the points between which signal was being fed back, and whatever filters I decided to insert in those paths. That said, whole numbers of samples seemed to suffice for initial exploration.

Filters are baffling.

Frequencies are measured over time, but DSP code needs to make decisions one sample at a time. When that hit me, yikes. I dug through different open source filter implementations and realized this is why I should have taken calculus. sigh.

BUT, taking the c code example of the simplest possible LPF from JOS and translating it into Lua that loads into the VCV Rack prototype module [see footnote] felt like a step forward. I can understand the code, and I can see it quieting frequencies above some threshold, but…wtf. How does adding the previous sample to the current sample filter things? I really don’t understand why that would work.

config.frameDivider = 1
config.bufferSize = 16
last_sig = 0

function process(block)
	for i=1,block.bufferSize do
		sig = block.inputs[1][i]
		block.outputs[1][i] = (sig + last_sig) / 2.0
		last_sig = sig
	end
end
However, the obsession flame has been sparked and I’ve spent a silly amount of time working through code. One of these days it will click, but…not today. Not yet.

All that said, “digital filters” were the thing that really a) made me want to dig more into DSP and b) want to pick up where I left off mathwise. Ahem. Carry on math friends, I really want to understand what it is you all are talking about above. :popcorn:


side note about the vcvrack prototype module: it’s really approachable, BUT - you quickly realize how slow traditional scripting environments are compared to lower level C/C++. Expensive operations that are fine when you’re making web pages are terrible when you’re running a chunk of code 44,100 times per second.

An example might help: take the signal …+1,-1,+1,-1,+1,-1,… and add each sample with the last one. You get …0,0,0,0… The first signal has the highest possible discrete time frequency, which gets annihilated by this simple (and not very good) lowpass filter. The process works the same for other signals too: averaging samples tends to attenuate high frequencies since the positive parts cancel out the negative parts.


I’m still trying to figure out how to efficiently reparameterize a 4-pole filter dynamically. Those little buggers have some modest math in their parameters and little embedded systems already have a lot of work to do in a very limited time. I’ve not fully worked out my synth architecture but I’m leaning towards single-sample processing without internal buffering/vectorizing (the chip doesn’t have SIMD/NEON anyway) and that keeps me from having to drag large sample buffers around with each component. I’ve only got 32kB of RAM to work with on this chip. But that doesn’t leave much room for a control loop to go around updating heavier values. Tradeoffs! It’s the name of the embedded game…


Ok so I think this is the right place to ask about Hilbert Transformers?
Full disclosure I sometimes load a random object in Max just to see what it does, and I came across hilbert~. The help file is decidedly unhelpful for learning about what they do, and as usual every website I look at is full of formulas but has little to no explanation about how it applies to DSP. It looks like maybe an allpass filter? Any insight is appreciated!