Oh man, what an epic response!
A lot of the maths goes comfortably over my head, but I’ll try to more fully unpack the process and context as that may be useful here.
So as mentioned above, the idea is to triangulate the position of a strike on a drum using TDOA from multiple sensors.
The jumping off point is this paper from NIME proceedings back in 2014. In their approach they use 6 optical sensors on a drum and then use TDOA to find the position of a strike.
The main/relevant bits are:
Where S1/S2 are the physical sensors, and the hyperbola P represents all the possible places where the strike could have happened given the delta values of time of arrival.
They then follow that through to solve for each pair of sensors (S1/S2, S2/S3, S3/S4, etc…) to arrive at:
This is the jumping off point.
So in my case I’m using completely different input sensors (a DIY/3d-printed thing I’m developing for SP-Tools), and only four of them.
That gives me a physical setup that looked like this at the time of capturing all my initial test data:
I did a load of testing moving in cardinal directions by fixed amounts etc… and that gives me audio that looks like this:
For the sake of simplicity (and for solving the harder parts first) I did manual/visual peak picking from this to arrive at delta offsets. In the final version this will either be cross-correlation or amplitude-based peak picking, but for now I’ve manually tagged deltas.
This particular set of offsets gives me:
0 80 125 82 (amount of samples, recorded at 44.1k).
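For the cross-correlation route mentioned above, here is a minimal sketch of how a delta between two channels could be estimated. It is not what I am running; `estimate_lag` and the synthetic pulse are made up for illustration, and it assumes the reference channel is the one the sound reached first (so only non-negative lags are searched).

```python
def estimate_lag(ref, sig, max_lag):
    """Return the lag in samples (0..max_lag) where sig best lines up with ref."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Pair ref[i] with sig[i + lag] and sum the products (brute-force correlation).
        score = sum(r * s for r, s in zip(ref, sig[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic test signal (not real drum data): the same short pulse,
# placed 80 samples later in the second channel.
n = 512
pulse = [1.0, 0.6, -0.4, 0.2]
ref = [0.0] * n
sig = [0.0] * n
ref[100:104] = pulse
sig[180:184] = pulse

lag = estimate_lag(ref, sig, max_lag=200)
print(lag)  # 80
```

Running this against each of the other three channels in turn would give a list in the same shape as the `0 80 125 82` above. A real version would want a windowed, probably FFT-based, correlation rather than this brute-force loop.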
At this point the math slightly departs from their paper. In testing, and reading through some of the sources with my friend (this hefty book), the way they are computing the speed of sound through the medium (s) produces some wildly different results. From the paper they are computing s = 2πrf/2.045, which in our case would yield a speed of sound through the drum head of 135.4m/s (with r = 0.1778m (i.e. 7") and f = 248Hz). In our case, because I was thorough/detailed on how I was hitting the drum, we can actually compute s based on real-world measurements:
So with this, we get an s of 84.32m/s, which is much closer to simply computing 2rf, which produces 88.19m/s. This becomes important later on when it comes to the intersections.
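As a quick sanity check on those two numbers, here is the arithmetic as a small sketch. The variable names are mine; the formulas and the values of r and f are the ones quoted in this post.

```python
import math

r = 0.1778   # drum radius in metres (7")
f = 248.0    # fundamental frequency in Hz

# Formula as I am reading it from the paper: 2*pi*r*f / 2.045
s_paper = 2 * math.pi * r * f / 2.045

# The simpler estimate: 2*r*f
s_simple = 2 * r * f

print(round(s_paper, 1), round(s_simple, 2))  # about 135.5 and 88.19
```

The measured 84.32m/s sits within about 5% of the 2rf estimate, while the paper-style figure is roughly 60% higher.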
With s and the delta offsets (which are converted to seconds, to have consistent units), that produces a quadratic equation per pair of sensor points (S1/S2, as per their paper).
This takes us kind of to the original point in that when solving for two positions (S1 and S2 in my picture above, which I have labelled cardinally as (N)orth and (E)ast) I end up with the equations:
0.000171(x)^2 - 0.00023(y - 101)^2 = 1
0.00054(y)^2 - 0.00012(x - 101)^2 = 1
(these are in mm/s rather than m/s as I think that will be a more useful final unit)
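To show where coefficients like these come from, here is a rough sketch using the standard hyperbola relations: a is half the path-length difference (s times the time delta, divided by 2), c is half the distance between the two sensors, and b² = c² − a². Two things here are assumptions on my part rather than anything stated above: that c is 101mm (read off the "- 101" terms), and that the first equation corresponds to a delta of 80 samples and the second to 125 − 80 = 45 samples. `hyperbola_coeffs` is a made-up helper name.

```python
SAMPLE_RATE = 44100          # Hz, as recorded
SPEED_MM_S = 84.32 * 1000    # measured s, converted from m/s to mm/s
HALF_SPACING_MM = 101.0      # assumption: half the sensor-to-sensor distance


def hyperbola_coeffs(delta_samples, speed=SPEED_MM_S,
                     c=HALF_SPACING_MM, rate=SAMPLE_RATE):
    """Return (1/a^2, 1/b^2) for the TDOA hyperbola of one sensor pair."""
    dt = delta_samples / rate          # samples -> seconds
    a = speed * dt / 2                 # half the path-length difference, in mm
    b_sq = c * c - a * a               # b^2 = c^2 - a^2
    if b_sq <= 0:
        raise ValueError("delta too large for this sensor spacing and speed")
    return 1 / (a * a), 1 / b_sq


first = hyperbola_coeffs(80)           # roughly (0.000171, 0.00023)
second = hyperbola_coeffs(125 - 80)    # roughly (0.00054, 0.00012)
print(first, second)
```

Under those assumptions the numbers land on the same coefficients as the two equations above, to the precision shown.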
When these two are plotted that yields this:
This would produce (up to) four intersections in theory:
But given the initial delta values of 0 80 125 82, that means that the (valid) intersection of these two particular hyperbolas would be (-76.8718, 94.2476) (top left quadrant intersection) as the sound arrived there first (the 0 in the list of deltas).
So the math/logic of this is that given the TOA of the wavefront at those two sensors (N and E), (-76.8718, 94.2476) is the point where the strike happened.
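For anyone wanting to reproduce that intersection numerically, here is a minimal sketch using a plain 2D Newton iteration on the two equations above. This is just one way to do it, not necessarily how it would be solved in the final version; the starting guess of (-80, 90) is hand-picked to sit in the top left quadrant, and the function names are made up.

```python
def residuals(x, y):
    """Both hyperbola equations rearranged to equal zero at an intersection."""
    f1 = 0.000171 * x**2 - 0.00023 * (y - 101)**2 - 1
    f2 = 0.00054 * y**2 - 0.00012 * (x - 101)**2 - 1
    return f1, f2


def jacobian(x, y):
    """Partial derivatives of f1 and f2 with respect to x and y."""
    return (2 * 0.000171 * x, -2 * 0.00023 * (y - 101),
            -2 * 0.00012 * (x - 101), 2 * 0.00054 * y)


def solve_intersection(x, y, iterations=20):
    """Newton's method in two variables, starting from the guess (x, y)."""
    for _ in range(iterations):
        f1, f2 = residuals(x, y)
        a, b, c, d = jacobian(x, y)
        det = a * d - b * c
        # Cramer's rule for the Newton step J * step = -f
        x -= (f1 * d - b * f2) / det
        y -= (a * f2 - c * f1) / det
    return x, y


px, py = solve_intersection(-80.0, 90.0)
print(round(px, 2), round(py, 2))  # approximately -76.87 94.25
```

Which of the (up to) four intersections it lands on depends on the starting guess, so in practice the guess would need to come from which sensor heard the strike first.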
This is then repeated for each of the remaining pairs (ES, SW, WN):
Then the mean of each of those intersections produces
Which is really accurate as this was a strike 21mm diagonally in from point N on this plot.
I did have a mistake when I posted about 6 possible intersections as even though there are two more geometric intersections in this projection (red+green and blue+purple) those are incidental to the projection and those hyperbolas don’t actually intersect in “reality”. As in, I can compute North/South and West/East as additional hyperbolas, which would produce two more quadratic equations, but I have not (yet). Mainly because it’s not as straightforward to rotate/translate the hyperbolas onto this projection.
Now the reason I went on and on about the constant s is that if I follow the paper to a T and use 2PIrf/2.045, I instead get this projection:
In this case two of the hyperbolas don’t intersect at all (red+blue, green+purple) and the ones that do intersect would produce a strike that would be physically off the drum.
Phew, that was more than I was planning on typing out!
Was useful for me to write the whole process/thinking out though.
So yeah, that’s what’s going on, and why/how these equations are being arrived at. And for the moment, each set of points I’ve put into it returns a good/plausible result. The problem is trying to do this in “realtime”: a strike on the drum triggers an onset detection algorithm, which then does a windowed cross-correlation (or waits for 3 more peaks to happen in amplitude-land) to produce some time deltas. Those are then shoved into all the maths after that, hopefully to return (“instantly”) an XY coordinate of where the strike happened on the drum.