yes, but coming up with a mipmap format is not really an issue. we have different requirements than Reaper, which produces mipmaps of entire media files on import or record. the issue is deciding when to build the waveforms. (or mipmaps, though i’m not convinced they’re needed or even helpful in the norns context, since they’d require interpolation for drawing as well as for mapping.)
have given this a good amount of thought.
softcut client has an offline buffer processing class, which would be a natural place to render waveform data. my current thinking is that these renders should happen by request, over arbitrary regions, without any attempt to “optimize” re-rendering. this way we place no additional computational burden on the audio thread; and since the computation still happens in the jack-client process, we don’t need to share the buffer memory directly with the client (which could also be a performance issue.)
passing the data back to the client is not really a problem. we already have a data pipeline for 128-byte packets from crone to lua. that’s as many pixels as can fit horizontally on the norns screen, and more than enough resolution per X position. (exactly enough, if we pack signed 4-bit min/max values.)
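for concreteness, here’s a minimal sketch of that 4-bit min/max packing: one byte per pixel column, high nibble = max, low nibble = min, each a signed 4-bit value. the function names are illustrative, not the actual crone API.

```cpp
#include <cstdint>
#include <cmath>
#include <algorithm>

// pack a column's min/max (each in [-1, 1]) into one byte:
// high nibble = signed 4-bit max, low nibble = signed 4-bit min.
// illustrative helper, not the actual crone/norns API.
inline uint8_t pack_minmax(float mn, float mx) {
    auto q = [](float v) -> int {
        int s = static_cast<int>(std::lround(v * 7.f));
        return std::clamp(s, -8, 7) & 0x0f; // two's-complement nibble
    };
    return static_cast<uint8_t>((q(mx) << 4) | q(mn));
}

// recover approximate values on the drawing side
inline void unpack_minmax(uint8_t b, float &mn, float &mx) {
    auto dq = [](int n) -> float {
        if (n > 7) n -= 16; // sign-extend the nibble
        return n / 7.f;
    };
    mx = dq(b >> 4);
    mn = dq(b & 0x0f);
}
```

128 of these bytes fill exactly one packet, one per horizontal pixel.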
another consideration is what tradeoffs to make in the peak computation itself, for efficiency, vs. accuracy, vs. the possibility of missing peaks altogether. reaper is a DAW and its peak rendering is designed for offline computations on fast machines - it finds the true peak over each interval. we are dealing with a more constrained environment, and our peaks can be lower resolution. &c.
in principle the softcut processing classes could keep track of dirty regions, but i’m not sure that’s feasible in practice.
more explicitly, i’d propose:
- client requests rendering waveform with N points, of region [a, b] in seconds.
- (we can assume N = 128, horizontal resolution of norns screen)
- samplerate sr
- duration d = b - a
- audio buffer is discrete-time signal x = {x[n]}, n \in [a * sr, b * sr]
- waveform is another signal y = {y[m]}, m \in [1, N]
- then y[m] = max(\{ x[w] : w = floor(sr * (a + (m-1) * d/N)), ... , floor(sr * (a + m * d/N)) \})
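a minimal sketch of that render, assuming float frames in memory and C++17. `renderPeaks` is an illustrative name, not the softcut client API; it uses 0-based m, so the text’s y[m] with m in [1, N] corresponds to y[m-1] here.

```cpp
#include <vector>
#include <algorithm>
#include <limits>

// true-peak render of region [a, b] (seconds) into N points, following the
// definition above: y[m] is the max of x over the m-th sub-interval.
// a sketch only, not the actual softcut client API.
std::vector<float> renderPeaks(const std::vector<float> &x, float sr,
                               float a, float b, int N) {
    std::vector<float> y(N, 0.f);
    const double d = b - a;
    for (int m = 0; m < N; ++m) {
        // frame index range for this output point
        size_t w0 = static_cast<size_t>(sr * (a + m * d / N));
        size_t w1 = std::min(static_cast<size_t>(sr * (a + (m + 1) * d / N)),
                             x.size());
        float peak = -std::numeric_limits<float>::infinity();
        for (size_t w = w0; w < w1; ++w) {
            peak = std::max(peak, x[w]);
        }
        y[m] = (w0 < w1) ? peak : 0.f;
    }
    return y;
}
```

(for drawing, one would likely track min and max per interval rather than max alone, per the 4-bit packing mentioned earlier; the loop structure is the same.)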
the main little issue being that the execution time to compute the true peak depends on d (that is, on how many terms the ellipsis above covers.) that’s why precomputing mipmaps is helpful in the first place - each lower resolution can use the peaks from the next-higher resolution. but i’m not really seeing a way to do this that is suitably dynamic and performant.
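for reference, the mip reduction step itself is trivial - each coarser level just takes the max of adjacent pairs from the finer one, so it never touches raw samples. a sketch, with an illustrative name:

```cpp
#include <vector>
#include <algorithm>

// one mipmap reduction step: each point at level k+1 is the max of two
// adjacent points at level k, so coarser levels reuse finer peaks
// instead of rescanning samples. illustrative sketch only.
std::vector<float> reducePeaks(const std::vector<float> &fine) {
    std::vector<float> coarse;
    coarse.reserve((fine.size() + 1) / 2);
    for (size_t i = 0; i < fine.size(); i += 2) {
        float p = fine[i];
        if (i + 1 < fine.size()) p = std::max(p, fine[i + 1]);
        coarse.push_back(p);
    }
    return coarse;
}
```

the hard part isn’t this step - it’s deciding when to (re)build the pyramid as the buffer changes, which is the dynamism problem above.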
anyways, i’m happy to provide the functionality described above, and leave it up to the script to set performance-appropriate limits on usage. we could even have the peak-finding index w advance by some stride that is >1 for long regions; a crude downsampling which will certainly add error to peak estimation, but i’d guess not significantly so if the SR divisor is kept fairly small.
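a sketch of that strided variant, with the stride left to the caller (i.e. the script) as described; `stridedPeak` is an illustrative name:

```cpp
#include <vector>
#include <cmath>
#include <algorithm>

// peak estimate over frames [w0, w1) with the scan index advancing by
// `stride` frames: a crude downsampling that bounds cost for long regions
// at the price of possibly missing peaks. illustrative sketch only.
float stridedPeak(const std::vector<float> &x, size_t w0, size_t w1,
                  size_t stride) {
    float peak = 0.f;
    for (size_t w = w0; w < w1 && w < x.size(); w += stride) {
        peak = std::max(peak, std::abs(x[w]));
    }
    return peak;
}
```

note that with stride 2 on a buffer like {0.2, 1.0, 0.3, 0.6}, the true peak at index 1 is skipped entirely - hence keeping the SR divisor fairly small.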