I've not had a chance to look at the patch, so this may be a red herring, but perhaps the CPU percentages are misleading?
The rPi is quad core, and a single Pd process (without using pd~) is single threaded, so perhaps Pd is reporting % of the one core it's using, whilst your ‘system graph’ is reporting % of all cores.
(Upshot: Pd can only get a max of ~25% of the quad-core CPU as a whole, without spinning off other Pd processes with pd~.)
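To make the arithmetic concrete, here's a tiny sketch of the conversion between "% of one core" and "% of the whole CPU". The function name is just made up for illustration, and it assumes all four cores are counted equally by whatever is drawing the system graph.

```python
def share_of_whole_cpu(per_core_percent: float, cores: int = 4) -> float:
    """Convert a load given as % of one core into % of the whole CPU."""
    return per_core_percent / cores

# A single-threaded process pegging one core of a quad-core chip:
print(share_of_whole_cpu(100.0))  # 25.0
# The same process using 80% of its core:
print(share_of_whole_cpu(80.0))   # 20.0
```

So a system graph sitting at 20-25% could still mean the one core Pd runs on is nearly saturated.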
If you use top, you can tell it to report each core separately (press 1 while it's running), which might help you see whether Pd is maxing out a core.
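If you'd rather log per-core load than watch top, something like the following might do. It's a rough sketch that assumes a Linux system exposing /proc/stat (as Raspberry Pi OS does); the function names are my own, and it counts idle plus iowait as "not busy".

```python
import time

def read_cpu_times(stat_text: str) -> dict:
    """Parse per-core lines ('cpu0', 'cpu1', ...) into (busy, total) jiffies."""
    times = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Skip the aggregate 'cpu' line and anything that isn't a core line
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # user, nice, system, idle, iowait, irq, softirq, steal
        values = [int(v) for v in fields[1:9]]
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        total = sum(values)
        times[fields[0]] = (total - idle, total)
    return times

def per_core_percent(before: dict, after: dict) -> dict:
    """Busy percentage of each core between two snapshots."""
    result = {}
    for core, (busy1, total1) in after.items():
        busy0, total0 = before[core]
        delta = total1 - total0
        result[core] = 100.0 * (busy1 - busy0) / delta if delta else 0.0
    return result

def sample_per_core(interval: float = 1.0) -> dict:
    """Take two snapshots of /proc/stat 'interval' seconds apart."""
    with open("/proc/stat") as f:
        before = read_cpu_times(f.read())
    time.sleep(interval)
    with open("/proc/stat") as f:
        after = read_cpu_times(f.read())
    return per_core_percent(before, after)
```

Calling `sample_per_core()` while the patch is running returns a dict like `{"cpu0": ..., "cpu1": ...}`; one core close to 100% with the others mostly idle would fit the single-threaded explanation.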
Also, CPU percentages are averages, whereas you'll get audio glitches even if only a few ‘cycles’ take too long and cause underruns on the audio thread.
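Here's a toy illustration of why an average can look healthy while individual blocks still miss their deadline. The timings are invented, and I'm assuming a 44.1 kHz sample rate with Pd's default 64-sample block; whether a late block actually turns into an audible underrun depends on how much audio is already buffered.

```python
SAMPLE_RATE = 44100   # Hz (assumed)
BLOCK_SIZE = 64       # samples; Pd's default DSP block size
DEADLINE_MS = 1000.0 * BLOCK_SIZE / SAMPLE_RATE  # ~1.45 ms per block

def summarise(block_times_ms):
    """Return (average load in %, number of blocks that overran the deadline)."""
    avg_load = 100.0 * sum(block_times_ms) / (len(block_times_ms) * DEADLINE_MS)
    late = sum(1 for t in block_times_ms if t > DEADLINE_MS)
    return avg_load, late

# Hypothetical timings: 997 cheap blocks and 3 expensive ones
times = [0.4] * 997 + [3.0] * 3
avg, late = summarise(times)
print(f"average load {avg:.0f}%, blocks over deadline: {late}")
```

With these made-up numbers the average load is under 30%, yet three blocks took roughly twice as long as their time slot allows.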
What I'd probably try initially is increasing the audiobuf/delay size in the audio settings and seeing if that helps (and it should). If it does, that might tell you the patch has some inefficiencies.
Finally, with something like a rPi, what I would recommend is: when developing (so using the GUI), increase the audio buffer size, but for ‘performance’ run Pd with -nogui and reduce the buffer size to get lower latency.
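As a sketch of the two setups, this builds the command lines using Pd's -audiobuf (in milliseconds) and -nogui startup flags. The patch name and the 80 ms / 10 ms values are placeholders rather than recommendations, and `pd_command` is just a helper I've made up; you'd need to experiment to find what your patch tolerates.

```python
def pd_command(patch, audiobuf_ms, gui=True, pd_binary="pd"):
    """Build an argument list for launching Pd with a given audio buffer."""
    cmd = [pd_binary, "-audiobuf", str(audiobuf_ms)]
    if not gui:
        cmd.append("-nogui")
    cmd.append(patch)
    return cmd

# Developing: GUI on, generous buffer (placeholder value)
develop = pd_command("mypatch.pd", audiobuf_ms=80)
# Performing: no GUI, smaller buffer for lower latency (placeholder value)
perform = pd_command("mypatch.pd", audiobuf_ms=10, gui=False)

print(" ".join(develop))
print(" ".join(perform))
```

Either list can be passed to `subprocess.run(...)` to actually start Pd, assuming `pd` is on your PATH.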