The limit of 8 delay slots is pretty arbitrary.
Each delay slot looks like it takes about 100 bytes by itself; I'm not really sure what other memory it requires.
With 70kB of RAM to spare, I don't see why the delay stack couldn't be much larger. It should be as easy as changing the constant and recompiling. There might be some computational overhead with large delay stacks, however.
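To put rough numbers on that: at about 100 bytes per slot, 8 slots is around 800 bytes and 64 slots is around 6.4kB, still a small part of 70kB. Here is a minimal sketch of a fixed-size delay stack driven by one compile-time constant. All names (`DELAY_SLOTS`, `COMMAND_BYTES`, `delay_slot_t`, `delay_add`, `delay_tick`) and the slot layout are my own assumptions for illustration, not the firmware's actual identifiers or structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants: the real firmware's names and sizes may differ. */
#define DELAY_SLOTS   8   /* raise this and recompile for a larger stack */
#define COMMAND_BYTES 96  /* assumed size of a stored command */

typedef struct {
    int32_t remaining_ms;            /* 0 means the slot is free */
    uint8_t command[COMMAND_BYTES];  /* placeholder for the delayed command */
} delay_slot_t;

typedef struct {
    delay_slot_t slots[DELAY_SLOTS];
} delay_stack_t;

/* Memory needed for a stack of slot_count slots. */
size_t delay_stack_bytes(size_t slot_count) {
    return slot_count * sizeof(delay_slot_t);
}

/* Store a delay in the first free slot. Returns the slot index,
 * or -1 if every slot is in use. */
int delay_add(delay_stack_t *s, int32_t ms) {
    assert(ms > 0);
    for (int i = 0; i < DELAY_SLOTS; i++) {
        if (s->slots[i].remaining_ms == 0) {
            s->slots[i].remaining_ms = ms;
            return i;
        }
    }
    return -1;
}

/* Advance all pending delays by elapsed_ms and free the ones that expire.
 * Returns how many fired. Every slot is visited on every tick, so the
 * cost grows linearly with DELAY_SLOTS. */
int delay_tick(delay_stack_t *s, int32_t elapsed_ms) {
    int fired = 0;
    for (int i = 0; i < DELAY_SLOTS; i++) {
        delay_slot_t *slot = &s->slots[i];
        if (slot->remaining_ms > 0) {
            slot->remaining_ms -= elapsed_ms;
            if (slot->remaining_ms <= 0) {
                slot->remaining_ms = 0;  /* a real version would run the command here */
                fired++;
            }
        }
    }
    return fired;
}
```

In a layout like this, the linear scan in `delay_tick` is where the overhead would come from: it runs once per timer tick regardless of how many delays are pending, so a much larger stack means proportionally more work per tick.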
That said, your solution currently uses SCRIPT to perform a subroutine. The other options are to develop a macro for the PRE combination L + DEL, or to alter the parser to accept nested PREs.
TR.TRILL seems very specific. It's musically useful, and this is a musical computer, so it might make sense, but implementing it would come at a cost for an op that only does one thing.