Wow, this is long… sorry! This is something I’ve been thinking about in another context and have been meaning to explore. This was the right opportunity!
I’m wondering if all of these are solutions in search of a problem… or at the very least, a problem that our domain - musicians building music equipment for solo or small ensemble performance - does not have.
Sure - if you are building a 96 channel studio, and want to pipe 96 channels around between several endpoints, and want to do that on Ethernet with sample level sync… then yes: 24 bits × 48kHz × 96 channels × 4 endpoints ≈ 440Mb/s. That’s asking a lot from both GigE and software stacks.
BUT - I imagine that for the kinds of uses the people in this community might put audio over ethernet to, something much simpler is in order:
Audio over Ethernet as point-to-point connection: I’m thinking connecting a computer to an interface w/4 ch out, or say stereo in & out to an effects device. 16 bits × 48kHz × 4 channels ≈ 3Mb/s - or call it 9Mb/s if you want to double your sampling rate and use more bits (24 bits × 96kHz × 4 channels). This seems to me well within the realm of commodity interfaces and the standard network stack, even striving for, say, 1ms latency.
Small studio of devices connected over Ethernet via a standard switch: Okay, so think three audio sources (synths) with stereo out, three effects units with stereo in/out, an analog modular interface with 4 each in/out, and an audio interface with 4 each in/out. That’s still only 34 channels total (6 + 12 + 8 + 8). So - say - 24 bits × 48kHz × 34 channels ≈ 40Mb/s. This probably still doesn’t strain a standard consumer GigE switch - and the software stacks at each device should still cope.
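If anyone wants to check the back-of-envelope numbers above, here is a quick sketch. The function name `raw_audio_mbps` and the scenario labels are just mine for illustration, and it only counts raw PCM payload, no packet overhead at all.

```python
def raw_audio_mbps(bits, sample_rate_hz, channels, endpoints=1):
    """Raw PCM payload rate in Mb/s (megabits), ignoring all packet overhead."""
    return bits * sample_rate_hz * channels * endpoints / 1e6

# Channel count for the small studio: 3 synths (stereo out),
# 3 effects units (stereo in + out), modular interface (4 in + 4 out),
# audio interface (4 in + 4 out).
small_studio_channels = 3 * 2 + 3 * (2 + 2) + (4 + 4) + (4 + 4)

scenarios = {
    "96 ch studio, 4 endpoints, 24-bit/48kHz": raw_audio_mbps(24, 48_000, 96, 4),
    "point-to-point, 4 ch, 16-bit/48kHz": raw_audio_mbps(16, 48_000, 4),
    "point-to-point, 4 ch, 24-bit/96kHz": raw_audio_mbps(24, 96_000, 4),
    "small studio, 24-bit/48kHz": raw_audio_mbps(24, 48_000, small_studio_channels),
}

for name, mbps in scenarios.items():
    print(f"{name}: {mbps:.1f} Mb/s")
```

Note these are megabits per second throughout, so they compare directly against the 1000Mb/s of a GigE link.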
I’m assuming here that what we’d more likely want from audio over Ethernet is ease and flexibility of digital audio interconnections between our devices. I don’t think we need to keep all these things in sample sync: 1ms or better latency (in to out, separate from any processing latency the unit itself has) is probably fine.
Now - I haven’t done all the calcs of overhead and congestion here - but my day gig is architect of enterprise network systems (!) - so I don’t think I’m that far off.
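For a first rough pass at the overhead part, here is a sketch under some loud assumptions of mine: each stream sends one UDP-over-IPv4 packet per millisecond (to fit the 1ms target), the payload is plain uncompressed PCM, and each packet fits in a single Ethernet frame. The name `wire_mbps` is hypothetical, and this says nothing about congestion, queuing in the switch, or jitter.

```python
# Per-packet overhead in bytes (standard Ethernet, IPv4 and UDP sizes).
ETH_OVERHEAD = 8 + 14 + 4 + 12   # preamble+SFD, MAC header, FCS, interframe gap
IP_UDP_OVERHEAD = 20 + 8         # IPv4 header (no options) + UDP header

def wire_mbps(bits, sample_rate_hz, channels, packet_ms=1.0):
    """On-the-wire rate in Mb/s for one stream sending a packet every packet_ms.

    Assumption: every packet fits in one frame (payload under the usual
    1500-byte MTU) and is big enough that no minimum-frame padding applies.
    """
    packets_per_s = 1000.0 / packet_ms
    samples_per_packet = sample_rate_hz * packet_ms / 1000.0
    payload_bytes = samples_per_packet * channels * bits / 8
    frame_bytes = payload_bytes + ETH_OVERHEAD + IP_UDP_OVERHEAD
    return frame_bytes * 8 * packets_per_s / 1e6

# 4 channels of 16-bit/48kHz audio, one packet per millisecond.
print(f"{wire_mbps(16, 48_000, 4):.2f} Mb/s on the wire")
```

With these assumptions the overhead is a fixed 66 bytes per packet, so roughly 0.5Mb/s per stream at one packet per millisecond, whatever the channel count. That is noticeable next to a 3Mb/s payload, but tiny next to a GigE link.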
The key thing to think about is “What is the use case?” AVB looks to me like it was developed for the use cases of the film and broadcast video studio - probably with a bit of over-engineering - because those users will prefer to spend upfront for systems with more specs than they need now (retrofitting studios is expensive).
I think we might have a different set of needs…