Teletype possible bug: delaying triggers over i2c

My main combo is ER-301 sequenced from teletype over i2c. I have a problem that I hope can be worked out, although I realize it might be tricky, hence this lengthy post in a new thread.

After a fresh reboot everything works just fine. However, after “some time” (about 30 minutes or more) some triggers over i2c start to get delayed, APPEARING EXACTLY ON THE NEXT M CYCLE, and sometimes they are missed entirely. Rebooting both modules makes everything fine again. The problem started while I was working on a new track; I never had it (or noticed it) in the past. Although it’s a bit busy over i2c, it’s not more so than what I usually do. In this particular scene, triggers are sent every 130 ms (16th notes at ~115 BPM), and obviously at some points more than one trigger and CV are sent at the same time (or as “at the same time” as is possible over what I believe to be a serial protocol). CPU usage on the ER-301 hovers comfortably around 50%. The ER-301 is running firmware 0.4.20. Between occurrences of the problem there have been several saves on both the teletype and the ER-301. I have not found any way (except rebooting) to solve the problem.
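As a quick sanity check on the timing quoted above, the metro interval for 16th notes can be worked out from the tempo. This is just a sketch; the function name is made up for illustration.

```python
def sixteenth_note_ms(bpm: float) -> float:
    """Length of one 16th note in milliseconds at the given tempo."""
    quarter_note_ms = 60_000.0 / bpm  # one beat (quarter note) in milliseconds
    return quarter_note_ms / 4.0      # four 16th notes per beat


interval = sixteenth_note_ms(115)
print(f"{interval:.1f} ms per 16th note at 115 BPM")
```

So a trigger that slips by exactly one M cycle lands roughly 130 ms late, which is a full 16th note at this tempo.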

My i2c setup is teletype (newest hardware revision) with ER-301 and one txo. I have never experienced this with physical triggers, and I’m pretty sure it didn’t happen before I got the txo in the mix. We briefly discussed this on the OD forum, and it seems other users have experienced the same; from what I can tell they all have more than tt + 301 on the i2c bus.

One could argue that the teletype is simply overloaded, but it just seems odd that it runs happily for quite some time before the problem appears, and after that point it’s very pronounced.

A few observations:

  1. Shuffling scripts around sometimes makes it happen to other triggers.
  2. There’s no way (that I know of) to tell how stressed out the teletype is, so maybe it simply can’t keep up.
  3. The delayed triggers are always rhythmically tight, exactly a 16th note (130 ms) late: just delayed by one teletype metro cycle.
  4. The delayed triggers appear visually on the ER-301 GUI exactly as I hear them.

I hope we can discuss the problem here and hopefully get to the bottom of it, since it’s obviously quite annoying :slight_smile:

Here is the scene in question: pessimism_is_overrated.txt (2.0 KB)

I can only add that I’ve seen disappearing triggers from Teletype with Just Type (no TXo, no busboard, ER-301 is in the chain), especially in Geode mode. Sometimes micro delays on a few lines of the script helped but in a nonpredictable way.

Same problem here…
(copy/paste of my post from the ER-301 forum: link)

I have an ER-301, a teletype (the old one with the green PCB), a 16n and a TXi, all connected over i2c on a TT busboard.
I have always had some random step jumps.
I use SC.TR.P x for triggering a kick, but sometimes a step jump or an eaten beat appears…

I have tried other cases, another power brick and a LiPo battery.
I changed the i2c cables and tried different configurations, including just teletype and ER-301, and I always have the same problem.

This is a link with 2 videos, the quick save and the TT script.

I always have a little delay (you can see it on the red LED flashing on the 4th note) and some random step jumps on the Kick/Clap sounds: trigger delay and step jump

I tried with firmware 0.3.25 and 0.4.2.

What do you think @scanner_darkly and @tehn? Any suggestions on how to move forward?

so if i understand correctly, the issue comes down to some i2c commands being dropped? possible reasons could include:

  • teletype not sending it - this would be easy to test by also outputting a trigger on one of the TT outputs. of course there could be conditions which would result in a script not getting executed at all or being delayed, but doesn’t sound like it’s the case here.

  • i2c not being 100% reliable - more likely, and unfortunately not easily remedied. there are several variables that contribute to this: number of devices connected, the length of i2c cables, the overall pullup resistors value etc etc. not easy to investigate either unless you have an oscilloscope. the only thing to try is eliminating as many variables as possible - put er-301 and teletype into a case with nothing else, connect with the shortest i2c cable and see if you still get dropped commands.

  • er-301 being busy with something else when an i2c message arrives. a good way to test it would be having something like telexo or just friends also connected and send a trigger to both that and er-301 and also to one of the teletype trigger outputs and use it to generate another sound - and then monitor everything and see if only one of modules doesn’t receive a trigger.

if it’s reason #3 there isn’t much that can be done i’m afraid (but i’m not an expert on i2c so treat this as just an informed opinion). teletype doesn’t confirm whether a follower received a message. the i2c protocol supports acknowledgement from followers at the end of a broadcast but it’s only used for i2c bus management, not to confirm a message was received. supporting the latter is possible but would require a non-trivial redesign of the i2c library. another option is to ask for a confirmation explicitly and implement retry logic - this would require a fairly big change in both teletype and all of the followers, so likely not a feasible solution.

Thanks @scanner_darkly for getting back on this!

Could you please take a look at this set of test files? Notice the video where TT is bonkers. It’s not even triggering on t == 0 as it’s supposed to. I agree that it can be a lot of things, but it looks to me like the TT is really, really confused in this state.

Test files

I have a Teletype forwarding the values of a faderbank to the 301, and one day I tried to also send a rhythmic pattern to the 301; that’s how I noticed a similar issue (fully described here: Teletype workflow, basics, and questions).

Triggers sent to the 301 from Teletype become delayed after a few minutes, there’s some kind of jitter.

But in my case this seems to happen only if the Teletype is also reading the values of a Faderbank on the same bus at a rate of 25ms (internal metronome)

Do you think that reading 16 values every 25 ms is too much for the Teletype? If it’s not, then the 301 might be the “culprit”. The TT is right next to the 301 in my case, not more than 15 cm of cable I think.

sorry, i’m not able to provide teletype support at this point, other than the areas i contributed to (such as grid integration). i can take a quick look but as mentioned, let’s eliminate as many variables as possible before we get to that.

could both of you try a very simple setup, just teletype sending triggers to er-301 and one of its physical trigger outputs, and see if you can replicate the issue?

@a773 i’m not able to see the videos. if you disable i2c lines does it still act weird?

and just to reiterate, i2c is not 100% reliable and we’ve been pushing it far beyond what was originally planned for teletype. something like reading 16 values every 25ms is definitely a pretty extreme use case.

my armchair engineer’s 0.02c is to underline these points:

each byte is ACK’d, but there’s no timeout/retry mechanism except what is rolled in software for important configuration steps. it’s not feasible to make each transmission robust in a soft-realtime scenario. think UDP, not TCP.

if ER-301 is busy when a message comes in, or if its RX buffer is full or something (i don’t know how the event stack works on that device), then the message is gone.

there could also be more than one thing happening here.

  • dropped messages, but solid timing, sounds like the receiver missing it. (or, equivalently, TT i2c tx buffer is full b/c the bus is claimed too often for it to get rid of all pending tx’s.)

  • “jitter” sounds like too much traffic. the entire bus can only have one transaction on it at once. the maximum clock speed in “standard mode” is 100 kbit/s; iirc this is what TT sets as master. but that is a maximum - it just means that compliant hardware has to be able to handle this clock rate in silicon. in practice, since i2c is an open-drain bus, it can get very slow with multiple things reading and writing from it - every device can “stretch” the clock by holding the line down for an arbitrary amount of time. usually there is some inductance and slew (that’s why cable runs should be short) - to accommodate this, devices will wait a moment after pulling the line down and then check the value by reading it.

it is totally possible to corrupt the bus entirely by exceeding the throughput rate momentarily, and i suspect that is what’s happening with these long-uptime scenarios.

if you’re sending 16 values (16-bit values? dunno) every 25 ms (or 40/s) then that’s 10kbps right there. it sounds conservative, but with multiple devices and clock stretching it’s really not.
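a rough back-of-envelope version of that estimate, as a sketch only. the payload figure follows from 16 values of 16 bits each, polled every 25 ms as described earlier in the thread. the per-read wire cost of 5 bytes is purely an assumption for illustration (address + command byte for the write, then address + two data bytes for the read), not something taken from the teletype or faderbank firmware, and all function names are made up.

```python
STANDARD_MODE_BPS = 100_000  # i2c standard-mode maximum clock rate, bits per second
CLOCKS_PER_BYTE = 9          # 8 data bits plus 1 ACK bit for every byte on the wire


def payload_bps(values: int, bits_per_value: int, period_ms: float) -> float:
    """Raw data rate, ignoring all protocol overhead."""
    polls_per_second = 1000.0 / period_ms
    return values * bits_per_value * polls_per_second


def wire_bps(values: int, bytes_per_read: int, period_ms: float) -> float:
    """Clock cycles per second spent on the bus.

    Start/stop conditions and clock stretching are ignored, so this is a lower bound.
    """
    polls_per_second = 1000.0 / period_ms
    return values * bytes_per_read * CLOCKS_PER_BYTE * polls_per_second


# ASSUMPTION: 5 bytes on the wire per fader read (hypothetical, see the text above).
ASSUMED_BYTES_PER_READ = 5

raw = payload_bps(values=16, bits_per_value=16, period_ms=25)
wire = wire_bps(values=16, bytes_per_read=ASSUMED_BYTES_PER_READ, period_ms=25)

print(f"payload:     {raw:.0f} bit/s")
print(f"on the wire: {wire:.0f} bit/s")
print(f"share of standard mode: {wire / STANDARD_MODE_BPS:.1%}")
```

under these assumptions the fader polling alone would occupy more than a quarter of the theoretical maximum, before any clock stretching or trigger traffic is counted.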

in a nutshell: i would be very conservative about realtime applications of i2c for musical sequencing and making dance music and stuff. we used i2c in the buchla 200e and quickly ran into these limitations. it’s (basically) fine for a midi keyboard limited by human input rates. it’s not fine for dense machine music. it’s fine for system / preset management stuff where there is the luxury of time to ACK/timeout/resend every piece of data.

finally: TT is a tiny old processor. the i2c tx buffer is peripheral driven, but the core loop still has to stuff bytes in it and handle interrupts when it’s done. so timing could be well mangled within TT by attempting to send too much.

recommended reading:


Sure, and thanks again for everything you’ve done for the Teletype :wink:

Yeah I understand, that seems to be a lot of data indeed. At the moment I’m only using Teletype as a bridge between Faderbank and 301, maybe I should only use the faderbank with its CV outs, this way I could use the TT again :slight_smile:

When messed up:

Fine after reboot:

Do you mean if I pull the i2c cables, or if I just don’t run lines with TO or SC commands? See my answer at the bottom (“I’m pretty sure…”), but of course I’ll run these tests first thing in the morning :slight_smile:

Who would be the best person to contact for help?

I agree, and understand. I will have to investigate further, but I’m pretty sure (from my experience and hearing others’ stories) that the problem is only there when reading and writing over i2c. Seems to me like the problem is that somehow the TT gets itself into a state that it can’t recover from. Ok, we pushed it too hard, but if it would gracefully land on its feet again when the heat was off, I would be perfectly happy. The main problem is that the only way to recover is by reboot…

Sounds like the most plausible explanation to me!

Or by disabling the metronome in my case. Also, for reading the faderbank values I tried triggering the script that does it with a Pamela’s New Workout. I get better results, but of course the PNW has to always run.

Once again, I think I’ll use the faderbank for its CV outputs only, even though that will use all the inputs of the 301 :-/ that seems more reasonable

Not sure this applies in my case. If you look at the first of my videos, you see me stop/start the metronome repeatedly without any effect. Or am I misunderstanding?

to clarify this in case anybody wants to take a dig at making this change - a simple case IS trivial, but a proper implementation wouldn’t be. simple case: i2c_master_tx actually returns a status (TWI_SUCCESS if successful), which teletype doesn’t currently check. so a simple retry logic could be added here - but it would only deal with one scenario. also, i did some testing when working on polyES, and the problem is that this retry logic doesn’t seem to address the underlying issue - whatever causes it to fail in the first place is not likely to work when you try immediately after. so the retry logic would need to be implemented in a way where it doesn’t retry immediately - which would introduce timing issues…
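to illustrate the shape of such a delayed retry, here is a minimal simulation in python (not firmware code). `send` is a hypothetical stand-in for a transmit call that reports success or failure, in the way i2c_master_tx returning TWI_SUCCESS is described above; `FlakyBus`, the attempt count and the delay value are all made up for the example.

```python
import time
from typing import Callable


def send_with_retry(
    send: Callable[[bytes], bool],
    message: bytes,
    max_attempts: int = 3,
    retry_delay_s: float = 0.001,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Try to send, pausing between attempts instead of retrying immediately.

    Returns the number of attempts used on success, or 0 if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if send(message):
            return attempt
        if attempt < max_attempts:
            # This pause is the timing cost: the caller is held up for the
            # delay on every failed attempt.
            sleep(retry_delay_s)
    return 0


class FlakyBus:
    """Hypothetical bus that fails a fixed number of times, then succeeds."""

    def __init__(self, failures_before_success: int) -> None:
        self.remaining_failures = failures_before_success

    def send(self, message: bytes) -> bool:
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            return False
        return True


bus = FlakyBus(failures_before_success=2)
attempts = send_with_retry(bus.send, b"\x01")
print("attempts used:", attempts)
```

the trade-off is visible in the loop: every failed attempt adds `retry_delay_s` before the message can go out, so a trigger that needed retries arrives late even when it does get through.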

disabling i2c commands (you can comment them out with alt-/)

that’s part of the problem - when we ran into i2c issues before one of the fixes applied was to give i2c highest priority (it will block timers / screen refresh / keyboard while i2c broadcast is in progress). so if there is anything that causes i2c to get blocked, it will cause tt to freeze. there is one simple fix that seemed to work okay for polyES, we could try it for teletype, i’ll post a test build tonight or tomorrow.

I will watch your videos now. Sorry, maybe our cases are quite different after all, and I don’t want to derail the thread. Reading 16 faders every 25 ms is definitely the cause of the issue in my case :slight_smile:

That would be awesome, I’ll be delighted to try this out!

I’m pretty sure we’re seeing two sides of the same issue here… So no derailing AFAICC :slight_smile:

here is the test build (it will display .3.1 I2C on startup):

teletype.hex (581.4 KB) (170.6 KB)

make sure to save your presets on USB before you flash, all presets will be erased!!

it has whatever the latest official firmware has plus the i2c fix. to clarify, this will not make i2c more reliable (as a matter of fact it can result in more dropped i2c commands) but it should prevent teletype from freezing if something goes wrong with i2c.

this is a simple protection from infinite loops if the i2c bus becomes locked/corrupted
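the general idea of such a guard can be sketched like this (a python illustration only, not the actual change in the firmware; `wait_for_bus`, `bus_is_busy` and the poll limit are hypothetical names and values). instead of waiting forever for the bus to become free, the wait gives up after a fixed number of polls and the message is dropped.

```python
from typing import Callable


def wait_for_bus(bus_is_busy: Callable[[], bool], max_polls: int = 1000) -> bool:
    """Poll until the bus is free, but never more than max_polls times.

    Returns True if the bus became free, False if we gave up.
    """
    for _ in range(max_polls):
        if not bus_is_busy():
            return True
    return False  # bus looks stuck: bail out instead of looping forever


def stuck_bus() -> bool:
    """Hypothetical bus that never becomes free."""
    return True


if wait_for_bus(stuck_bus, max_polls=10):
    print("bus free, sending")
else:
    print("bus stuck, dropping this message")
```

this matches the trade-off described above: a stuck bus costs a dropped command rather than a frozen module.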

bonus version (3.1 I2C+):

teletype.hex (581.5 KB) (170.7 KB)

this version will retry up to 3 times if i2c send fails - i doubt it will do much good based on my previous tests (whatever causes i2c to fail is not likely to get fixed if you try a couple more times immediately after) but give it a try, maybe it’ll improve reliability in your scenario.

I can confirm that this patch fixes crashes I was encountering when running Ansible as an I2C leader at really high clock rates (like, tap-tempo as fast as I can -> 16x clock multiplier -> 16x clock multiplier -> Ansible). Now TXo crashes if I run this for a few seconds but if I crank the clock multipliers back down Ansible is still running fine. Teletype with another I2C op in the Metro script also crashed after a little while but the build of Teletype I’m running does not have this libavr32 patch.

Thanks a lot, I will try the altered firmware later today!

Would it be possible to either automatically or manually recover from a “corrupted” i2c bus?