well, this is interesting. so i have:

  • only ansible is connected to tt
  • M is running at 10ms
  • L 5 8 : CV I SUB 16000 CV I
  • L 5 8 : TR.PULSE I
  • clock out from Orca into IN1 with Orca running at max speed
  • clock out from MP into IN2 with MP running at max speed

it still locks. if i reset TT and then try doing a single read or a single write to ansible it doesn’t respond and TT locks immediately.


@scanner_darkly - with your script and my TT + Ansible + TXo configuration, the Ansible stopped updating after a few minutes but the TT remained responsive. As before, I could poll the Ansible for CV values and it still dutifully sent them back. It would also accept and return updates over II, but the LED and voltage out remained locked.


I validated that the i2c initialization for the two expanders sets I2C_PULLUP_EXT. So, that shouldn’t be a problem.

Based on a point up above about the TT’s i2c clock rate, I’m wondering if I should force-set the expanders to 400kHz. The max rate the Teensy i2c Library supports at the expander’s clock speed is 3.0M; they are currently set to 2.4M which, ostensibly, gets negotiated down by the reality of the bus.
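
For reference, forcing the rate would look something like this, assuming the expanders use i2c_t3's slave-mode begin() - the address and pins below are placeholders rather than the actual TELEX values, and older i2c_t3 versions take I2C_RATE_400 instead of a raw 400000:

#include <i2c_t3.h>

#define EXPANDER_ADDR 0x60   // placeholder slave address, not the real TELEX address

void setup() {
  // keep the external pull-ups, but pin the bus clock at 400 kHz
  // instead of letting it start at 2.4 MHz
  Wire.begin(I2C_SLAVE, EXPANDER_ADDR, I2C_PINS_18_19, I2C_PULLUP_EXT, 400000);
}

void loop() {}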


I should reiterate the point that @tehn made above for those that are following along - we are doing absolute torture tests to more quickly identify break points and potential issues. This isn't real-life stuff.

What the Teletype has been capable of in these tests is seriously impressive - a testament to the well-designed hardware and software. It has only taken relatively small optimizations to unlock a massive amount of potential in the module and its growing ecosystem: the Trilogy modules, Ansible, JustFriends, Orca, and the TELEX expanders. Given the talent in the community - this feels like only the beginning for one of the most powerful modules in Eurorack!


i noticed too that in some cases the LEDs on ansible stop updating but TT is not locked and i can execute CV reads and writes without problems. didn’t check the actual CV output.

my plan was to check with the picoscope next to see what happens when it locks, managed to lock it almost right away and verified the scope was capturing everything properly, got ready for the actual run and, of course, now i can’t get it to lock :joy:


managed to lock it a couple more times, but not much new info from decoding the i2c bus. everything looks totally fine and then it just stops. this is likely due to how i have the picoscope configured - it captures a buffer after the i2c start (falling edge on the SDA line), so it's possible that SDA is getting pulled down and then nothing happens after that. edit: reproduced again and this time checked SDA after the lock - indeed it was low.

i’ll try to find a way to capture what happens but the problem is that it’s getting very difficult to get it to lock in the first place - which is good news really!


I know I don't have any clue what you all are doing there but I appreciate the activity on this very much. The above-mentioned part about Ansible code reminds me of something that irritated me when trying the new firmware (and the one before from last week):

After I installed the upgrade, Ansible alone worked okay (apart from the input voltage bug), but after a few crashes the voltages Cycles emits are no longer continuous but stepped. My very limited knowledge leads me to suspect that the Ansible code can change when it locks up and a power cycle becomes necessary, and I wonder whether this should be considered while troubleshooting.

i feel that you’re experiencing issues i’m completely unable to replicate-- e-mail info@monome.org and i’ll get you a replacement

I disconnected Ansible from i2c and gave Earthsea another try with the new firmware. It does not seem to skip commands anymore as it did before when clocking from teletype, but every few minutes it slows down for a few seconds, as I experienced before.

And when switching the grid between Earthsea and Meadowphysics via Switch or by hot-plugging, there is a short pause when switching back to Earthsea, and after some back-and-forth switching Meadowphysics no longer recognizes the grid or reacts to the speed knob but plays on. Still, switching the grid to Meadowphysics in this state speeds its tempo up a little bit… :neutral_face:

Looks like this (steady meadowphysics trigger to teletype with II ES.CLOCK 1):

EDIT: The lock-ups on grid switching happen on Earthsea too - at the moment (that is, since I stopped clocking it from teletype) more often than on Meadowphysics.

my first test with TT & ansible i2c communication started well but unfortunately it's totally locked up now.

script (from memory):

1:
CY.RES 1

2:
CY.RES 2

3:
CY.RES 3

4:
CY.RES 4

5:
CY.REV RAND 4

(6-8 unused)

I:
M 10

M:
P.N 0
X RSH CY.POS 1 5
P.I X

the above script seemed to be running very smoothly and without any noticeable problems for about 10 minutes. then i got excited and added 3 more lines to M:

P.N 1
Y RSH CY.POS 2 5
P.I Y

basically adding a read of cycles position 2 and using that value to update pattern 2’s index.

TT locked up real quick and became totally unresponsive. i rebooted a few times with no luck. in fact, the screen remained completely black (eek!). i pulled the i2c cable and rebooted. nothing. then i pulled all the incoming triggers (from meadowphysics) & rebooted. it came back to life. now i can use the keyboard to open other TT scenes, but if i try to open the scene shown above it will crash as soon as it receives any input - even hitting TAB on the keyboard.

/// FOLLOW-UP ///

I just tried to replicate this scene from scratch and i was able to freeze up TT with the same exact script without adding the last 3 lines to M. it seemed to be working fine at first, but i cranked up the speed of cycle 1 and it froze.

Tried again with a slower metro - 100ms - and it froze up after about 60 seconds.

/// ROUND 3 ///
another attempt, this time with a M of 250ms. all ran well for about 10 mins and then TT locked.

i’m running your script at 10ms metro, all cables patched, with all 6 lines in the metro. no crashes, running 5 mins so far.

did you update ansible as well as tt?

also-- pretty crazy scrubbing the tracker with arc knobs–

update

tt crashed after 15 minutes. wasn’t a stack overflow. ansible continued running fine. ok then.

yeah! is this asking too much of TT?

another discovery:

i was sending random position changes to random cycles on a script:

CY.POS RRAND 1 4 RAND 255

and although it worked wonderfully, i observed random leds lighting for a split-second on the arc rings. ansible cv out wasn’t affected by these glitches so it seems to be purely visual.

ok-- back to a reality check on the i2c transmission times.

8 TR.TIME
8 TR.PULSE
8 CV reads
8 CV writes

= 5.8ms transmission time. (that is a long time).

this runs reliably at 20ms metro time. but, the 20ms interval of the metro timer gets totally hosed. off by almost 5ms.

10ms is not reliable.

so, time for some hard realities, i think. with a 400khz bottleneck for the avr32, using high-ish speed polling is not practical. there simply isn't enough processor headroom to get things done.
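
rough sanity check on that number (back-of-envelope only-- it assumes roughly 4 byte-frames per write and about double that per read once you count the repeated-start turnaround):

// rough i2c budget at 400 kHz - estimates only; real transfers add start/stop,
// clock stretching and driver overhead on top of this
double estimate_ms(void) {
  const double frame_us = 9 * (1e6 / 400000.0);  // 8 bits + ACK ~= 22.5 us
  const double write_us = 4 * frame_us;          // addr + ~3 payload bytes ~= 90 us
  const double read_us  = 8 * frame_us;          // write phase + repeated start + read ~= 180 us
  // 8 TR.TIME + 8 TR.PULSE + 8 CV writes = 24 writes, plus 8 CV reads
  return (24 * write_us + 8 * read_us) / 1000.0; // ~3.6 ms of raw bus time; the rest of the
                                                 // measured 5.8 ms is per-transfer overhead
}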

i’m going to think about this, and i hope you guys can think along these lines also-- let’s try for a logical assessment of what’s possible.


this is a visual defect bug, i’ll track it down when i have a spare second (and feel like the functional stuff is ironed out). previously reported by @Leverkusen

may be too much, polled at 10ms. but of course, polling slower than this will introduce quantization. we’re coming up against a technical constraint, so it’s time to choose musical constraints that fit within the technically possible.


i’m not certain, also, that we can completely crash-proof tt. certain scripts are going to be possible which will take the system down. e.g., L 0 V 10 : CV 0 I takes almost a full second to execute. stick that in a metro with a time shorter than 1000ms and bonk
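
rough math on why that one is so slow (assuming V 10 is roughly 16000 in internal units and each CV write costs somewhere around 60 us of bus time, per the i2c numbers earlier in the thread):

double loop_s = 16000 * 60e-6;  // ~16000 i2c CV writes x ~60 us each ~= 0.96 s,
                                // right where "almost a full second" lands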

short of revising the event system and tracking down every crash scenario, we might need to just accept that user error (or overconfidence) might break tt in its current hardware implementation.

I haven’t really been keeping up with this thread, but… if there is always going to be a danger of locking up the TT, would it be wise to enable a watchdog timer of some sort (I’m assuming one is available)? That way the TT just reboots rather than forcing a power cycle of the whole case.
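
A minimal sketch of the idea, assuming the ASF UC3 watchdog driver where wdt_opt_t just carries a microsecond timeout - the 1 s value is a placeholder, and this sidesteps the question of preserving state across the reset:

#include "wdt.h"   // ASF UC3 watchdog driver (assumed available in the TT build)

// enable a ~1 s hardware watchdog at startup (placeholder timeout)
void watchdog_init(void) {
  wdt_opt_t opt;
  opt.us_timeout_period = 1000000;   // timeout in microseconds
  wdt_enable(&opt);
}

// call this once per pass of the main event loop; if the loop ever wedges
// (e.g. an i2c wait that never returns), the WDT resets the module instead
// of requiring a power cycle of the whole case
void watchdog_feed(void) {
  wdt_clear();
}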


yeah, i’ve been thinking that even if/when we resolve the i2c issues the next big one will be timing. doing things synchronously helps to fix weird i2c issues but at the cost of timing accuracy. this is unavoidable unless we have a dedicated CPU for each task, but we don’t, so we need to consider this constraint (and this is where i’d like to reiterate again that we are speaking about more extreme use cases, and TT is already incredibly capable as it is).

with that some things to consider:

  • timing will never be 100% perfect, but we could try to get it as close as possible
  • what is more important: tighter timing or not skipping commands? if M execution takes longer than the metro interval, or if a trigger comes in before a script has finished executing, should we skip the rest of the script or skip the trigger? perhaps this could be configurable (see the rough sketch after this list). and if you choose the option of not executing scripts fully and a script is cut short, we could indicate it on both the live and script pages with an icon of some sort - then you’ll know when you’re hitting the limits and can adjust your scripts or your trigger/metro rate.
  • we could also limit the number of commands per script, but this feels like the least ideal option - to make limits work they’d need to account for the worst-case scenario, and it feels like an unnecessary limitation.
  • regardless of the implementation i think we should try to get it to the point where it’s able to recover gracefully.
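
to make the skip-script vs. skip-trigger idea concrete, a purely hypothetical sketch (none of these names exist in the firmware, and it assumes trigger/metro events can arrive while a script is still executing):

extern void run_script(int script);   // assumed existing entry point

// hypothetical names only - nothing here is real teletype code
typedef enum { SKIP_NEW_TRIGGER, CUT_SCRIPT_SHORT } overrun_policy_t;

static overrun_policy_t policy = SKIP_NEW_TRIGGER;
static volatile bool script_active;   // set while a script is executing
static volatile bool overrun_flag;    // would drive the warning icon on live/script pages

void handle_script_event(int script) {
  if (script_active) {
    overrun_flag = true;              // we hit the limit either way - show the icon
    if (policy == SKIP_NEW_TRIGGER)
      return;                         // keep the running script intact, drop this event
    // CUT_SCRIPT_SHORT: the interpreter would also poll a flag between
    // commands so the running script can bail out early (not shown)
  }
  script_active = true;
  run_script(script);
  script_active = false;
}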

regarding that last point, i’ve been reading up on i2c recovery and it looks like there are drivers that account for a possible i2c lock (which by the descriptions sounds very similar to what we see: SDA is pulled down and is not released, likely because the slave is waiting for SCL) and have mechanisms to recover from that, usually by implementing a timeout and getting the bus into a healthy state again. the ASF implementation does not seem to have that. we could implement something similar but this means tinkering with the ASF code, and i don’t know if we want to open that can of worms…
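
for reference, the recovery sequence i’ve seen described most often looks roughly like this - a hand-rolled sketch, not ASF code; it assumes we can temporarily drive SCL/SDA as plain GPIOs, and PIN_SCL / PIN_SDA / delay_us are placeholders:

#define PIN_SCL 0   // placeholder - actual TWI SCL pin
#define PIN_SDA 0   // placeholder - actual TWI SDA pin

// generic i2c bus clear: if a slave is holding SDA low, clock SCL until it
// lets go (9 pulses covers a full byte + ACK), then generate a STOP condition.
static bool i2c_bus_clear(void) {
  gpio_enable_gpio_pin(PIN_SCL);   // take the pins back from the TWI peripheral
  gpio_enable_gpio_pin(PIN_SDA);

  for (int i = 0; i < 9 && !gpio_get_pin_value(PIN_SDA); i++) {
    gpio_clr_gpio_pin(PIN_SCL);    // pulse SCL low...
    delay_us(5);
    gpio_set_gpio_pin(PIN_SCL);    // ...then high, letting the stuck slave shift out
    delay_us(5);
  }

  if (!gpio_get_pin_value(PIN_SDA))
    return false;                  // still stuck - give up

  // STOP condition: SDA low -> high while SCL is high
  gpio_clr_gpio_pin(PIN_SDA);
  delay_us(5);
  gpio_set_gpio_pin(PIN_SDA);
  delay_us(5);

  // then hand the pins back to the TWI peripheral and re-init it (not shown)
  return true;
}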


honestly i don’t need to scrub the patterns that fast for my musical needs. most often i’m using the patterns for quantization. if i want un-quantized scrubbing i’ll use the ansible cv outs. that said, i was still able to lock up TT polling at 250ms.

@scanner_darkly - I was able to put a timeout in the infinite wait loop in the ASF library to bail out of unanswered commands. I was not able to figure out the series of voodoo commands required to restore the bus to a happy state. I tried a number of things there but wasn’t able to get it working. I’m just not familiar enough with i2c and with the ASF library to see the answer.
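
What I changed is roughly this shape (paraphrased from memory, not the actual ASF source; the status-bit name may differ):

#include <avr32/io.h>
#include <stdint.h>

// the original ASF-style wait spins forever if the slave never answers:
//   while (!(twi->sr & AVR32_TWI_SR_TXRDY_MASK));
// bounded version - bail out instead of hanging the whole module:
static int twi_wait_txrdy(volatile avr32_twi_t *twi) {
  uint32_t budget = 100000;                        // arbitrary spin budget
  while (!(twi->sr & AVR32_TWI_SR_TXRDY_MASK)) {
    if (--budget == 0)
      return -1;                                   // timed out; caller aborts the transfer
  }
  return 0;
}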

I do think we need to implement something like that to keep users from mis-typing a command and losing their work. With no Ansible connected, hitting CV 5 will lock your TT and you will lose all of your unsaved work. This is a pain in the butt for a studio session and a nightmare if you are live coding in a performance.

An easy way around this would be to “register” bus devices in the TT. The TT stores that registration to flash memory and then bounds checks any addresses at the time of the command - skipping devices that are unknown. Registration and deregistration would be a one-time affair.
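
Something along these lines (entirely hypothetical - none of these names exist):

#include <stdint.h>

extern bool ii_tx(uint8_t addr, const uint8_t *data, uint8_t len);  // assumed existing transmit

// hypothetical device registry kept in flash - addresses the user has "registered"
#define MAX_II_DEVICES 8
static uint8_t registered_addr[MAX_II_DEVICES];
static uint8_t registered_count;

static bool ii_addr_known(uint8_t addr) {
  for (uint8_t i = 0; i < registered_count; i++)
    if (registered_addr[i] == addr) return true;
  return false;
}

// called before every II transmit; unknown devices are skipped instead of
// leaving the bus waiting on a device that will never answer
bool ii_tx_guarded(uint8_t addr, const uint8_t *data, uint8_t len) {
  if (!ii_addr_known(addr)) return false;   // silently skip (or flash an error icon)
  return ii_tx(addr, data, len);
}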

Even if we do that type of thing, the aforementioned timeout and recovery would also be a good idea for the ecosystem so that it is more resilient.

Around timing, I’ve been able to pummel the bus with write commands without problem. 24 Triggers and 24 CV outputs writing values at 10ms intervals with only slight timing discrepancies. I put that in the “wow” category for what the TT and its 400kHz bus can do.

The reads are another story. They are slow and eat up bus time. I’ve avoided supporting read operations for the TXo for this reason; you don’t get a value from TO.CV 1 as it doesn’t bother calling to the device to get its state. I always figured that if this was desired, it would be far easier to cache the value on the TT and simply return it from there.
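
i.e., something like this (hypothetical sketch, not actual TT/TXo code):

#include <stdint.h>

extern void ii_tx_cv(uint8_t ch, int16_t value);  // assumed existing i2c write

// cache the last value written to each TXo CV output on the TT side,
// so a TO.CV getter never needs to touch the bus
static int16_t txo_cv_shadow[4];

void txo_cv_set(uint8_t ch, int16_t value) {
  txo_cv_shadow[ch] = value;          // remember what we sent
  ii_tx_cv(ch, value);
}

int16_t txo_cv_get(uint8_t ch) {
  return txo_cv_shadow[ch];           // no bus read - just return the shadow copy
}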

For the TXi, reads are unavoidable. One level of caching would be to remember values for the duration of a script so that subsequent calls for the same parameter would return from memory as opposed to another bus read. A second level of caching would have us move reads to a lower priority background thread and have it poll at a safe and reasonable rate (and for a single device at a time as opposed to input by input). Scripts can then read at any rate they want without causing havoc on the i2c bus and the values can update in the TT without affecting timing.
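
Roughly what I have in mind for that second level (again, just a sketch - none of these calls exist today):

#include <stdint.h>

extern bool ii_rx_all(int16_t *dest, int n);  // assumed bulk-read bus call (hypothetical)

// background poll: one low-priority timer reads a whole TXi at a time at a
// safe rate; scripts read the cached copy instantly and never hit the bus
#define TXI_CHANNELS 8                        // 4 IN + 4 PARAM per expander
static volatile int16_t txi_cache[TXI_CHANNELS];

// called from a slow timer (e.g. every 50 ms), outside script execution
void txi_poll_task(void) {
  int16_t values[TXI_CHANNELS];
  if (ii_rx_all(values, TXI_CHANNELS))
    for (int i = 0; i < TXI_CHANNELS; i++)
      txi_cache[i] = values[i];
}

// what the TI read ops would return inside a script
int16_t txi_read(uint8_t ch) {
  return txi_cache[ch];
}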

Lots of options here - thoughts?

this just isn’t happening for me. i can have a standalone TT, no cables attached, and attempt to read/write via i2c and it aborts just fine?

do they work in non-metro territory? do reads work to ansible correctly on your setup? (mine do, at this point).

this feels like it falls into the bad-user-script category. i think any caching would make operations more murky-- meaning you’d potentially be unsure if you were getting a live or cached value-- all for the sake of trying to prevent bad user scripts.

having a constant-poll system isn’t a bad idea, but it requires management as to what values the user “cares” about. so you’d need some sort of INPUT.ACT status bits, like metro being on or off. i feel that this should also simply be managed by the user. particularly once the v2 feature AUTOTRIG gets implemented, where we have a timer per input channel, hence are less bounded by the existence of only one metro for user processes.


so at this point, is reading working at all for you?

this just isn’t happening for me. i can have a standalone TT, no cables attached, and attempt to read/write via i2c and it aborts just fine?

This is a given for me. If I try to send to an II device that isn’t connected, the TT locks up. This has happened for me with my expanders and with the Ansible. If you disconnect a device and forget that the current preset has calls to that device, you have to reconnect the device or re-flash your TT in order to get back in.

do they work in non-metro territory? do reads work to ansible correctly on your setup? (mine do, at this point).

Reads now work both in Metro and non-Metro environments at this point if I have 3 or fewer devices connected. More than that and they become unreliable - locking up the TT after an indeterminate number of executions. I’m planning to experiment more with this in the evening as the additional devices are my expanders. (I’m going to hard-set their i2c rate to 400 and see if that helps; I had been letting them autonegotiate down.)

IN and PARAM polling / caching

Yeah - I’ve struggled with this conceptually quite a bit as I try to think around my read difficulties. What is happening now is clean and clear and nice. Completely under the control of the user writing the script. I don’t think we would need to go to the extent of caching and polling unless we just don’t get the performance we want when reading from multiple sources at a reasonable rate.


re: i2c - i saw the sequence for getting the bus back into a healthy state somewhere, i’ll post it if i find it. i think addressing a non-existent device is a separate issue and should be easy to fix (since in this case there’s no ACK coming from a slave whatsoever, vs a slave keeping SDA down).

pretty sure i was able to get TT to lock when addressing a non-existent CV too. also when trying CY.RES i managed to get TT to lock immediately - i think this was because ansible was in the TT mode and not Cycles, but not 100% sure since i only had time for a very quick test. i’ll check again when i get a chance.

so what about the idea of interrupting a script if it gets triggered again before it finishes executing, and indicating it with a flashing icon or something?


fyi, ansible changes i2c addresses based on what mode it’s in. so these are potentially the same issue.

that’s probably a rational thing to check out, good call
