Updated and testing now. Thanks for pushing out a test firmware so quickly! I’ll report back if I get a crash. 30 minutes has been my threshold for calling it stable. If I make it past that, do you want me to run it until I experience the same Metro and keyboard bugs that you and @tehn are experiencing?

ok, i’m running the same test without i2c now.

1 hour 15 minutes in: no crash yet. This is great progress :smile:
I’ll report back when the Metro or keyboard bugs manifest themselves.

2.5 hours in. No keyboard or metro issues yet.

ok i got a non-i2c metro “crash”

again fixed with a M.ACT cycle

Alright, 4 hours without an issue. I’m going to call it stable on my end and stop wasting electricity =)

Excellent. I’m back in the midst of the school holidays now that the long Easter weekend is over. I should get an hour or two later today to merge the i2c fixes into my 2.0 branch, post a new beta with those changes, and open up a PR.

Once the PR is merged we can have a look for the keyboard and metro bugs. I wasn’t expecting them to be reproduced, but now that they have been, my gut feeling is that it’s some kind of integer overflow bug. Possibly in the timer code itself, possibly in some refactor I’ve made. I just had a quick look at the code, and hold_key_count also looks like it has the potential to overflow.


@trickyflemming if you get a chance, do you want to try increasing the complexity of the script? Maybe use the Jumpy Edges script?

i doubt hold_key_count is the cause: even if it did overflow, it would just continue incrementing. my feeling is that this has something to do with timers. i thought perhaps the timer generating events for handler_keyTimer becomes corrupted, but it also handles the front button presses, and it sounds like the front button still responds when the keyboard and metro crash.

there is, however, a separate timer used to generate events for handler_HidTimer. maybe it becomes corrupted somehow, and the metro timer as well?

i wonder if ticksRemain overflows when it’s decremented, i don’t see how it’s possible but who knows, maybe there is some weird race condition… one way to prevent it would be to check if it’s greater than ticks after this line:

@sam could you give this a try? if this is indeed what happens it would result in the timer set to a really long delay which would explain the behaviour…

i really like the idea of separating critical events into their own queue.

… timers becoming corrupted would also explain why M.ACT fixes the issue

Yeah, thinking about it, if we’ve got problems with both the metro script and the keyboard, then the timers are the obvious suspect, since both use them.

I’ll try and have a look this week. I think I’m going to want to clean the code up a little before I dive too deep (delete the comments, try and improve the readability a little).

One other thing, maybe change `if(t->ticksRemain == 0) {` to `if(t->ticksRemain <= 0) {`, just in case it does end up below 0. (edit: actually should probably check if ticksRemain is signed or not first!)

unsigned, so something like `if (t->ticksRemain == 0 || t->ticksRemain > t->ticks)`
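
a minimal sketch of that guard, for illustration only. the struct and the two timer_tick_* helpers are hypothetical names, not the firmware's actual code, and the sketch assumes the counter is decremented before it is checked. since ticksRemain is unsigned, a `<= 0` test behaves exactly like `== 0`: decrementing from 0 wraps around to a very large value rather than going negative, which is the case the second condition is meant to catch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified soft timer. The field names follow the posts
 * above, but this is NOT the firmware's real struct. */
typedef struct {
    uint32_t ticks;       /* reload period, in ticks */
    uint32_t ticksRemain; /* ticks left until the timer fires (unsigned) */
} soft_timer_t;

/* Unguarded check: fires only when the counter lands exactly on zero.
 * If ticksRemain is already 0 when we decrement, it wraps to UINT32_MAX
 * and the timer will not fire again for roughly 4 billion ticks. */
static bool timer_tick_exact(soft_timer_t *t) {
    t->ticksRemain--;
    if (t->ticksRemain == 0) {
        t->ticksRemain = t->ticks; /* reload for the next period */
        return true;
    }
    return false;
}

/* Guarded check: also treats "more ticks remaining than the period"
 * as a wrapped counter, so the timer fires and reloads instead of
 * stalling. */
static bool timer_tick_guarded(soft_timer_t *t) {
    t->ticksRemain--;
    if (t->ticksRemain == 0 || t->ticksRemain > t->ticks) {
        t->ticksRemain = t->ticks; /* reload for the next period */
        return true;
    }
    return false;
}
```

with a period of 10 and ticksRemain somehow already at 0, the unguarded version ends up waiting on a counter of UINT32_MAX, while the guarded version fires immediately and reloads to 10. that matches the "really long delay" theory, but it doesn't explain how the counter would get into that state in the first place.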

still feels like the probability of this happening is very low, especially for just these two timers while all the others seem fine, but it wouldn’t hurt to protect the timer code a bit. another option to confirm it’s timer related would be to restart both of these timers when the front button is pressed; then, once it gets into this state, press the button and see if it helps.

I ran the Jumpy Edges + Metro script this morning for 1.5 hours without issue. It’s looking pretty stable on my end.
