Aleph refactoring and integration

because it has been so long, i’m just gonna start yet another thread drawing from
http://llllllll.co/t/monome-modular-firmware-on-aleph/
http://llllllll.co/t/github-reorganization
http://llllllll.co/t/module-aleph-code-sharing

i am finally doing more work on the aleph. sorry i’ve been so ineffective lately (2016 has been… not the coolest year ever.)

in the meantime, @rick_monster has done a huge amount of great work in his fork.

i’m finally getting around to finishing the migration of aleph codebase to use libavr32. as @sam kinda pointed out in one of those earlier threads, it’s not exactly obvious where to go from there, but even a start on sharing driver code and eliminating repetition will make me more comfortable.

after that, i really want to get serious about adding some of the many proposed and promised features to, DSP, control, offline tools, and developer ergonomics.

anyway, i want to have this thread as a place to ask questions / advice / clarifications. rick has a big net refactor inbound, and between that and the libavr32 migration things will get a little sticky. i’m butting heads a little with some of the factoring decisions made by @tehn in libavr32, and i’m hoping @sam will be able to give a little guidance on git details. so here we go…

11 Likes

first question, for @tehn

in libavr32 monome driver, it looks like arc functions have been removed, and grid functions have been renamed from e.g. monome_grid_led_set to monome_led_set. do none of the modules work with arc? or has the functionality just gone somewhere else? looking at the implementation, it seems like grid and arc really do have different bit-level protocols, so the same functions won’t work for both. am i missing something? can i just add the arc stuff back?


relatedly, for @sam:
let’s say i do need to change something in libavr32 that impacts module code. (i’ve actually made one such very minor change here, renaming the DAC chip select pin define from DAC_SPI back to to DAC_SPI_NPCS for consistency. this isn’t just cosmetic; aleph has two independent SPIs, so FOO_SPI kinda needs to refer to the SPI controller for FOO, leaving FOO_SPI_NPCS to explicitly indicate the chip select pin.)

i have my own forks of libavr32 and all the module repos. should i open a tracking PR for each repo, containing all changes arising from the aleph integration? or would you suggest a better way?

i ask because the workflow i came up with seems just a little clumsy:

  1. in my local aleph repo, make changes to aleph code and libavr32 submodule code
  2. push changes to my aleph and libavr32 forks
  3. in (say) my local teletype repo, pull changes from my libavr32 fork (which TT’s libavr32 submodule is configured to point at.)
  4. make changes to TT code and push to my TT fork.
  5. repeat 3 and 4 for other impacted modules.

and of course for upstream integration each repo should have a branch with the same name containing all these changes for a tracking PR.


finally a question for @rick_monster
do you think your net refactor is close to being PR-ready? i’m guessing it will make most sense to do that first, then deal with the lib refactor on top of it. that leaves the integration work on me, which is probably as it should be.

so I had two tries at fixing problems with bees.

The first was super-ambitious and basically amounts to a rewrite. branch is called net-refactor. I want to pick it up again, but it’s nowhere near pull-request ready and may not be salvageable. Don’t stall anything else for that - we will deal with a hairy/scary merge if and when…

The second attempt was more like ‘fix the usability problems with bees without tearing everything to bits’. That, I think is pretty pull-request ready, but I’m wary there may be some very occasional intermittent bugs or something. However this version has been installed on my aleph & I have been using it more than any other BEES version (I always got bored rebuilding things without arbitrary insertion/deletion features). The new functionality there was:

  • new memory management (can’t remember now if it uses two memory pools for big/small ops, or malloc/free & fragmentation be damned)
  • arbitrary op deletion/insertion

The above stuff should be on https://github.com/rick-monster/aleph/tree/net-use-malloc (need to check). Feeling bold, gonna open a PR right now for that branch…

I also added:

  • new ops like mem1D, mem2D
  • changing the grid & life ops based round x/y & status bangs

these last changes should go in a separate pull-request, since they may not be viewed as enhancements. Kind of gives you more rope to hang yourself creating BEES rube-goldberg contraptions, dunno if that’s a good thing or not! Well I had fun devising those ops anyway. think these are on the branch called grid-refactor with these changes. Let’s get the net-use-malloc merge out of the way first.

The BEES version I’ve been running/testing is https://github.com/rick-monster/aleph/commits/grid-refactor.merge.net-use-malloc IIRC. If I was being super-careful & organised it should be a clean merge of the 2 sets of changes…

1 Like

awesome, thanks, makes sense
looks like net-use-malloc uses the bigOpPool / smallOpPool approach, with those pools being heap-allocated once.
i’m gonna stare at this a bit more and try to wrap my head around the implications, maybe follow up with specifics on github

arc stuff is still there, i just switched orca to use libavr32 as a submodule and am able to build successfully with libavr32 aaf4b7075846de5c8758260adfb54d2ececc1122 and can confirm it works with both a grid and an arc. ansible also uses libavr32.

monome_led_set updates monomeLedBuffer and monome_grid_refresh or monome_arc_refresh writes it to the device (if i understand correctly)

which function to call is based on this:

and refreshFuncs is an array of the 2 function pointers

i just write directly to monomeLedBuffer and set monomeFrameDirty which i suppose should be faster when updating several leds at once.


good to hear this is happening!

thanks! sorry, i should have spotted it, there’s just ring_map now.

there might be a misconception at work here. monome_grid_let_set and monome_arc_led_set in the original lib are convenience functions; they update the LED buffer and set the appropriate frame-dirty bit, just like you’re doing in your app code, and the actual refresh is always a full frame using the led/map serial command (the led/set serial command is never used.) there should be no speed difference, and personally i think it’s cleaner to not require apps/operators to manipulate the low-level frame data.

so… unless there’s an objection, i’m gonna just re-add the wrapper for arcs as well as grids, so that aleph app code doesn’t have to be rewritten to twiddle the frame buffer for arcs.

that said, the frame/frame-dirty buffers are externally visible to app code for a reason. it’s so that an app, if desired, can maintain multiple copies those buffers, and switch states on the fly, or do other weird corner-case stuff. but normally this should not be necessary.

(just to clarify… for what it’s worth, i did author that code… and the header comments, which are worth reading :wink: i was just wondering if there’s a specific reason for the restructure, but it doesn’t seem so? still would like to hear from @tehn; maybe we’re both missing something - like my original implementation was broken because i don’t actually have an arc.)

Breaking libavr32 changes…

@zebra, yes it’s awkward. But on the flip side, it’s code used across 6 different products (7 if you count Orca), so a bit of awkward is to be expected. I think there are 2 ways to go about it.

The first is as you stated.

  1. Make libavr32 changes and push them to catfact/libavr32.
  2. Fix each module, best to keep the change atomic, so change the code and update the submodule commit (git add libavr32) in the same commit.
  3. Open a PR again monome/libavr32. Merge it.
  4. Open PRs against each of the modules.

One point, if you want to pull the libavr32 changes from aleph to teletype you don’t have to go via GitHub, you can pull and push across the filesystem instead.

The other option is to not update the modules, but instead defer those changes until such a time as a libavr32 update is required. At the moment there is no reason why (e.g.) White Whale needs a new version of libavr32, maybe it will when an update to the ii protocol comes along. It would probably be a nice courtesy to open up an issue in each module’s repo though.

Either way, I think adding a CHANGELOG.md file to libavr32 would be a very good idea, drawing a lot of attention to breaking changes, also best to include the word breaking in the commit message somewhere.

Finally, I’d humbly suggest that if you’re going to make a breaking change it would be best done in isolation from any other changes, you should be able to create a new branch from upstream master and chery-pick the commit.

Travis CI

I’m not sure if you’re aware but monome/libavr32 has a .travis.yml script set up that checks the master branch of each module against any changes made. If you log in to travis you can enable it on catfact/libavr32 too. Then you’ll get notifications of build failures in the other modules due to any changes made, you may or may not find this useful. We could add Aleph to the list of build checks once you’re ready? (I need to add Ansible too.)

asf

I’ve noticed you’ve made some changes in the asf folder, would it be possible to have a think if that’s the right thing to do. I wrote a script to generate the contents of that folder from Atmel’s zip file (and make any patches). The thinking was that it would be easy to update if a newer release of the ASF came along (or if we want to add any extra files). Now I admit it’s pretty unlikely that Atmel will update the AVR32 side of the ASF, but I think we should formally acknowledge if directly editing the contents of the asf folder is okay.

My feeling is that it would be better to update the diet-asf scripts and use that, but I have zero responsibility towards any of this, so I will defer to yourself and @tehn and try to be as helpful as I can however you want to go about it.

2 Likes

thanks for the feedback, very helpful indeeed

breaking libavr32 changes

it’s akward

didn’t mean to complain! just wondered.

push and pull across filesystem

ah, what a good idea. i’ve never done that; assume it’s accomplished by adding an additional “remote.”

defer changes

this crossed my mind, but it simply seems less confusing to keep everything looking the same as long as it’s straightforward, and having a tracking request open seems helpful for transparency.

i will follow your other suggestions for sure. (add changelog, rebase for atomicity)

travis

i saw that… there’s something about using a remote service for this that bothers me a bit - security is one concern when the service has filesystem access… and it seems like overkill for this simple use case. but the unit tests are nice… i guess it’s time to bite the bullet, sign up and join the modern age. i digress.

asf

right, thanks for reminding me that the asf is auto-generated. i agree, probably a good policy to make any truly neccessary changes in asf-diet scripts and never locally. in this case, can probably work around instead of making changes at all.

2 Likes

I only mentioned it as I can see a point in time when WW, ES and MP are no longer updated, unless it’s for ii compatibility.

Travis doesn’t have access to your local filesystem. The security issue discussed (I think) was more about an attacker getting hold of secrets that you’ve given to Travis (we’re not doing that).

There are some unit tests that @ngwese wrote, but also some acceptance tests (i.e. does any change to libavr32 break the compilation of a module’s code).

The main benefit of Travis is that it gives some assurances to others that you haven’t screwed up something that affects other modules, and you get that benefit when we make changes too. I also think it can be something that helps newcomers (assuming they know what CI is), it can be nice to know that any changes you’re proposing don’t unintentional screw things up.

Keeping the ASF folder unchanged might also help with potential future ports to other ASF supported CPUs (e.g. the ARM based SAMD), when I was writing the diet-asf script I got the distinct impression that all the 32bit targets had very similar libraries and APIs available.


Slightly OT, but I think I saw on another thread that the 200e used the AVR32 too? Did you end up writing some of that code?

1 Like

don’t want to harp on this tangent too much… but in the referenced issue, there was a vulnerability that allowed code injection in the YAML parser (maybe.) true, the main concern there was database access… but the same parser ultimately returns the shell script that you’re executing, right? it’s not a serious or specific concern, but i’m generally wary about executing something delivered by a remote service… that’s pulling in a big stack of RubyGems… &c. but yeah, not a big deal.

when I was writing the diet-asf script I got the distinct impression that all the 32bit targets had very similar libraries and APIs available.

good point… we should try to keep our ASF use as clean as possible, though i think e.g. migrating to SAM would still be pretty painful. i’m just having the damndest time getting the sd_mmc stuff to work “out of the box” right now. some kind of order-of-inclusion weirdness… it’ll get there.

200e

Yea, I did the 251e, some maintenance stuff for other modules, and different things here and there on the earlier 8051-based modules.

3 Likes

No, they’re executing it on their servers. Nothing runs on your computer.

1 Like

ohhhh… that’s much less freaky. kind of amazing that they have the server resources to do that for free. all this time i thought it was just acting as a wrapper for github web hooks. TIL!

this is very reasonable, i apologize for the trouble. i need to abstract all SPI routines out of the module code itself and have them live in the libavr32 only. i did this (i think) for ansible, so the others should be a fast update for me.

ok re grid/arc code:

i’m not sure what my justification was at the time-- beyond familiarity with manipulating the array given there’s such a clear connection between the array and the grid interface. is there really no speed difference when calling functions?

the other part was i wanted to be able to “build” a whole frame before setting a dirty flag, so the frame is never half-drawn.

i totally agree that the code is cleaner with wrappers.


as a sort of across-the-board apology for code decisions-- i’m definitely always learning as i go. unfortunately some decisions get made or sloppiness let slide due to being completely overstretched running a very small operation-- and that’s not an excuse-- it’s something i have to remind myself and figure out how to manage things better to allow for better code and engineering and production and documentation-- basically everything.

this is to say, thank you all for your generous contributions to making things sane and interesting.

4 Likes

check out the definition for monome_grid_led_set … this is a tiny function that just updates the frame buffer and frame-dirty flag for you. its just common sense to separate that level of bookkeeping from the application logic. i haven’t benchmarked or disassembled, but i can’t imagine the overhead of the function call is significant, and it may well be inlined away by gcc in any case.

because they just manipulate the frame data, the high-level accessor functions are protocol-agnostic. that’s why they’re not assigned in the protocol function table posted above - those functions are the ones dealing with the actual serial data, and they need to be assigned on the fly depending on what device type is detected.


i’m definitely learning too. that’s why i’m here, so sam can correct my dumbass assumptions about how CI services work! and so on.

1 Like

apologies - misinterpreted your question, didn’t mean to explain your own code to you! :slight_smile:

so you think the difference between calling the function in a loop where you update all 256 leds vs updating the buffer directly should be negligible? perhaps it’s just a matter of avoiding making “wasteful” updates like that but it does simplify things sometimes.

another reason to manipulate the dirty flag would be wanting to have a control on when the refresh happens, but this could also be handled by removing it from set functions and having a separate call i guess.

swapping buffers is not supported right now anyway, wouldn’t you have to update the existing functions to use a pointer instead of the buffer variable? actually, could abstract this too, add a 2nd parameter to set which would be the buffer number, and then add a function to switch between buffers.

but sharing code between aleph and modules is a good argument in favour of not using the buffer/dirty flag directly as the set functions could be (already are?) exposed as ops in bees? this would make porting the code much easier i assume?

edit - some of the questions answered while i was typing this

arrays in C are sugar for pointers. if you needed to swap states in an app, could do something (roughly) like:

#include "monome.h"
u8 localLedBuffer[MONOME_MAX_LED_BYTES];

void storeLedState() { 
    memcpy(localLedBuffer, monomeLedBuffer, MONOME_MAX_LED_BYTES] 
}

void recallLedState() { 
    memcpy(monomeLedBuffer, localLedBuffer, MONOME_MAX_LED_BYTES] 
}

ok, now that i look at it, the ‘raw’ grid operator does manipulate the buffers directly, for no good reason. gah!

i thnk the sequence of events went something like:

  • feverishly coded up monome.h/.c
  • made the raw grid operator. it works, hooray, move on to the next issue.
  • (months pass)
  • add arc support. think, “this code is gross, better add accessor functions.”
  • use accessors in the arc op.
  • totally forget to actually update anything to use the grid accessor functions. (or there was some partial cleanup that never made it into the master branch for whatever reason.)
  • all the module code copies my dumb pattern from the raw grid op and not the cleaner pattern from the arc op.

so, the confusion is completely my fault. i may as well do the work of cleaning things up.

none of this actually a big deal, but wanted to touch base since it’s been so long and communication has been kinda spotty.

another reason to manipulate the dirty flag would be wanting to have a control on when the refresh happens

i guess that’s one way to do it. but the call to refresh is already separate, and it’s triggered from the event loop. in bees (and modules, looks like) the kEventMonomeRefresh event is raised by a timer. the “idiomatic” way to perform refreshes arbitrarily is to raise that event (and not set the timer.)

I’m not sure if the version of GCC that we have on the AVR32 supports link time optimisation (LTO), normally C compilers can’t inline across compilation units. If it does then it would be best to let the compiler decide. If it does’t support LTO, or you want to force GCC’s hand you can place the function entirely in a header and use static inline, e.g. from teletype:

(The inline is optional, but you must have static otherwise you end up with duplicate symbols.)

If you Google “c static inline function in header” you get quite a few hits. I’m not sure how idiomatic it is, but to me it feels nicer than using macros.


:blush:

for swapping states i meant maintaining 2 or more buffers internally but letting the app specify which buffer it updates and which buffer to use for refresh. in this case you don’t need to memcpy and you can still hide the implementation details from apps.

refresh - yeah, makes sense, it’s a cleaner way to trigger a refresh.

it should be a fairly easy / straightforward to update firmwares to use the set functions and kEventMonomeRefresh which hopefully should get us a bit closer to sharing apps between modules and aleph. i should be able to contribute if we decide to do this.

re: inlining
from the gcc docs:

The compiler performs optimization based on the knowledge it has of
the program. Optimization levels -O2 and above, in
particular, enable unit-at-a-time mode, which allows the
compiler to consider information gained from later functions in
the file when compiling a function. Compiling multiple files at
once to a single output file in unit-at-a-time mode allows
the compiler to use information gained from all of the files when
compiling each of them.

tbh, this doesn’t really help me understand how “unit-at-a-time mode” works… i guess it means it can inline across files if all the sources are compiled to a single object, rather than compiled to separate objects and subsquently linked… so yeah you’re right, seems best to explicitly inline when it’s definitely wanted. for gcc, an alternative to using the inline keyword / putting the body in the header, is to use __attribute__((always_inline)) either in the declaration or the definition. (right?)

i thought modules are already using the Refresh event:
[ https://github.com/monome/ansible/blob/master/src/ansible_grid.c#L88 ]

the per-quadrant dirtry flags are just to avoid sending data for more frames than are necessary on each refresh… but honestly this might be one of those “optimizations” that actually causes more hangups than it solves.

hm i see what you mean… i guess one question is, does anyone actually want to do this in an embedded application? and, is it worth complicating the interface to optimize it? we don’t want to force e.g. every operator in bees to allocate its own LED buffer.

my thinking was that it would be a relatively rare operation to want to swap the whole buffer state; something that happens at the level of switching applications or modes, not like, hundreds of times per second… copying 256 sequential bytes between SRAM locations really doesn’t take very long.

but a compromise could be to continue to have the driver maintain its own buffer (and even make it static to that compilation unit), but also add functions telling the driver to point at an external buffer instead?

anyways… this feels a bit like bikeshedding now. i just want to get aleph building and running with libavr32. so for now, my plan is:

  • re-add the accessor functions (and maybe force inlining on them)
  • clean up comment cruft and other crap in the driver
  • change all the bees ops to use the accessors
  • leave the buffers externally visible and leave the module code as it is.

Since we’re on the general topic of grids, seems a good time to air my thoughts on ‘grid refactor’ changes:

the ‘classic’ grid op is not very flexible - e.g you can’t use it to make a good bees-patch to manipulated grains’ patch-matrix.

https://github.com/rick-monster/aleph/blob/grid-refactor/apps/bees/src/ops/op_monome_grid_raw.c

my revised ‘raw’ grid op is a lot more, well… raw! It outputs x, y, value bangs (in that order) on button presses, then you bang x, y, value inputs to light the lights. I demonstrated you can use this in parallel with mem0d, mem1d, mem2d ops to build things like step sequencers ‘from scratch’ as bees patches.

No focus functionality in there - understanding better the ops like ww etc I would like to add that before pull-request. I propose to rename the ‘classic’ grid op to op_monome_classic.c, then rejig the op indexing on my branch so scene compatibility is preserved.

This only bloats the operator list a little bit. Same approach for the life op potentially (life_classic & life_raw). No good reason to ‘break userspace’ - these can be extra features - a ‘minimalist’ set of BEES ops can be another problem for another day…