Glancing at the page for the TCA6408A, one thing I’d watch out for is that it only offers two selectable I2C addresses, meaning that you’d end up needing an I2C multiplexer (and therefore much more complexity) to support three or more chips on one bus. The TLC59116-Q1 supports 14 different hardware addresses, so a multiplexer would only be necessary if you want to make a DIY 256 grid.
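To put rough numbers on the address limit, here’s a quick sketch. It assumes one LED per driver output with no matrixing (my assumption, not something from the datasheet), and the function names are just made up for illustration:

```python
import math

# Assumptions for illustration: one LED per driver output, no matrixing.
CHANNELS_PER_TLC59116 = 16   # the TLC59116 is a 16-channel LED driver
TLC59116_ADDRESSES = 14      # hardware addresses, as mentioned above


def drivers_needed(led_count, channels_per_driver=CHANNELS_PER_TLC59116):
    """Number of driver chips needed to give every LED its own output."""
    return math.ceil(led_count / channels_per_driver)


def needs_multiplexer(led_count, available_addresses=TLC59116_ADDRESSES):
    """True if the drivers outnumber the distinct addresses on one I2C bus."""
    return drivers_needed(led_count) > available_addresses


for leds in (64, 128, 256):
    print(leds, drivers_needed(leds), needs_multiplexer(leds))
```

Under those assumptions a 128 grid needs 8 drivers and fits on one bus, while a 256 grid needs 16, two more than the 14 available addresses.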
I usually get my PCBs from DirtyPCBs, who offer a significant discount for smaller boards, but even so I’d be inclined to pay slightly more for a full 8x8 board. The small extra cost is offset by the time and materials saved on interfacing boards, and by avoiding the extra opportunities for mistakes and hardware failure that those interconnects bring.
DirtyPCBs have discounted rates for boards up to 100x100mm; could 8x8 of these trellis pads just about be fitted in that space? From what I can see, a full row of 8 is (8 × 10mm) + (7 × 5mm) = 115mm, so the discounted board size would be juuust too small. I’ve bought larger PCBs from them in the past though, and the prices are very reasonable. A “protopack” of 10±2 for my MIDI hurdy gurdy system cost about 60€ for quite large PCBs, and I got 11, so <6€ per board is peanuts compared to the cost of the other parts.
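The row-length estimate is easy to play with if the pad or gap sizes turn out to be different. A minimal sketch, taking my 10mm pad and 5mm gap guesses as defaults (both eyeballed, not measured) and ignoring any edge margin the fab might need:

```python
def row_length_mm(pads, pad_mm=10, gap_mm=5):
    """Length of a row of pads: pad widths plus the gaps between them."""
    return pads * pad_mm + (pads - 1) * gap_mm


def fits_on_board(pads, board_mm=100, pad_mm=10, gap_mm=5):
    """True if a row of pads fits within the board edge, with no margin."""
    return row_length_mm(pads, pad_mm, gap_mm) <= board_mm


for pads in (7, 8):
    print(pads, row_length_mm(pads), fits_on_board(pads))
```

With those guesses a row of 8 comes to 115mm and doesn’t fit, while a row of 7 lands at exactly 100mm, which leaves nothing spare at the edges.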
If/when I have a go at this myself, I would aim for either chainable 8x8 boards plus a separate USB + MCU motherboard, or a chainable 8x8 board which, depending on which components are fitted, can act as either a main MCU+USB+8x8 board or an 8x8 daughterboard. The second option is a slightly more complex design, but should make construction easier and cheaper. I also have various other ideas though, like fitting USB MIDI and onboard 3.5mm hardware MIDI support (I don’t want to just build a straight-up open source clone of the grid without some sort of added value), so our goals might diverge.
Plenty of other DIY and work projects to work on first though!
EDIT: just looked at the BOM for @TheSlowGrowth’s Arc clone, and they’re using one TLC5940 per ring with a multiplexer so that one IC can control 64 LEDs, with brightness. Duplicating that approach might be a further argument for an 8x8 board, reducing the part cost, count and amount of fiddly TSSOP soldering.