Ah. This is finally a relevant thread for me.
After years and years of thinking about it, I recently decided to take a stab at building a system for organizing and curating my various sample libraries and collections. It’s still growing and evolving, but I think I have a pretty solid foundation.
The TL;DR is that I’m attempting to turn my sample collections into something resembling a centralized Zettelkasten, and storing it in SQLite. My approach is to manually catalogue, tag, document, and annotate content in a centralized place, and then build out solutions that can integrate with this database format.
A Zettelkasten is a means for storing and retrieving information in a very granular fashion. Well, sample libraries are like that as well. Zettelkastens usually have some sort of cross-referencing capability. Over time, this network of connections allows for interesting relationships to emerge. I’m hoping that if I keep at this long enough, these sorts of relationships and structures between samples and sounds will naturally form.
Now, for the tools and implementations.
A few weeks ago I started building a Generic Zettelkasten (the “zet”) for myself, which is part of the static wiki generator I built about a year or so ago. The Zet structure is essentially a key/value database with timestamps. Using this structure, I can make things and link things to things. Every “thing” has a UUID4 tag as the key (I use the uuid4 library by rxi to generate these). Keys don’t have to be unique, which allows a single item to have multiple attributes.
One of the “things” a thing can be is a file path, presumably pointing to a particular sample. Things can also be messages, groups, and references to other things (via the UUID address). This provides the baseline for an annotation system and a tagging system.
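To make the structure a bit more concrete, here is a minimal sketch in Python of what a timestamped key/value table with non-unique UUID keys might look like. The table name, column names, and the prefix characters used to mark messages, groups, and references are all assumptions made up for illustration, not the actual Zet schema.

```python
import sqlite3
import time
import uuid

conn = sqlite3.connect(":memory:")
# Hypothetical schema: no UNIQUE constraint on key, so one UUID can own many rows.
conn.execute("CREATE TABLE zet (time INTEGER, key TEXT, value TEXT)")

def add(key, value):
    """Attach one more timestamped value to an existing key."""
    conn.execute("INSERT INTO zet VALUES (?, ?, ?)", (int(time.time()), key, value))

def new_thing(value):
    """Create a thing under a fresh UUID4 key and return that key."""
    key = str(uuid.uuid4())
    add(key, value)
    return key

def attributes(key):
    """Return every value stored under a key, in insertion order."""
    rows = conn.execute("SELECT value FROM zet WHERE key = ? ORDER BY rowid", (key,))
    return [r[0] for r in rows]

# The '>', '@' and '#' prefixes are invented markers for message, group, reference.
kick = new_thing("/samples/drums/kick01.wav")  # a file path
add(kick, ">punchy, short decay")              # an annotation on the same key
drums = new_thing("@drums")                    # a group
add(kick, "#" + drums)                         # a reference to the group's UUID
```

Calling `attributes(kick)` then returns the path, the note, and the group reference together, which is roughly what a tagging and annotation layer needs.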
My Zet and Wiki are both stored as SQLite tables. In addition to having an expressive querying language, SQLite also has goodies like full-text search, which make it a really powerful file format.
SQLite also has a strange experimental archive format called SQLar, which I’ve been very intrigued by. I’ve built a bridge between my Zet and SQLar which is called Crate (inb4 Rust did it first). Crate makes it easy to automatically import files stored in a SQLar archive into a Zet database.
The Zet can be imported/exported using a tab-separated-value plaintext file. This allows the Zet to be managed using source control. The Zet could then be cloned onto a new computer, and then “compiled” into a SQLite database. The repo would only manage the metadata, not the samples themselves. The samples would be contained inside of SQLar archives as small self-contained collections (presumably on some external drive), and then copied into the compiled SQLite database when they are needed. Also, I’ve set things up so that TSV files can be broken up into groups. That way, I can configure things to only partially compile the database if it gets too large.
The Zet is included in my static wiki engine. My wiki can be scripted using an embedded version of the Janet scripting language, so one could conceivably use it to generate interesting presentations of the Zet. My efforts here have been quite minimal so far, as I’m mainly focused on just dumping information into the system.
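For anyone curious what that kind of bridge could look like, here is a rough Python sketch. The `sqlar` table layout and the "compressed only if smaller" convention follow the SQLite Archive documentation; the two-column `zet` table and the function names are assumptions for illustration, and this is not how Crate itself is implemented.

```python
import sqlite3
import uuid
import zlib

conn = sqlite3.connect(":memory:")
# Table layout as documented for SQLite Archive files.
conn.execute(
    "CREATE TABLE sqlar (name TEXT PRIMARY KEY, mode INT, mtime INT, sz INT, data BLOB)"
)
conn.execute("CREATE TABLE zet (key TEXT, value TEXT)")  # hypothetical, simplified

def sqlar_add(name, content):
    """Store a file; keep the zlib-compressed form only if it is smaller."""
    packed = zlib.compress(content)
    data = packed if len(packed) < len(content) else content
    conn.execute(
        "INSERT INTO sqlar VALUES (?, ?, ?, ?, ?)",
        (name, 0o100644, 0, len(content), data),
    )

def sqlar_read(name):
    """Return original bytes; a blob whose length equals sz was stored raw."""
    sz, data = conn.execute(
        "SELECT sz, data FROM sqlar WHERE name = ?", (name,)
    ).fetchone()
    return data if len(data) == sz else zlib.decompress(data)

def import_archive():
    """Give every archived file not yet in the zet a fresh UUID4 key."""
    names = conn.execute(
        "SELECT name FROM sqlar WHERE name NOT IN (SELECT value FROM zet) ORDER BY name"
    ).fetchall()
    for (name,) in names:
        conn.execute("INSERT INTO zet VALUES (?, ?)", (str(uuid.uuid4()), name))
    return len(names)

sqlar_add("kick.wav", b"\x00" * 1000)  # compresses well, stored compressed
sqlar_add("tiny.wav", b"ab")           # too small to benefit, stored raw
first = import_archive()               # both files are new to the zet
second = import_archive()              # nothing left to import
```

Making the import idempotent means it can be re-run whenever new files land in the archive without duplicating entries.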
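As an illustration of the TSV round trip, here is a small Python sketch of exporting a table to tab-separated text and "compiling" one or more TSV groups back into a database. The column order (time, key, value), the group names, and the sample rows are invented for the example and may not match the real file layout.

```python
import csv
import io
import sqlite3

def export_tsv(conn, out):
    """Write the zet as TSV in a stable order, so diffs stay small in source control."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in conn.execute("SELECT time, key, value FROM zet ORDER BY time, key, value"):
        writer.writerow(row)

def compile_db(tsv_files):
    """Build a fresh in-memory database from any subset of TSV group files."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE zet (time INTEGER, key TEXT, value TEXT)")
    for f in tsv_files:
        conn.executemany("INSERT INTO zet VALUES (?, ?, ?)", csv.reader(f, delimiter="\t"))
    return conn

# Hypothetical groups, each of which would be its own TSV file in the repo.
groups = {
    "drums": "100\tk1\t/samples/kick.wav\n101\tk1\t>punchy\n",
    "synths": "200\tk2\t/samples/pad.wav\n",
}

full = compile_db(io.StringIO(text) for text in groups.values())
partial = compile_db([io.StringIO(groups["drums"])])  # compile only one group

out = io.StringIO()
export_tsv(full, out)
```

Because each group is just a text file, leaving one out of the compile step is all it takes to get a smaller database.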
Finally, integration. Fortunately for me, I mostly use a home-brew computer music system, so I can do this with minimal friction. This part is also pretty early. Currently, I have implemented the ability to load a WAV file into a buffer if provided with a partial UUID, which can then be used in various samplers and table-lookup oscillators included in my ecosystem. This streamlines sample usage quite a bit for me. Instead of cherry-picking samples from various parts of my filesystem, all I have to do is load one SQLite database and know the UUIDs of the samples I want (which I can figure out using utilities and SQL querying).
This is all being dogfooded right now. I am actually using it to build what I hope will be an end-game centralized sample library for myself. If this ends up being something useful, my hope is to publish the wiki parts of the database to my website, as well as make the metadata repo public. We’ll see.
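The partial-UUID lookup can be sketched roughly like this in Python. The `samples` table, the function names, and the example UUIDs are assumptions for illustration; my actual implementation lives in my own music system and looks different.

```python
import io
import sqlite3
import wave

def resolve(conn, partial):
    """Expand a UUID prefix to a full key; fail unless exactly one key matches."""
    rows = conn.execute(
        "SELECT DISTINCT key FROM samples WHERE key LIKE ? || '%'", (partial,)
    ).fetchall()
    if len(rows) != 1:
        raise LookupError(f"{partial!r} matches {len(rows)} keys")
    return rows[0][0]

def load_wav(conn, partial):
    """Return (sample rate, raw PCM frames) for the sample a prefix points to."""
    key = resolve(conn, partial)
    (data,) = conn.execute("SELECT data FROM samples WHERE key = ?", (key,)).fetchone()
    with wave.open(io.BytesIO(data), "rb") as w:
        return w.getframerate(), w.readframes(w.getnframes())

def make_wav(frames, rate=44100):
    """Build a tiny mono 16-bit WAV file in memory, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(frames)
    return buf.getvalue()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (key TEXT, data BLOB)")  # hypothetical table
pcm = b"\x00\x01" * 4
conn.execute("INSERT INTO samples VALUES (?, ?)",
             ("a1b2c3d4-0000-4000-8000-000000000001", make_wav(pcm)))
conn.execute("INSERT INTO samples VALUES (?, ?)",
             ("a1ff0000-0000-4000-8000-000000000002", make_wav(pcm, rate=48000)))

rate, frames = load_wav(conn, "a1b")  # "a1b" is unique; "a1" alone would be ambiguous
```

Refusing ambiguous prefixes keeps a short UUID from silently loading the wrong sample as the library grows.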
The end.