(Modern) C Programming Tips and Tricks


looking for a nice way to store and recall data from arrays of structs (that have several types in them) to a byte buffer and vice versa! hoping to avoid a manual hack, that I will regret as soon as I have to make changes in them …! thanks for any help!


I recommend something like Protocol Buffers or MsgPack


if you’d like to use JSON, I have nothing but good things to say about YAJL.


thanks for your links!


I looked at JSON briefly. what are the main pros in comparison to simply incrementing by size (in a long but straightforward manner)?! edit: hesitant about taking the plunge :ocean:


The main pro is that it’s pretty much ubiquitous, and human readable. Every language has JSON bindings, and it’s a format that can easily be used with web browsers so very well suited for applications that need to communicate with them.


We’re switching to MsgPack for our inter-service communications at Bandcamp, moving away from a Ruby-only data format so that we can begin writing services in lots of languages. MsgPack’s available for so many languages, and it’s fast and simple. I haven’t used the C lib myself tho’.


I agree with all of this.


Cool. We used msgpack for some services in a past workplace and had nothing but good experiences. Primarily Go and Python and some JS.


i’m guilty of this. what’s the preferred method?


Use the “struct” keyword in front of the struct name everywhere. At Adobe our style was to typedef every struct:

typedef struct _t_AGMCMYKColorRec {
    uint8 cyan;
    uint8 magenta;
    uint8 yellow;
    uint8 black;
} AGMCMYKColorRec;

Sometimes with a “_t_”-style struct name, and sometimes with the same name as the typedef. You have to use the struct name even when typedef’ing if you want to reference the struct within the struct itself, like when making linked lists:

typedef struct PIPropertyChain {
    struct PIPropertyChain *next;
    ...
} PIPropertyChain;

That used to confuse me (why is it there?) when I thought the typedef struct form was the only way to use structs. Without the typedef it’s just like this:

struct AGMCMYKColorRec {
    uint8 cyan;
    uint8 magenta;
    uint8 yellow;
    uint8 black;
};

void func(struct AGMCMYKColorRec *color) {
    struct AGMCMYKColorRec otherColor;
}

Don’t hate my naming convention. This was a long time ago.


(basically, don’t typedef structs).

fwiw, i’ve seen the exact opposite convention in other highly usable codebases. that is to say, always typedef structs (probably to UpperCamelClassNames), and always access by pointers to those classes. i wouldn’t assume not following the linux kernel style implies “guilt.”

the kernel guidelines themselves are not unambivalent on this:

Lots of people think that typedefs "help readability". Not so. They are
useful only for:

 (a) totally opaque objects (where the typedef is actively used to _hide_
     what the object is).

     Example: "pte_t" etc. opaque objects that you can only access using
     the proper accessor functions.

     NOTE! Opaqueness and "accessor functions" are not good in themselves.
     The reason we have them for things like pte_t etc. is that there
     really is absolutely _zero_ portably accessible information there.

it’s this last paragraph that is kind of debatable. i’m not personally interested in the debate, but i’m aware of plenty of designers who would claim that, on the contrary, opaqueness is good in itself.

i readily admit that the aleph codebase is not particularly well-considered in this regard, though it is pretty consistent. structs are typedef’d and passed by pointer. the main reason i chose this was for consistency with the ASF, which we use a lot.

consistency is king. i’m fine with either convention and it makes sense to use “struct” any time something will ever have its members directly accessible. (within a library or self-contained codebase.)

but it makes sense to favor opacity when writing for other people (library API layer.)

i think the “functions” section of the kernel guide is pretty unarguable. i was trained to avoid GOTOs but they really are very useful in the specific case of centralizing exit points in C, and i should fix my habits in that regard.
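to make the goto point concrete, here is a minimal sketch of the single-exit-point pattern. `load_file` and its parameters are made up for illustration (they are not from the kernel guide or from aleph); the only point is that every failure path jumps to one label where cleanup happens.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads up to `cap` bytes from `path` into a fresh heap
 * buffer. Returns 0 on success, -1 on failure. Every failure path jumps to a
 * single label, so each resource is released in exactly one place. */
int load_file(const char *path, size_t cap, unsigned char **out, size_t *out_len)
{
    int rc = -1;
    FILE *f = NULL;
    unsigned char *buf = NULL;

    assert(path && out && out_len);

    f = fopen(path, "rb");
    if (!f)
        goto done;

    buf = malloc(cap ? cap : 1);
    if (!buf)
        goto done;

    *out_len = fread(buf, 1, cap, f);
    if (ferror(f))
        goto done;

    *out = buf;
    buf = NULL;          /* ownership moves to the caller */
    rc = 0;

done:
    free(buf);           /* free(NULL) is a no-op */
    if (f)
        fclose(f);
    return rc;
}
```

without the goto, each early return would need its own copy of the fclose/free calls, which is where leaks tend to creep in when the function grows.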


for me, keeping the struct keyword there just means one less thing to wonder/worry about when i’m dealing with code. it’s a way i signal to myself that i have the full definition of that struct available at that place in the code and could either stack or heap allocate it. i use typedef’d structs to indicate to myself that i do not have the definition available and thus can only refer to the type via opaque pointers.
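a rough sketch of what that split can look like, squeezed into one file for brevity. the `counter` type and its functions are invented for this example; in a real project the first part would live in a header and the struct definition would stay in the .c file.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what a header would expose (hypothetical counter.h) ---- */
typedef struct counter counter;      /* incomplete type: callers cannot see its size */
counter *counter_new(int start);
void     counter_bump(counter *c);
int      counter_value(const counter *c);
void     counter_free(counter *c);

/* ---- what stays private (hypothetical counter.c) ---- */
struct counter {
    int value;
};

counter *counter_new(int start)
{
    counter *c = malloc(sizeof *c);
    if (c)
        c->value = start;
    return c;                        /* NULL if allocation failed */
}

void counter_bump(counter *c)
{
    assert(c);
    c->value++;
}

int counter_value(const counter *c)
{
    assert(c);
    return c->value;
}

void counter_free(counter *c)
{
    free(c);
}
```

code that only sees the header part can hold a `counter *` but cannot stack-allocate a `counter` or touch `value`, which is exactly the signal described above.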

when i started out as a C programmer, i typedef’d everything and would enforce the opacity of structs all over my codebase, but i found over time that it just served to complicate things. bare structs are simpler imo.

also, wrt serialising structs or other data: if you’re going to write anything to disk, you need to figure out a format for doing so. the easiest method is just to write structs themselves as binary data out to a file.

struct thing {
    int x, y, z;
};

/* some_file is an already-open FILE * */
struct thing t = {.x = 42, .y = 24, .z = 0};
fwrite(&t, sizeof(t), 1, some_file);

the problem is that now you not only have a dependence on your data types and arrangement, but you also have a dependence on your CPU bits (32 vs 64, can change the size of int), CPU endianness (generally little, but this will bite you when you least expect it), and even your compiler (padding/alignment). and, what if you want to add more variables at a later date? you’ll need some metadata for versioning, at the very least. and what about interop with other applications or languages?

using a serialisation format which is independent of your in-memory representation is, in my opinion, always a good idea. convert to/from said format at the file loading/saving boundary of your application, and deal with versioning there.
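for comparison with the fwrite version, here is a minimal hand-rolled sketch of writing the same three fields to a byte buffer one at a time. the 12-byte little-endian layout and the names `thing_encode` / `thing_decode` are assumptions made up for this example, not any standard format; a library like msgpack or CBOR would do this work for you.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct thing {
    int32_t x, y, z;             /* fixed-width instead of plain int */
};

enum { THING_WIRE_SIZE = 12 };   /* 3 fields * 4 bytes, no padding on the wire */

/* Store a 32-bit value as 4 little-endian bytes, whatever the host CPU is. */
static void put_u32le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v);
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

static uint32_t get_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the number of bytes written, or 0 if the buffer is too small. */
size_t thing_encode(const struct thing *t, uint8_t *buf, size_t cap)
{
    assert(t && buf);
    if (cap < THING_WIRE_SIZE)
        return 0;
    put_u32le(buf + 0, (uint32_t)t->x);
    put_u32le(buf + 4, (uint32_t)t->y);
    put_u32le(buf + 8, (uint32_t)t->z);
    return THING_WIRE_SIZE;
}

/* Returns the number of bytes consumed, or 0 if the input is truncated. */
size_t thing_decode(struct thing *t, const uint8_t *buf, size_t len)
{
    assert(t && buf);
    if (len < THING_WIRE_SIZE)
        return 0;
    /* unsigned -> signed conversion of large values is implementation-defined
     * before C23; mainstream compilers wrap as two's complement */
    t->x = (int32_t)get_u32le(buf + 0);
    t->y = (int32_t)get_u32le(buf + 4);
    t->z = (int32_t)get_u32le(buf + 8);
    return THING_WIRE_SIZE;
}
```

for an array of structs you would call `thing_encode` in a loop and advance the buffer pointer by the returned size each time. it is more typing than one fwrite, but the bytes no longer depend on int size, endianness, or compiler padding.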

as an example, i have a VST synth I’m developing for which I use JSON as the serialisation format (patches look like this if you’re curious). i have a version property that lets me transparently convert old presets to new ones on load, and it’s been a huge win for me because i can compartmentalise all of that logic in the loader. the rest of the synth logic doesn’t have to know anything about it.

and, on the subject of serialisation formats: msgpack looks cool. at work we use CBOR, which is very similar to msgpack except it’s an IETF RFC.


Of course, you would basically never see something like the fwrite example in embedded code. You write custom serialization for your classes/datatypes and enforce endianness/alignment there. We do this a lot on the aleph going between avr32 and blackfin, or avr32 and filesystem.

And as you say, custom serializers let you deal with versioning. (We also have a json converter for aleph scenes.)
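As a rough illustration of versioning inside a custom serializer, here is a small sketch of a loader that accepts two versions of a record. The layout, the `record` struct, and `record_load` are all invented for this example and are not the aleph scene format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record layout, invented for this sketch:
 *   byte 0      format version (1 or 2)
 *   bytes 1-2   x, unsigned 16-bit, little-endian
 *   bytes 3-4   y, unsigned 16-bit, little-endian
 *   bytes 5-6   z, unsigned 16-bit, little-endian (version 2 only)
 */
struct record {
    uint16_t x, y, z;            /* in-memory form always matches the newest version */
};

enum { RECORD_V1_SIZE = 5, RECORD_V2_SIZE = 7 };

static uint16_t get_u16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 on truncated input or an unknown version. */
int record_load(struct record *out, const uint8_t *buf, size_t len)
{
    assert(out && buf);
    if (len < 1)
        return -1;

    switch (buf[0]) {
    case 1:
        if (len < RECORD_V1_SIZE)
            return -1;
        out->x = get_u16le(buf + 1);
        out->y = get_u16le(buf + 3);
        out->z = 0;              /* field added in v2: old records get a default */
        return 0;
    case 2:
        if (len < RECORD_V2_SIZE)
            return -1;
        out->x = get_u16le(buf + 1);
        out->y = get_u16le(buf + 3);
        out->z = get_u16le(buf + 5);
        return 0;
    default:
        return -1;               /* written by a newer program, or corrupt */
    }
}
```

The rest of the program only ever sees the newest `struct record`; all knowledge of older layouts stays inside the loader.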

On the flip side, it is rarely (not never) a good idea to enforce endianness in a non-embedded environment if you can help it.

Some of these questions quickly lead to religious-war territory, I’m wary of categorical imperatives :slight_smile:


totally missed that! thanks.


i have heard many a tale about old programs using the technique as their primary save format. just using it to show one end of the playing field, really. :wink:


so tempting though to use one line instead of having to write 500 for your own serialization! especially when that’s 15% of your overall code. but yeah, serialization is definitely something that should be explicit rather than implicit, and worth the time making it future proof…


i guess you’re right, and i kinda take it back; certainly seen some raw memory dumps used as save files. it can even make sense if you’re writing to something like an internal eeprom and the storage is very temporary (saving state between powercycles.)

what i mean is, i would be alarmed to see that in contemporary code that is actually interacting with a filesystem. nowadays most programmers have had to do some web stuff at some point, where the need for robust serialization is obvious.

to that point, it’s cool for me to learn about msgpack and CBOR. for web stuff i’ve mostly used json or protobufs when performance is important and the environment supports it. (so, not really relevant for pure C.)


serialisation is also one of the things that rust does really, really well. with serde you can just annotate structs and have serialisation/deserialisation routines generated automatically.

// assumes serde (with the "derive" feature) and serde_json in Cargo.toml;
// on current stable Rust the old nightly feature attribute is not needed
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let point = Point { x: 1, y: 2 };
    let serialized = serde_json::to_string(&point).unwrap();

    println!("{}", serialized);

    let deserialized: Point = serde_json::from_str(&serialized).unwrap();

    println!("{:?}", deserialized);
}

i really, really like rust so far. definitely worth a check if you’re unfamiliar.



going to look at protobufs, are there any serialization C libraries that would be very lightweight?

could be just the case that something home brewed would be most efficient but would be nice to just use a library.