memf—Portable scanf/printf-like functions to marshal binary data

Excuse me if I seem to be a bit harsh, but I do not find this code useful. Correct me if I'm wrong, but from a quick look at the code and examples, what the code does is to take a binary structure (with certain assumptions about the alignment) and convert that into a binary stream with exactly the same ordering, but sans the alignment hassle.

The problem I see is that the binary stream is an exact representation of the source structure, and unpacking the stream requires a matching (binary-wise) definition of the structure on the receiving end. You lose any means of providing backwards compatibility (i.e. the structure must remain the same), as it's not possible to skip or add fields, and you cannot isolate the wire format from your in-program representation. In fact, I'd say it's equivalent to sending the structure down the wire; if one is bothered by alignment gaps, just add a proper __attribute__((packed)) or #pragma pack to the structure definition. The MBR example is a miss, as the usual way to do it is to define a structure in the first place and just read the data into it. Take a look at how the GPT header and MBR are defined in the Linux kernel.
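To make the packed-structure alternative concrete, here is a minimal sketch, assuming GCC or Clang (for `__attribute__((packed))`) and C11 (for `_Static_assert`). The names `wire_foo` and `wire_foo_read` are made up for illustration, and the multi-byte field is taken in host byte order, so this only works between machines that agree on endianness.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire record; the in-memory layout IS the wire layout. */
struct wire_foo {
    uint8_t  bar;
    uint8_t  zed;
    uint32_t blah;      /* host byte order, no conversion is done */
    char     foo[10];
} __attribute__((packed));  /* GCC/Clang extension: no padding inserted */

/* Catch layout surprises at compile time instead of on the wire. */
_Static_assert(sizeof(struct wire_foo) == 16, "unexpected padding");
_Static_assert(offsetof(struct wire_foo, blah) == 2, "blah must follow zed");

/* Copy raw bytes straight into the structure.
 * Returns 0 on success, -1 if the buffer is too short. */
static int wire_foo_read(struct wire_foo *out, const void *buf, size_t len)
{
    if (len < sizeof *out)
        return -1;
    memcpy(out, buf, sizeof *out);
    return 0;
}
```

Note that any change to the structure changes the wire format too, which is exactly the coupling described above.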

Now if you changed the API to be more like the sample below it would definitely make things more interesting.

struct foo {
    uint8_t bar;
    uint8_t zed;
    uint32_t blah;
    char foo[10];
};
struct foo f;
mreadf(mbr, "iccd10c", &f.bar, &f.zed, &f.blah, f.foo);
/* say I want to skip blah */
mreadf(mbr, "iccd10c", &f.bar, &f.zed, NULL, f.foo);

/* now say, the code evolves and struct foo has
 * changed in an incompatible way */
struct new_foo {
    uint8_t bar;
    uint32_t blah;
    uint32_t something;
    char foo[10];
    uint32_t otherthing;
};
struct new_foo nf;
/* assuming foo.zed is no longer relevant
 * for my purpose, but I do care about blah,
 * I can still read the same binary data like this */
mreadf(mbr, "iccd10c", &nf.bar, NULL, &nf.blah, NULL);
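
For what it's worth, a reader with this "NULL means skip" behaviour is not much code. The following is only a rough sketch under my own assumptions: `sketch_readf` is a made-up name, and the format codes (`c` for one byte, `d` for a 32-bit little-endian value, an optional decimal count in front) are invented for the example. None of this is memf's actual API or format syntax.

```c
#include <ctype.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reader, NOT memf's real API or format syntax.
 * Codes: 'c' = 1 byte, 'd' = 32-bit little-endian unsigned.
 * A decimal count before a code ("10c") fills one array argument.
 * A NULL argument consumes the field without storing it.
 * Assumes all object pointers share void *'s representation.
 * Returns the number of bytes consumed, or -1 on error. */
static long sketch_readf(const uint8_t *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    size_t pos = 0;
    int failed = 0;

    va_start(ap, fmt);
    while (*fmt != '\0' && !failed) {
        size_t count = 0, width;
        int have_count = 0;

        while (isdigit((unsigned char)*fmt)) {
            count = count * 10 + (size_t)(*fmt++ - '0');
            have_count = 1;
        }
        if (!have_count)
            count = 1;

        if (*fmt == 'c')
            width = 1;
        else if (*fmt == 'd')
            width = 4;
        else {
            failed = 1;             /* unknown or missing code */
            break;
        }
        fmt++;

        uint8_t *dst = va_arg(ap, void *);   /* one argument per field */
        if (count > (len - pos) / width) {
            failed = 1;             /* field runs past the buffer */
            break;
        }
        for (size_t i = 0; i < count; i++, pos += width) {
            if (dst == NULL)
                continue;           /* skipped field: just advance pos */
            if (width == 1) {
                dst[i] = buf[pos];
            } else {
                uint32_t v = (uint32_t)buf[pos]
                           | (uint32_t)buf[pos + 1] << 8
                           | (uint32_t)buf[pos + 2] << 16
                           | (uint32_t)buf[pos + 3] << 24;
                memcpy(dst + i * 4, &v, sizeof v);
            }
        }
    }
    va_end(ap);
    return failed ? -1 : (long)pos;
}
```

With this sketch, reading the old layout into `struct new_foo` would look like `sketch_readf(buf, len, "ccd10c", &nf.bar, NULL, &nf.blah, NULL)`: the wire format stays fixed while the in-program structure is free to change.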

Binary serialization driven by a textual format description, similar to what you propose, makes sense in languages that do not have direct access to binary data. I'm thinking along the lines of Python, Perl, Java, Lua. But C/C++/D can do this without using an intermediate representation. Another common use case is IPC between a number of processes or agents where not all agents are updated at the same pace; there you need some sort of backward compatibility.
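
To illustrate the backward compatibility point, here is a minimal sketch of a tagged record reader, where an old receiver skips fields it does not know about. The layout (`[tag:1][len:1][value]`), the tag numbers and the names `reading` and `reading_decode` are all invented for this example; it is a generic illustration, not any particular library's wire format.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: repeated [tag:1][len:1][value:len] fields. */
enum { TAG_BAR = 1, TAG_BLAH = 2 };     /* tags made up for this sketch */

struct reading {
    uint8_t  bar;
    uint32_t blah;
};

/* Decode the fields this receiver knows; skip everything else by length.
 * Returns 0 on success, -1 on a truncated or wrongly sized field. */
static int reading_decode(struct reading *out, const uint8_t *buf, size_t len)
{
    size_t pos = 0;

    while (pos < len) {
        if (len - pos < 2)
            return -1;                  /* no room for tag + length */
        uint8_t tag  = buf[pos];
        uint8_t flen = buf[pos + 1];
        pos += 2;
        if (flen > len - pos)
            return -1;                  /* value runs past the buffer */

        const uint8_t *v = buf + pos;
        switch (tag) {
        case TAG_BAR:
            if (flen != 1)
                return -1;
            out->bar = v[0];
            break;
        case TAG_BLAH:
            if (flen != 4)
                return -1;
            out->blah = (uint32_t)v[0]
                      | (uint32_t)v[1] << 8
                      | (uint32_t)v[2] << 16
                      | (uint32_t)v[3] << 24;   /* little-endian on the wire */
            break;
        default:
            break;                      /* unknown tag from a newer sender */
        }
        pos += flen;
    }
    return 0;
}
```

A newer sender can add fields with new tags, and receivers built from this older code keep working because they step over anything they do not recognize.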

I'd say that you need to provide added value to justify using memf in C. Obviously, one may argue that not caring about alignment is added value, and why not. However, I like to be explicit about things as low level as the ABI. Take Google Protocol Buffers, for example: perfectly usable in C. In fact, I'm using them on a Cortex-M3 target to send real-time data via an MQTT broker to a Java client. Another example is an ARM host sending data over AMQP, where the receiving end is an Erlang app. In both cases there are additional Python clients that only do a graphical presentation of the data. Why use PB in C? What's the added value, you may ask? Well, for one, PBs offer an efficient packing mechanism that I use. Another thing is bindings to multiple languages (try explaining binary representation to a Java programmer and you'll know the pain).

Finishing up this rather lengthy comment: take a look at GVariant and the D-Bus type system and marshalling.

/r/Cprog Thread Link - github.com