Serializing data in C

I've started work on a new project! The project in question is one that I have wanted to tackle for a long time but did not have the courage to. However, at long last it is time...

time to get funky...

I'm writing a music player, and not one that simply plays music like Spotify. This music player falls more in line with MPD, in that it's a daemon running in the background playing the music, and clients may communicate with it to tell it what to play, when, and the like.

Currently I'm at that very important part in which the daemon needs to communicate with clients via a protocol *fancy*, and to do so I have decided to go with a socket; that way, in theory, I can control my daemon from other devices on the network. With the communication method decided, I now have to define what data goes between devices. At the moment I've settled on an enum for the different signals.

As seen here:

enum libmoo_signal {
  /* core signals are in the < 100 range */
  LIBMOO_SIGNAL_ESTCON = 1,
  LIBMOO_SIGNAL_ESTDCON = 2,
  LIBMOO_SIGNAL_INCOMPATABLE_VERSION = 3,

  LIBMOO_SIGNAL_OK = 50, /* previous message received was ok */
  LIBMOO_SIGNAL_NOK = 51, /* previous message received was not ok */

  /* setting/adding data happens in the 100 range */
  LIBMOO_SIGNAL_ADD_SONG = 101,
  LIBMOO_SIGNAL_SET_PLAY = 103,
  LIBMOO_SIGNAL_SET_PAUSE = 104,
  LIBMOO_SIGNAL_SET_TOGGLE_PAUSE = 105,
  LIBMOO_SIGNAL_SET_SKIP_NEXT = 110,
  LIBMOO_SIGNAL_SET_SKIP_PREV = 111,

  /* removing data happens in the 200 range */
  LIBMOO_SIGNAL_REM_SONG = 201,

  /* getting data happens in the 300 range */
  LIBMOO_SIGNAL_GET_CUR_SONG = 301,
  LIBMOO_SIGNAL_GET_PAUSE = 303,
  LIBMOO_SIGNAL_GET_PROGRESS = 310,
};
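
The numeric ranges in the comments can also be checked in code, which is handy when a raw value arrives off the wire. A minimal sketch, assuming a hypothetical helper `moo_signal_category` and a hypothetical `moo_category` enum (neither is part of libmoo as shown here), and assuming each range ends where the next hundred begins. The enum is abbreviated so the block compiles on its own:

```c
#include <assert.h>

/* abbreviated copy of the enum above so this sketch compiles on its own */
enum libmoo_signal {
  LIBMOO_SIGNAL_ESTCON = 1,
  LIBMOO_SIGNAL_OK = 50,
  LIBMOO_SIGNAL_ADD_SONG = 101,
  LIBMOO_SIGNAL_REM_SONG = 201,
  LIBMOO_SIGNAL_GET_PROGRESS = 310
};

/* hypothetical categories, one per numeric range */
enum moo_category {
  MOO_CAT_UNKNOWN,
  MOO_CAT_CORE,   /* 1-99    */
  MOO_CAT_SET,    /* 100-199 */
  MOO_CAT_REMOVE, /* 200-299 */
  MOO_CAT_GET     /* 300-399 (upper bound is an assumption) */
};

/* hypothetical helper: classify a raw signal value by the range it falls in.
 * Takes an int because a value read off the wire may not be a valid enum. */
enum moo_category
moo_signal_category(int signal)
{
  if (signal < 1)
    return MOO_CAT_UNKNOWN;
  if (signal < 100)
    return MOO_CAT_CORE;

  switch (signal / 100) {
  case 1:  return MOO_CAT_SET;
  case 2:  return MOO_CAT_REMOVE;
  case 3:  return MOO_CAT_GET;
  default: return MOO_CAT_UNKNOWN;
  }
}

/* usage example */
void
moo_signal_category_example(void)
{
  assert(moo_signal_category(LIBMOO_SIGNAL_OK) == MOO_CAT_CORE);
  assert(moo_signal_category(LIBMOO_SIGNAL_ADD_SONG) == MOO_CAT_SET);
  assert(moo_signal_category(999) == MOO_CAT_UNKNOWN);
}
```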
    

As for how I actually transmit the signals, I've settled on a format which defaults to 42 bytes of data being sent, with the option to send more by setting the 41st and 42nd bytes to a uint16 containing the number of additional bytes being sent.

/* byte - data (offsets are zero-based; the fixed part is 42 bytes)
 *
 * 0-7  - libmoo protocol version
 *
 * 8-31  - reserved for the future
 *
 * 32-39 - libmoo_signal
 *
 * 40-41 - uint16 more_data
 *
 * 42-8191 - char *data
 */
typedef char * libmoo_payload;
    

I'm not sure how this compares to other protocols used to transmit data, but I'm quite proud of how I've laid this out.
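
The byte ranges can be written down as named offsets so that the arithmetic lives in one place. This is only a sketch: the `MOO_` names are hypothetical, and the widths assume the 42-byte default described above, reading "the 41st and 42nd bytes" as zero-based offsets 40 and 41, which leaves offsets 32-39 for the signal.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical names; widths are assumptions derived from the 42-byte default */
enum {
  MOO_OFF_VERSION   = 0,    /* 8 bytes  */
  MOO_OFF_RESERVED  = 8,    /* 24 bytes */
  MOO_OFF_SIGNAL    = 32,   /* 8 bytes  */
  MOO_OFF_MORE_DATA = 40,   /* 2 bytes  */
  MOO_OFF_DATA      = 42,   /* variable */
  MOO_MAX_PAYLOAD   = 8192  /* data may run up to byte 8191 */
};

/* total bytes on the wire for a payload carrying more_data extra bytes */
size_t
moo_payload_size(size_t more_data)
{
  return (size_t)MOO_OFF_DATA + more_data;
}

/* largest more_data value that still fits in MOO_MAX_PAYLOAD bytes */
size_t
moo_max_more_data(void)
{
  return (size_t)MOO_MAX_PAYLOAD - (size_t)MOO_OFF_DATA;
}

/* usage example */
void
moo_layout_example(void)
{
  assert(moo_payload_size(0) == 42);   /* the default, header-only payload */
  assert(moo_payload_size(10) == 52);
  assert(moo_max_more_data() == 8150);
}
```

With offsets like these, a writer can jump straight to each field instead of relying on how far the previous copy advanced.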

Moving on to the actual point of this article: serialization. For the sake of usability I've defined a data type which models our payload, which you can see here:

typedef struct {
  enum libmoo_signal signal; /* the signal */
  uint16_t           more_data; /* the number of bytes of additional data */
  char              *data; /* additional data */
} libmoo_data;
    

And a function which allows you to easily create libmoo_data:

/**
 * @brief create a new libmoo_data
 *
 * @param signal the signal to set
 * @param data the data to set
 * @return the data object
 */
libmoo_data *libmoo_create_data(enum libmoo_signal signal, char *data);
    

I've provided this so that you don't have to deal with manually setting the size of the data that you pass into the object, although (if you so choose) you may easily override it.
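
Only the declaration is shown above, so here is a guess at what such a constructor could look like. It is a sketch under assumptions, not the library's code: it is named `moo_create_data_sketch` to keep it apart from the real function, it assumes `data` is a NUL-terminated string (or NULL), that the string is copied, and that NULL is returned on failure. `moo_free_data_sketch` is likewise hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* abbreviated copies of the types above so this sketch compiles on its own */
enum libmoo_signal { LIBMOO_SIGNAL_OK = 50, LIBMOO_SIGNAL_ADD_SONG = 101 };

typedef struct {
  enum libmoo_signal signal;
  uint16_t           more_data;
  char              *data;
} libmoo_data;

/* hypothetical constructor: derives more_data from the string length */
libmoo_data *
moo_create_data_sketch(enum libmoo_signal signal, const char *data)
{
  libmoo_data *d;
  size_t len = data ? strlen(data) : 0;

  if (len > UINT16_MAX)
    return NULL; /* would not fit in the uint16 length field */

  d = calloc(1, sizeof(*d));
  if (d == NULL)
    return NULL;

  d->signal = signal;
  d->more_data = (uint16_t)len;

  if (len > 0) {
    d->data = malloc(len + 1);
    if (d->data == NULL) {
      free(d);
      return NULL;
    }
    memcpy(d->data, data, len + 1); /* copy including the terminator */
  }

  return d;
}

/* hypothetical destructor matching the constructor above */
void
moo_free_data_sketch(libmoo_data *d)
{
  if (d != NULL) {
    free(d->data);
    free(d);
  }
}

/* usage example */
void
moo_create_data_example(void)
{
  libmoo_data *d = moo_create_data_sketch(LIBMOO_SIGNAL_ADD_SONG, "song.flac");

  assert(d != NULL);
  assert(d->more_data == 9); /* strlen("song.flac") */
  moo_free_data_sketch(d);
}
```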

Serializing

Now that we've gone over the interface for interacting with the payload, let's get to the interesting stage in which we get the data ready for launch.

/**
 * @brief convert data into a payload.
 * The payload must be allocated prior to calling this function; it should be
 * LIBMOO_PAYLOAD_SIZE chars long unless you think you know better.
 *
 * @param payload pointer to the payload
 * @param data data which will be serialized
 * @return the size of the resulting payload
 */
size_t
libmoo_serialize_data(libmoo_payload payload, libmoo_data *data)
{
  char *ptr; /* char * rather than void * so that ptr += n is standard C */

  ptr = payload;

  /* add LIBMOO_VERSION */
  mempcpy(ptr, LIBMOO_VERSION, strlen(LIBMOO_VERSION));

  /* skip the reserved space */
  ptr += 24;

  /* add the signal type */
  mempcpy(ptr, &data->signal, sizeof(data->signal));

  /* add the size of the more data */
  mempcpy(ptr, &data->more_data, sizeof(data->more_data));

  /* the more data */
  if (data->more_data) {
    /* copy exactly the number of bytes announced in more_data */
    mempcpy(ptr, data->data, data->more_data);
  }

  return LIBMOO_HEADER_SIZE + sizeof(data->more_data) + data->more_data;
}
    

As you may have noticed in the code snippet above, libmoo_serialize_data is a function that takes in a pre-allocated payload and the data to serialize. I've opted to let the user allocate space for the payload, in case they know what they're doing and my dumb function is too generous. Moving on to the body of the function, you'll see an unfamiliar function called mempcpy, which is a macro for copying data into the payload. This macro takes the same arguments as memcpy (dest, src, size), but in addition it increments dest by size, so I can append data to the payload without making this code horrible to read.
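
The macro itself isn't shown, so here is one way a copy-and-advance macro could be written. This is a sketch, not the article's definition: it is named `MOO_MEMPCPY` (hypothetical) to avoid clashing with glibc's `mempcpy`, a GNU extension that returns a pointer just past the copied bytes instead of modifying `dest`. It assumes `dest` is a modifiable `char *` variable.

```c
#include <assert.h>
#include <string.h>

/* Copy size bytes from src to dest, then advance dest past them.
 * dest must be a modifiable char * lvalue; it is evaluated twice, so don't
 * pass an expression with side effects. size is evaluated once. */
#define MOO_MEMPCPY(dest, src, size) \
  do {                               \
    size_t moo_n_ = (size);          \
    memcpy((dest), (src), moo_n_);   \
    (dest) += moo_n_;                \
  } while (0)

/* usage example: append two fields back to back, return bytes written */
size_t
moo_append_example(char *buf)
{
  char *ptr = buf;

  MOO_MEMPCPY(ptr, "moo", 3);
  MOO_MEMPCPY(ptr, "!!", 2);

  assert(ptr - buf == 5);
  return (size_t)(ptr - buf);
}
```

The `do { ... } while (0)` wrapper keeps the macro safe to use as a single statement, for example inside an `if` without braces.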

In earlier stages of development I didn't actually return the size of the resulting payload, thinking that I could just strlen it. This was a rookie mistake. strlen stops at the first zero byte, and because I've included extra space (bytes 8-31), running it on any payload results in 5 (the length of the version string). After finding this out I decided that I do actually want to send more than the version information, and so now libmoo_serialize_data returns a size_t containing the number of bytes that have been serialized.
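
The pitfall is easy to reproduce on its own. A small sketch, assuming a hypothetical 5-character version string `"0.0.1"` and a zero-filled 42-byte buffer (the real LIBMOO_VERSION value isn't shown in this article):

```c
#include <assert.h>
#include <string.h>

/* hypothetical 5-character version string, matching the length quoted above */
#define MOO_VERSION_SKETCH "0.0.1"

/* build a tiny payload and return what strlen reports for it */
size_t
moo_strlen_pitfall(void)
{
  char payload[42] = {0}; /* zero-filled, like freshly calloc'd memory */

  memcpy(payload, MOO_VERSION_SKETCH, strlen(MOO_VERSION_SKETCH));
  payload[32] = 101; /* a non-zero byte well past the version field */

  /* strlen never sees byte 32: it stops at the zero right after the version */
  return strlen(payload);
}

/* usage example */
void
moo_strlen_pitfall_example(void)
{
  assert(moo_strlen_pitfall() == 5);
}
```

Binary payloads need their length tracked separately, which is what returning a size_t from the serializer does.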

Deserializing

Now that we've serialized and presumably sent the data to a client we need to do the magic part which completes this transaction: deserialization.

Below is my method of deserializing the data. It's pretty much the opposite of the serialization function, with the key exception that we need to do some validation, which I haven't implemented yet.

libmoo_data
*libmoo_deserialize_data(libmoo_payload payload, unsigned int payload_size)
{
  libmoo_data *data;
  char libmoo_version[LIBMOO_VERSION_LEN + 1];

  /* ensure that there's enough data to comply to spec */
  if (payload_size < LIBMOO_HEADER_SIZE) {
    errno = EBADE;
    return NULL;
  }

  /* attempt to allocate space; if we fail, return NULL and errno contains the
   * error
   */
  data = calloc(1, sizeof(libmoo_data));
  if (data == NULL) {
    errno = ENOMEM;
    return NULL;
  }

  /* get the libmoo_version */
  memscpy(libmoo_version, payload, LIBMOO_VERSION_LEN);
  libmoo_version[LIBMOO_VERSION_LEN] = '\0';

  /* TODO: check that the version of the library we're talking to and our
   * version are compatible
   */

  /* gotta tell the client that we're incompatible */
  if (strcmp(libmoo_version, LIBMOO_VERSION) != 0) {
    errno = EINVAL;
    free(data);
    return NULL;
  }

  /* skip the reserved space */
  payload += 24;

  /* get the signal */
  memscpy(&data->signal, payload, sizeof(data->signal));

  /* get the more data marker */
  memscpy(&data->more_data, payload, sizeof(data->more_data));

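  /* get the data */
  if (data->more_data) {
    data->data = malloc((data->more_data + 1) * sizeof(char));
    if (data->data == NULL) {
      /* don't leak the half-built object if the second allocation fails */
      free(data);
      errno = ENOMEM;
      return NULL;
    }
    memscpy(data->data, payload, data->more_data * sizeof(char));
    data->data[data->more_data] = '\0'; /* null terminate this string, yo */
  }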

  return data;
}
    

And that's kinda it. Just like with the mempcpy macro, I've made a memscpy macro which increments src by the size of the data, to make reading the data easy.
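
As with the writer, the reading macro isn't shown, so this is a guess at its shape rather than the article's definition. `MOO_MEMSCPY` and `moo_read_example` are hypothetical names. The example also checks the announced length against the bytes actually received, which is one form the missing validation could take. The length is read in host byte order, matching how the serializer above writes it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy size bytes from src to dest, then advance src past them.
 * src must be a modifiable pointer-to-char lvalue; it is evaluated twice. */
#define MOO_MEMSCPY(dest, src, size) \
  do {                               \
    size_t moo_n_ = (size);          \
    memcpy((dest), (src), moo_n_);   \
    (src) += moo_n_;                 \
  } while (0)

/* Read a uint16 length followed by that many bytes into out, NUL-terminated.
 * Returns the number of bytes consumed from buf, or 0 if the announced length
 * does not fit in what was received or in the output buffer. */
size_t
moo_read_example(const char *buf, size_t buf_size, char *out, size_t out_size)
{
  const char *ptr = buf;
  uint16_t len;

  if (buf_size < sizeof(len))
    return 0;
  MOO_MEMSCPY(&len, ptr, sizeof(len));

  if (buf_size - sizeof(len) < len || (size_t)len + 1 > out_size)
    return 0;
  MOO_MEMSCPY(out, ptr, len);
  out[len] = '\0';

  return (size_t)(ptr - buf);
}

/* usage example */
void
moo_read_usage_example(void)
{
  char buf[8];
  char out[8];
  uint16_t n = 3;

  memcpy(buf, &n, sizeof(n));
  memcpy(buf + sizeof(n), "moo", 3);

  assert(moo_read_example(buf, 5, out, sizeof(out)) == 5);
  assert(strcmp(out, "moo") == 0);
}
```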

See you next year o/ I might have my project done by then, who knows