Gossip Glomers: Unique ID Generation

Introduction

The company fly.io has a set of challenges, the Gossip Glomers which is meant to to provide a means to test one's ability to work on distributed systems. I've done some work (broadly speaking) on that front but I've felt that I have plenty to learn; hence, why I'm looking at learning a bit more. My hope is that going through these challenges while I'm on sabbatical will be a good way for me to figure out what my knowledge gaps are and what I have to learn.

The first challenge is very simple and is meant to be a means of introducing the challenges and the general development environment they expect (specifically using Maelstrom and Golang). The next bit of work here is to generate unique IDs across a distributed system. This specifically entails generating unique IDs across a system that can withstand random network partitions. There is no expectation that there is some notion of causality (no need for vector clocks or other means of ordering IDs).

UUIDs

A UUID (Universally Unique Identifiers) is a 128-bit number, typically represented in hexadecimal digits and divided by hyphens into 5 groups. The sheer size of this number space is useful for ensuring uniqueness. In fact, the probability of generating two identical UUIDs is astronomically low; it is akin to winning the cosmic lottery. This makes them very useful for ensuring unique IDs in our particular scenario.

Now, I emphasized partition tolerance earlier. I did that because the ability to provide unique IDs in the context of partition tolerance is where UUIDs shine. Specifically, any node can generate a UUID without having to coordinate with another node; thereby, ensuring resilience in the face of partitions.

That being said, there are disadvantages to UUIDs. For example, UUIDs are quite large and that may be too much for some applications. On top of that, they carry no inherent order so if the ability to order IDs is relevant, then UUIDs are a poor fit on their own.

package main

import (
    "encoding/json"
    "log"

    maelstrom "github.com/jepsen-io/maelstrom/demo/go"
    "github.com/google/uuid"
)

func main() {
    n := maelstrom.NewNode()

    // Some boilerplate to fit maelstrom's conventions
    n.Handle("generate", func(msg maelstrom.Message) error {
        // Unmarshal the message body as a loosely-typed map
        var body map[string]any

        if err := json.Unmarshal(msg.Body, &body); err != nil {
            return err
        }

        // The relevant code for generating UUIDs in response to generate requests
        body["type"] = "generate_ok"
        body["id"] = uuid.New().String()

        return n.Reply(msg, body)
    })

    if err := n.Run(); err != nil {
        log.Fatal(err)
    }
}

Conclusion

Overall, this was a fun challenge for me. Admittedly, it was pretty obvious that UUIDs were a good fit here but I think it was a good exercise to think through where UUIDs are useful and where they're not.

If you have any questions about this post, or if you happen to have a posting for someone with my experience, please send an email to contact@javednissar.ca