Avoiding Bugs with Strong Static Typing Using Rust


Svix is the enterprise-ready webhooks sending service. With Svix, you can build a secure, reliable, and scalable webhook platform in minutes. Looking to send webhooks? Give it a try!

A few weeks ago I wrote a blog post about strong static typing that generated quite a discussion on Hacker News, Reddit, and other places. I read the thousands of comments, and one common theme I noticed is that people who didn't like strong static typing often said it was because types don't catch many classes of errors. As for the people liking types, many liked the examples I shared in the previous post and wanted to see more. This post is for both.

In this post I'll show some of the ways we use the Rust type system at Svix to catch most errors at compile time instead of runtime. For context, our philosophy around typing is that we should enforce as many constraints as possible at compile time, while still being pragmatic and not trying to encode constraints that add little value and are a pain to represent.

Another philosophy we follow is "parse, don't validate". We do the validation at the point of deserialization and encode these constraints in the type system. This means that if we have a value of type Email, we know it's a valid email address; otherwise it wouldn't have been possible to construct the object.

As one person who emailed me after the post said so succinctly: the code is the contract, and the contract is binding. Like it or not, all of your APIs, both internal and external, are contracts. Typing is just a way of defining them clearly and explicitly.

About types not catching many classes of errors

As mentioned above, a few people in the comments stated that types are not worth it because they don't catch many classes of errors. To that I say: indeed they don't, but they catch many more than people often realize.

The people making these comments made a few arguments. The first one was that types don't catch all the errors so you need unit tests anyway, so why bother? I think tests are great, and people should definitely be writing tests. Types are not meant to replace automated testing, but they can drastically reduce the number of tests and their complexity because, when done correctly, they cover a lot of failure cases.

The second argument is that type-checking issues are often easy to catch, so types are not worth it. I think this is only true on a very basic level (more on that in the next point), but even if it were true, eyelidlessness said it best on HN: this is an argument for typing, not against.

The third argument is a subset of the previous one: it's essentially the belief that type checking only means catching mistakes like passing a str instead of an int. That kind of basic type checking is indeed very limited (though still powerful), but the type system lets you do much more.

Wrapper types

We use wrapper types to create new types for commonly used data types, especially when these types are expected to be validated differently or can be easily confused.

For example, you can create an Email type for emails, and make sure that when a variable of this type is created it has to be a valid email (no other way to create it). The same can be extended to model IDs, name, age, secrets, or really anything else that comes to mind. Those are not normal strings or numbers, and they should not be treated as such.

// Returns an error if the email is invalid, so `email` is always a valid `Email`.
let email = Email::from_str("john@example.com")?;
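
A minimal sketch of what such a wrapper could look like. The validation rule (exactly one `@` with text on both sides) and the `InvalidEmail` error type are assumptions made for illustration; a real implementation would want a stricter check.

```rust
use std::str::FromStr;

/// A string that has been checked to look like an email address.
/// The inner field is private, so code outside this module can only
/// obtain an `Email` by parsing one.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Email(String);

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidEmail;

impl FromStr for Email {
    type Err = InvalidEmail;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Deliberately simplistic rule (an assumption for this sketch):
        // exactly one '@' with non-empty text on both sides.
        match s.split_once('@') {
            Some((local, domain))
                if !local.is_empty() && !domain.is_empty() && !domain.contains('@') =>
            {
                Ok(Email(s.to_owned()))
            }
            _ => Err(InvalidEmail),
        }
    }
}

impl Email {
    /// Read-only access to the validated address.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

Any function that accepts an `Email` can then skip re-validating it, because an invalid one cannot be constructed in the first place.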

Wrapper types for identifiers

Another area where wrapper types really shine is model IDs. In fact, they work so well for this use case that I feel they deserve their own section.

At Svix we follow a different format for identifiers depending on the model. So for example, an application id will look like app_2GjfkrLnhpMVbNybQgoJsasPWIR while an endpoint id will look like endp_25SVqQSCVpGZh5SmuV0A7X0E3rw. Internally they are both stored as binary ksuids (or uuids) without the prefix, but when interacting with the outside world we implicitly add (or expect) the prefix.

The nice thing about this is that it makes errors much less likely, and it makes debugging extremely easy in many scenarios. A very common error is passing the wrong identifier to an HTTP API call when it's included in the path (e.g. /api/user/{user_id}).

Depending on the endpoint, it may be very easy to pass the wrong param type, for example passing a username instead of a user id. But because a user id has to start with the usr_ prefix, passing a username will immediately result in a useful error instead of a confusing 404 ("but I'm 100% sure this username exists!").

Internally, having wrapper types means that you can't accidentally pass an ApplicationId when an EndpointId is expected.

Consider the following pieces of code:

// It's very easy to confuse the parameters when calling functions if they are both just strings
do_something(app_id, endp_id);

// Owner should actually be an app id, not the endpoint id.
endpoint.owner = id;
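
With wrapper types, both of those mistakes stop compiling. The following is a rough sketch, not our actual implementation: the `parse` constructors, the `WrongPrefix` error, the `Endpoint` struct, and `attach_endpoint` are hypothetical names, and the IDs are kept as strings here rather than binary ksuids to keep the example short.

```rust
/// Hypothetical ID wrappers; the prefixes match the examples in the text.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ApplicationId(String);

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EndpointId(String);

/// Error returned when an ID does not carry the expected prefix.
#[derive(Debug, PartialEq, Eq)]
pub struct WrongPrefix {
    pub expected: &'static str,
}

/// Strip `prefix` from `raw`, or report which prefix was expected.
fn strip_id_prefix<'a>(raw: &'a str, prefix: &'static str) -> Result<&'a str, WrongPrefix> {
    raw.strip_prefix(prefix)
        .filter(|rest| !rest.is_empty())
        .ok_or(WrongPrefix { expected: prefix })
}

impl ApplicationId {
    pub fn parse(raw: &str) -> Result<Self, WrongPrefix> {
        // Store the ID without its prefix, as described above.
        strip_id_prefix(raw, "app_").map(|rest| ApplicationId(rest.to_owned()))
    }
}

impl EndpointId {
    pub fn parse(raw: &str) -> Result<Self, WrongPrefix> {
        strip_id_prefix(raw, "endp_").map(|rest| EndpointId(rest.to_owned()))
    }
}

pub struct Endpoint {
    pub id: EndpointId,
    pub owner: ApplicationId,
}

/// Swapping the two arguments at the call site is a compile error
/// (mismatched types), as is assigning an `EndpointId` to `owner`.
pub fn attach_endpoint(app_id: ApplicationId, endp_id: EndpointId) -> Endpoint {
    Endpoint { id: endp_id, owner: app_id }
}
```

Passing an endpoint id where an application id is expected fails early, with an error that names the expected prefix, instead of producing a confusing lookup miss later.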

I'd also like to give a quick shout-out to time-based IDs. I know some people don't like them because they leak object creation time. But if this is not a concern for you, having time-based IDs such as ksuid and uuidv7 is a major boon for debugging. Just the other day we had a customer who complained about data not being sent to a specific endpoint, and it was very easy to spot, just from looking at the ID, that the endpoint was created after the supposed issue. So of course nothing would have been sent to it.
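
To illustrate how a creation time can be read straight off an ID, here is a small sketch for UUIDv7, whose first 48 bits are a big-endian Unix timestamp in milliseconds. It uses UUIDv7 rather than ksuid only because the layout is simpler to decode by hand; the function name is made up for this example.

```rust
/// Extract the creation time (milliseconds since the Unix epoch) from the
/// textual form of a UUIDv7. Returns `None` if the input is not shaped like
/// a version-7 UUID.
pub fn uuid_v7_timestamp_ms(uuid: &str) -> Option<u64> {
    // Drop the dashes and make sure we are left with 32 hex digits.
    let hex: String = uuid.chars().filter(|c| *c != '-').collect();
    if hex.len() != 32 || !hex.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    // The 13th hex digit holds the UUID version; it must be 7.
    if hex.as_bytes()[12] != b'7' {
        return None;
    }
    // The first 48 bits (12 hex digits) are the millisecond timestamp.
    u64::from_str_radix(&hex[..12], 16).ok()
}
```

For the example UUIDv7 given in RFC 9562, `017F22E2-79B0-7CC3-98C4-DC0C0C07398F`, this returns `1645557742000`, i.e. a timestamp in February 2022.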

Parse, don't validate

One common misgiving that kept on coming up in the comments is that the whole idea of types is pointless if you're exposing a JSON API, as JSON is untyped so everything falls apart.

At Svix we follow the "parse, don't validate" idiom. We define the exact structures we expect in the API and we deserialize and validate the data into the structure in the same step. There are many libraries that support this depending on the programming language of your choice including serde for Rust, and pydantic for Python.

Using the wrapper types from the previous sections, our models would look like:

struct UserIn {
    id: UserId,
    group: GroupId,
    email: Email,
    ssn: SocialSecurityNumber,
    ...
}

Where all of the fields are validated before the structure is constructed so we can be certain they follow our constraints.
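
The idea can be sketched without any library. In the example below, `RawUserIn` stands in for the untrusted input after JSON decoding, and `UserIn::parse` is the only way to get a validated `UserIn`. All names other than `UserIn`, `UserId`, and `Email` are hypothetical, the validation rules are placeholders, and only two fields are shown. With serde you would typically hook the same checks into deserialization itself (for instance via its `try_from` container attribute) rather than write a separate raw struct.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct UserId(String);

#[derive(Debug, PartialEq, Eq)]
pub struct Email(String);

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    BadUserId,
    BadEmail,
}

impl UserId {
    pub fn parse(raw: &str) -> Result<Self, ParseError> {
        // Placeholder rule: the `usr_` prefix followed by something.
        match raw.strip_prefix("usr_") {
            Some(rest) if !rest.is_empty() => Ok(UserId(raw.to_owned())),
            _ => Err(ParseError::BadUserId),
        }
    }
}

impl Email {
    pub fn parse(raw: &str) -> Result<Self, ParseError> {
        // Placeholder rule: text on both sides of an '@'.
        match raw.split_once('@') {
            Some((local, domain)) if !local.is_empty() && !domain.is_empty() => {
                Ok(Email(raw.to_owned()))
            }
            _ => Err(ParseError::BadEmail),
        }
    }
}

/// Untrusted input as it arrives off the wire.
pub struct RawUserIn<'a> {
    pub id: &'a str,
    pub email: &'a str,
}

/// Validated model: holding one proves every field passed its check.
#[derive(Debug, PartialEq, Eq)]
pub struct UserIn {
    pub id: UserId,
    pub email: Email,
}

impl UserIn {
    /// Parsing and validation happen in the same step.
    pub fn parse(raw: RawUserIn<'_>) -> Result<Self, ParseError> {
        Ok(UserIn {
            id: UserId::parse(raw.id)?,
            email: Email::parse(raw.email)?,
        })
    }
}
```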

Another important benefit of this approach is that we know exactly which schemas our API expects, so we can easily generate an OpenAPI spec and user-facing docs directly from the code. This ensures they are always correct and match the underlying code.

Define required invariants in code

One nice thing about the above approach, where you validate on parsing/creation, is that you can specify your exact invariants and have them validated.

One commenter had this to say:

Consider the article's birthdayGreeting example. Author is happy that static typing catches the birthdayGreeting("John", "20") bug because "20" is not a number. But birthdayGreeting(" ", 123) is not caught (" " is not a name) and neither is birthdayGreeting("Anna", -12335). birthdayGreeting("Anna", 4.5), though, is caught, which arguably is wrong since 4.5 is an age.

Though as I told him, I believe that these examples illustrate the value of defining your exact invariants rather than the opposite. The issue with the name would be caught if we correctly validate names to not be empty. The age examples will be caught depending on how we decide to validate it. E.g. we can say that in our domain an age has to be a whole number and 0 <= age <= 150, while in the untyped example, we have no way to know whether 5,000 is a valid age, or whether we accept real numbers or not.
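
As a sketch, here is the commenter's example with those invariants written down. The `Name` and `Age` types, the error enum, and the greeting text are assumptions for illustration; the age range is the one suggested above.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct Name(String);

#[derive(Debug, PartialEq, Eq)]
pub struct Age(u8);

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum InvariantError {
    EmptyName,
    AgeOutOfRange,
}

impl Name {
    /// A name must contain something other than whitespace.
    pub fn new(raw: &str) -> Result<Self, InvariantError> {
        let trimmed = raw.trim();
        if trimmed.is_empty() {
            Err(InvariantError::EmptyName)
        } else {
            Ok(Name(trimmed.to_owned()))
        }
    }
}

impl Age {
    /// In this domain an age is a whole number with 0 <= age <= 150.
    /// The parameter is a wide signed integer so that negative or huge
    /// inputs are rejected here rather than by every caller.
    pub fn new(raw: i64) -> Result<Self, InvariantError> {
        if (0..=150).contains(&raw) {
            Ok(Age(raw as u8))
        } else {
            Err(InvariantError::AgeOutOfRange)
        }
    }
}

pub fn birthday_greeting(name: &Name, age: &Age) -> String {
    format!("Happy birthday {}, you are {} years old!", name.0, age.0)
}
```

With these definitions, `Name::new(" ")` and `Age::new(-12335)` return errors at the boundary, while `Age::new(4.5)` does not compile. If fractional ages were valid in your domain, you would state that in the type instead.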

Type system state machines

Let's assume we want to implement a simple state machine. This is very common when implementing codecs and protocols, but it also appears in many other scenarios. State machines are usually defined by a list of states, associated data, and allowed transitions between them.

Let's consider making a phone call as an example. There are three steps in the process: unlocking the screen, dialing the number, and hitting dial. So for that we will have four states: ScreenLocked, ScreenUnlocked, NumbersEntered, and Dialing (obviously there are more, but let's keep it simple).

We can encode that in the type system as follows:

struct ScreenLocked;
impl ScreenLocked {
    fn unlock(self) -> ScreenUnlocked { ... }
}

struct ScreenUnlocked;
impl ScreenUnlocked {
    fn lock(self) -> ScreenLocked { ... }
    fn enter_key(self, key: u32) -> NumbersEntered { ... }
}

struct NumbersEntered(Vec<u32>);
impl NumbersEntered {
    fn enter_key(self, key: u32) -> NumbersEntered { ... }
    fn dial(self) -> Dialing { ... }
}

struct Dialing;
impl Dialing {
}

None of the structures should be possible to instantiate outside of the transition functions. This means that in order to reach a specific state in the code, you have to have come from a previous valid state using a valid transition. So in essence the validity of all of the transitions and the state of the code is enforced at compile time and verified by the compiler.
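
One way to make the states impossible to instantiate elsewhere is to give each one a private field and keep them in their own module. The sketch below fills in the elided bodies under a few assumptions: the `phone` module, the `new_phone` entry point, and the choice to carry the dialed digits into `Dialing` are all additions for this example.

```rust
mod phone {
    // Each state has a private field, so code outside this module cannot
    // construct a state directly; it has to go through the transitions.
    pub struct ScreenLocked(());
    pub struct ScreenUnlocked(());
    pub struct NumbersEntered(Vec<u32>);
    pub struct Dialing(Vec<u32>);

    /// The only entry point: a phone starts out locked.
    pub fn new_phone() -> ScreenLocked {
        ScreenLocked(())
    }

    impl ScreenLocked {
        pub fn unlock(self) -> ScreenUnlocked {
            ScreenUnlocked(())
        }
    }

    impl ScreenUnlocked {
        pub fn lock(self) -> ScreenLocked {
            ScreenLocked(())
        }

        pub fn enter_key(self, key: u32) -> NumbersEntered {
            NumbersEntered(vec![key])
        }
    }

    impl NumbersEntered {
        pub fn enter_key(mut self, key: u32) -> NumbersEntered {
            self.0.push(key);
            self
        }

        pub fn dial(self) -> Dialing {
            Dialing(self.0)
        }
    }

    impl Dialing {
        /// The digits that were entered before dialing.
        pub fn number(&self) -> &[u32] {
            &self.0
        }
    }
}
```

Because every transition takes `self` by value, the old state is consumed and cannot be reused. A call such as `phone::new_phone().dial()` is rejected by the compiler, since `ScreenLocked` has no `dial` method.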

Limiting functionality based on context

We can also use the type system to limit the actions we allow to happen depending on the context and the capabilities.

For example, at Svix we have multiple databases: a global one for user access, a regional one in each region for the data, read replicas for both, and more. Among other things, we want to make sure that we attempt to read the correct model from the correct database, and that we don't attempt to write to the read replica.

We achieve this by tagging each model with the environment it exists in using Rust traits (e.g. global or regional), and then requiring this trait when querying the database. We use a query builder, which makes things a bit easier, but you can use it even without one by tagging the expected returned model (e.g. when using sqlx).

As for limiting writes to read replicas, we enforce it in two ways. First, we only allow non-modifying queries to be passed there (e.g. no DELETE or INSERT), again by tagging the queries. Second, we don't implement a commit() function on the read replica, so trying to commit changes triggers a compile error.
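
A stripped-down sketch of the tagging idea, with no real database involved. Every name here (`Global`, `Regional`, `Model`, `Primary`, `ReadReplica`, the table names) is hypothetical, and the methods return the query text only to keep the example self-contained.

```rust
use std::marker::PhantomData;

// Marker types for where a model lives.
pub struct Global;
pub struct Regional;

/// Every model declares the environment (database) it belongs to.
pub trait Model {
    type Env;
    const TABLE: &'static str;
}

pub struct User;
impl Model for User {
    type Env = Global;
    const TABLE: &'static str = "users";
}

pub struct Message;
impl Model for Message {
    type Env = Regional;
    const TABLE: &'static str = "messages";
}

/// A read/write connection to the database for environment `E`.
pub struct Primary<E>(PhantomData<E>);

/// A read-only connection to a replica of the database for environment `E`.
pub struct ReadReplica<E>(PhantomData<E>);

impl<E> Primary<E> {
    pub fn new() -> Self {
        Primary(PhantomData)
    }

    /// Only models tagged with this connection's environment are accepted.
    pub fn select<M: Model<Env = E>>(&self) -> String {
        format!("SELECT * FROM {}", M::TABLE)
    }

    pub fn insert<M: Model<Env = E>>(&self) -> String {
        format!("INSERT INTO {} ...", M::TABLE)
    }
}

impl<E> ReadReplica<E> {
    pub fn new() -> Self {
        ReadReplica(PhantomData)
    }

    pub fn select<M: Model<Env = E>>(&self) -> String {
        format!("SELECT * FROM {}", M::TABLE)
    }
    // No `insert` or `commit` here: writing through a replica does not compile.
}
```

With this in place, `Primary::<Global>::new().select::<Message>()` fails to compile because `Message` lives in the regional environment, and `ReadReplica::<Regional>::new().insert::<Message>()` fails because the method does not exist.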

Enforce typing in generic cache stores

Let's assume you're using some sort of a cache in your code. It could be in-memory, it could be in redis, it doesn't matter. What matters is that its interface accepts a key and a value when setting, and a key to get that value when getting.

One problem with generic cache backends is that they lose typing, as they should be able to accept all types of keys and all types of objects. I already covered it at length in the previous post, but in summary: we force the keys and the values to always be tied together, and we follow the same wrapper type principles from above.

So for example (in very verbose Rust, we have macros that make it dead simple):

// Defined in the cache backend
pub trait CacheKey {
    type Value: CacheValue;
}

pub trait CacheValue { }

// Example usage
pub struct PersonCacheKey(String);

impl CacheKey for PersonCacheKey {
    type Value = Person;
}

impl CacheValue for Person { }

Then the following cache_save function:

pub fn cache_save<T: CacheKey>(&self, key: &T, value: &T::Value, ttl: Duration) -> Result<()> { ... }

can enforce that the key type always matches the correct value type, and vice versa.
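
To show the key/value tie end to end, here is a toy in-memory version. It is only a sketch: the `Cache` struct, the `as_raw` method, the `Clone + 'static` bound on values, and the `person:` key format are assumptions, and the TTL is left out for brevity.

```rust
use std::any::Any;
use std::collections::HashMap;

pub trait CacheValue: Clone + 'static {}

pub trait CacheKey {
    type Value: CacheValue;

    /// The raw string key handed to the backend.
    fn as_raw(&self) -> String;
}

#[derive(Debug, Clone, PartialEq)]
pub struct Person {
    pub name: String,
}

impl CacheValue for Person {}

pub struct PersonCacheKey(pub String);

impl CacheKey for PersonCacheKey {
    type Value = Person;

    fn as_raw(&self) -> String {
        format!("person:{}", self.0)
    }
}

/// A toy backend. Values are stored type-erased, the way a generic cache
/// would store bytes, and the key's associated type restores the typing.
#[derive(Default)]
pub struct Cache {
    entries: HashMap<String, Box<dyn Any>>,
}

impl Cache {
    /// The value type is dictated by the key type.
    pub fn set<K: CacheKey>(&mut self, key: &K, value: &K::Value) {
        self.entries.insert(key.as_raw(), Box::new(value.clone()));
    }

    /// Callers get back `K::Value` without casting anything themselves.
    pub fn get<K: CacheKey>(&self, key: &K) -> Option<K::Value> {
        self.entries
            .get(&key.as_raw())
            .and_then(|boxed| boxed.downcast_ref::<K::Value>())
            .cloned()
    }
}
```

Trying to store, say, a `String` under a `PersonCacheKey` is a type error at the call site, and reading with a `PersonCacheKey` can only ever yield a `Person`.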

Catching missed enum variants

Consider for example the following enum describing the list of sports we support:

enum Sport {
    Basketball,
    Tennis,
    Volleyball,
}

We then have match statements throughout the code that act on the value:

match sport {
    Sport::Basketball => { ... },
    Sport::Tennis => { ... },
    Sport::Volleyball => { ... },
}

We now decide to add support for swimming, and all of a sudden we have to figure out all the places in the codebase where we match on the enum. This is essentially an impossible task without types (yes, you can grep Basketball, but it'll be a pain if you have other places that use Basketball, which you probably do), while with static typing every match that doesn't handle the new variant automatically becomes a compile error.

Yes, it's also an error if you don't have static types, but catching it ahead of time requires a very high test coverage and even then, it could be that a function is not yet tested for e.g. swimming (or a function is just being added in parallel by someone else) so it won't get caught.
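
A small sketch of what this looks like after adding the new variant. The `venue` function is a made-up example of code that acts on the enum.

```rust
#[derive(Debug, Clone, Copy)]
pub enum Sport {
    Basketball,
    Tennis,
    Volleyball,
    Swimming, // newly added variant
}

/// Hypothetical function that acts on the enum value.
pub fn venue(sport: Sport) -> &'static str {
    // There is deliberately no wildcard `_` arm. If the `Swimming` arm
    // below were missing, rustc would reject this function with error
    // E0004 (non-exhaustive patterns) instead of failing at runtime.
    match sport {
        Sport::Basketball => "court",
        Sport::Tennis => "court",
        Sport::Volleyball => "court",
        Sport::Swimming => "pool",
    }
}
```

The check only helps if matches avoid a catch-all `_` arm; a wildcard silently absorbs any variant added later.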

Types lead to a better development experience

This too I already covered at length in the previous post, but here is the short version copied over verbatim.

Typing can also be used by IDEs and other development tools to vastly improve the development experience. You get notified as you code if any of your expectations are wrong. This significantly reduces cognitive load. You no longer need to remember the types of all the variables and functions in the context. The compiler will be there with you and tell you when something is wrong.

This also leads to a very nice additional benefit: easier refactoring. You can trust the compiler to let you know whether a change you make (e.g. the change in our example above) will break assumptions made elsewhere in the code or not.

Types also make it much easier to onboard new engineers to a codebase or library:

They can follow the type definitions to understand where things are used.
It's much easier to tinker with things, as breaking changes will trigger a compile error.

Closing words

These are some of the ways we use the type system at Svix. I'm always curious to learn how others are using the type system to their benefit. Please let me know if you have any cool typing patterns or tricks you would like to share.

For more content like this, make sure to follow us on Twitter, GitHub or RSS for the latest updates for the Svix webhook service, or join the discussion on our community Slack.