Versioning and Publishing Crates

When Elastio started we did almost all of our work in a single monorepo called elastio. As of this writing in mid-2021 the bulk of our work still happens in the monorepo, but we are gradually moving away from this as parts of our codebase mature enough to justify being moved into their own repos.

As soon as we started doing this we had to address the problem of dependencies between crates in different repos. Everything in one repo is easy; you just reference the other crates by their path. Once the repos split you need to start to version crates and publish them to a cargo registry where the code in other repos can access them.

While we plan to eventually publish most of our data plane and red stack code on crates.io, we're not there yet, so we're using a private Cargo registry hosted on Cloudsmith (it actually supports dozens of technologies including npm, Docker, and Debian packages, but we're focused here on cargo). We have a Cloudsmith org elastio, and a single private registry named private where all of our private artifacts (including non-cargo artifacts) live. When you went through onboarding you should have received access to this system; you can access it at https://cloudsmith.io/~elastio/repos/private/packages/.

There are many approaches to Rust crate management, but we've settled on a very opinionated, Elastio-specific approach which works for us. This is mostly automated in the elastio/xtask-core repo, which contains both a library crate xtask-core for integration into other repo-specific automations, and a binary cargo-xelastio which can be installed with cargo install and invoked like cargo xelastio ... through the magic of cargo's extension mechanism. cargo-xelastio has a subcommand publish which implements our publication logic. You won't ever invoke this manually because it's built in to our CI workflows. However, you need to know how it works and how to use it to make a release.

Semver Principles

We follow semver when versioning our crates. You probably know at least a little bit about semver, but there are some less-known concepts that we also rely on which you need to understand.

In short, versions are major.minor.patch, typically starting at 0.1.0 for a new crate. Fixing a bug means incrementing the patch number, eg 1.2.3 => 1.2.4. Adding a new capability to the API without making any breaking changes to the existing public interface means incrementing the minor number, eg 1.2.3 => 1.3.0. Note that some kinds of change can be breaking in ways that are not obvious:

  • items moving from pub to non-pub and vice-versa;
  • items changing their kind, e.g. from a struct to an enum;
  • additions and removals of region parameters to and from an item's declaration;
  • additions and removals of (possibly defaulted) type parameters to and from an item's declaration;
  • changes to the variance of type and region parameters;
  • additions and removals of enum variants (although additions can be non-breaking changes when tagged as non_exhaustive);
  • additions and removals of enum variant- or struct fields;
  • changes from tuple structs or variants to struct variants and vice-versa;
  • changes to a function or method's constness;
  • additions and removals of a self-parameter on methods;
  • additions and removals of (possibly defaulted) trait items;
  • correct handling of "sealed" traits;
  • changes to the unsafety of a trait;
  • type changes of all toplevel items, as well as associated items in inherent impls and trait definitions;
  • additions and removals of inherent impls or methods contained therein;
  • additions and removals of trait impls.
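To make one of the items above concrete, here is a small sketch of the enum variant case. The enum StorageKind and the function describe are hypothetical names invented for illustration; they are not taken from any of our crates. The point is only to show why a non_exhaustive enum can gain a variant without breaking downstream code: crates outside the defining crate are forced to write a wildcard arm up front.

```rust
/// Hypothetical public enum in a library crate (illustration only).
/// `#[non_exhaustive]` tells other crates that more variants may be added
/// later, so they must include a wildcard arm when matching on it.
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StorageKind {
    Local,
    S3,
}

/// The shape a downstream match has to take. Inside the defining crate the
/// wildcard arm is optional (hence the `allow`); from another crate it is
/// mandatory, which is what keeps that crate compiling when a variant is added.
#[allow(unreachable_patterns)]
pub fn describe(kind: StorageKind) -> &'static str {
    match kind {
        StorageKind::Local => "local disk",
        StorageKind::S3 => "object storage",
        _ => "unknown storage kind",
    }
}
```

Without the attribute, a downstream crate could match on Local and S3 only, and adding a third variant would then stop that crate from compiling.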

Breaking changes must always increment the major number, eg 1.2.3 => 2.0.0, even if the change itself doesn't feel "major", like renaming a structure or adding an argument to a public method. It's important to adhere to these rules because cargo assumes you do; if you ask for version 1.2.3 of a crate, but version 1.2.10 is available, cargo will assume it should use 1.2.10 unless you have explicitly told it not to. If 1.2.10 is a breaking change, then your downstream deps will break suddenly without their authors understanding why, and they will invent creative nicknames for you involving anatomically improbable contortions and/or livestock.
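The rule cargo applies to a plain version requirement like "1.2.3" can be sketched as follows. This is a simplified illustration, not cargo's actual code: the names Version and caret_matches are made up here, and pre-release and build metadata are ignored. It also shows the special handling of 0.x versions, where cargo treats a minor bump as breaking (and for 0.0.x, every bump).

```rust
/// A bare (major, minor, patch) triple; pre-release and build metadata are
/// deliberately left out of this sketch.
type Version = (u64, u64, u64);

/// Returns true if `candidate` would satisfy a default requirement on `req`,
/// i.e. what writing `dep = "1.2.3"` in Cargo.toml asks for.
fn caret_matches(req: Version, candidate: Version) -> bool {
    // Never go below the version that was asked for.
    // Tuples compare field by field, so this is a proper version comparison.
    if candidate < req {
        return false;
    }
    match req {
        // 0.0.z: only that exact version is considered compatible.
        (0, 0, _) => candidate == req,
        // 0.y.z: the minor number acts as the breaking-change marker.
        (0, minor, _) => candidate.0 == 0 && candidate.1 == minor,
        // x.y.z with x > 0: anything with the same major number is compatible.
        (major, _, _) => candidate.0 == major,
    }
}
```

So a request for 1.2.3 accepts 1.2.10 and 1.3.0 but not 2.0.0, which is exactly why a breaking change hidden in a patch or minor release hurts downstream users.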

Semver versions can also contain two other optional components: a pre-release version and build metadata. The pre-release version follows a - and looks like 1.2.3-alpha or 1.2.3-beta.3 or 1.2.3-foo.bar. No assumption is made about the meaning or structure of the pre-release version, however it is assumed that a given major.minor.patch version without any pre-release version should take priority over the same version with a pre-release component. Two pre-release components on the same version are compared one dot-separated identifier at a time: purely numeric identifiers compare as numbers and rank below alphanumeric ones, everything else compares in ASCII sort order, and if all the shared identifiers are equal the one with more identifiers wins. The one that sorts last takes priority.

For example: 1.2.3 will be chosen over 1.2.3-alpha, while 1.2.3-bravo will take priority over 1.2.3-alpha, and 1.2.3-zulu will win over 1.2.3-oscar.whiskey.3.
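The ordering behind those examples can be sketched in a few lines. This is an illustrative implementation of the semver precedence rules for the pre-release component only; the function name cmp_pre_release is made up, and an empty string is used here to mean "no pre-release component".

```rust
use std::cmp::Ordering;

/// Compare two pre-release strings the way semver precedence does.
/// An empty string stands for "no pre-release component" (an assumption of
/// this sketch, not a semver convention).
fn cmp_pre_release(a: &str, b: &str) -> Ordering {
    match (a.is_empty(), b.is_empty()) {
        (true, true) => return Ordering::Equal,
        // A version without a pre-release outranks the same version with one.
        (true, false) => return Ordering::Greater,
        (false, true) => return Ordering::Less,
        (false, false) => {}
    }
    let mut left = a.split('.');
    let mut right = b.split('.');
    loop {
        match (left.next(), right.next()) {
            (None, None) => return Ordering::Equal,
            // All shared identifiers were equal: fewer identifiers ranks lower.
            (None, Some(_)) => return Ordering::Less,
            (Some(_), None) => return Ordering::Greater,
            (Some(x), Some(y)) => {
                let ord = match (x.parse::<u64>(), y.parse::<u64>()) {
                    // Two numeric identifiers compare as numbers.
                    (Ok(m), Ok(n)) => m.cmp(&n),
                    // A numeric identifier ranks below an alphanumeric one.
                    (Ok(_), Err(_)) => Ordering::Less,
                    (Err(_), Ok(_)) => Ordering::Greater,
                    // Otherwise compare as text in ASCII order.
                    (Err(_), Err(_)) => x.cmp(y),
                };
                if ord != Ordering::Equal {
                    return ord;
                }
            }
        }
    }
}
```

Note that zulu beats oscar.whiskey.3 on the very first identifier, so the extra identifiers never come into play, and that beta.11 outranks beta.2 because numeric identifiers are compared as numbers rather than as text.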

It's also important to note that, within pre-release versions, no assumptions are made about stability. In other words, if from 1.2.3 to 1.2.4 you make a breaking change to the public interface, a plague of locusts will be visited upon your house. On the other hand, if you make two releases of 1.2.3-dev and the second release is in absolutely no way compatible with the first one, no one can complain because you have made it explicit with -dev that this is a pre-release version and not subject to the same stability guarantees.

Build metadata is a bit different in that it never influences version resolution. Two versions with the same major, minor, patch, and pre-release but with different build metadata are considered identical. Cargo even has a bug (#7180) whereby if there are two such crates on a registry, differing only by their build metadata, cargo falls flat on its face. Build metadata follows a + and takes the same form as the pre-release version.
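In code terms, "never influences version resolution" amounts to dropping everything from the + onward before comparing. A minimal sketch, with made-up helper names and made-up hash and date values in the usage note:

```rust
/// Returns the part of a version string that takes part in version
/// resolution, i.e. everything before the first `+`.
fn without_build_metadata(version: &str) -> &str {
    // `split` always yields at least one piece, so the fallback is never hit.
    version.split('+').next().unwrap_or(version)
}

/// Two versions are the same for resolution purposes if they differ only in
/// their build metadata.
fn same_for_resolution(a: &str, b: &str) -> bool {
    without_build_metadata(a) == without_build_metadata(b)
}
```

Under this sketch, 1.2.4-dev+0a1b2c3.2021-06-15 and 1.2.4-dev+9f8e7d6.2021-06-16 are the same version, while 1.2.4-dev and 1.2.4 are not.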

Versioning in Workspaces

Within a workspace (meaning within a Git repo in our case), all crates always have the same version. This is a really important simplifying assumption. For an example of how complicated things get when this assumption doesn't hold, look at the Tokio tracing project. It's a nightmare and not one we wish to live with.

So this means that if you have a workspace with awesome-core, awesome-lib and awesome-cli, and you make a small bug fix to awesome-cli, you will be releasing a new patch version of all three of those crates, even though only awesome-cli changed. In practice this isn't a big deal; actual published crates are very lightweight (tens of KB typically) and cargo generally is smart enough to automatically pick up the latest patch release of crates unless you deliberately pin your dependency to a specific version.

There's a command cargo xelastio version which will determine the version of all crates in the workspace and print it to stdout, or fail with an error if not all crates have the same version.
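The check that command performs can be pictured roughly like this. To be clear, this is not the code from xtask-core; it is a sketch of the idea with a hypothetical function workspace_version that takes (crate name, version) pairs which something else has already read out of each Cargo.toml.

```rust
/// Given (crate name, version) pairs for every crate in a workspace, return
/// the single shared version, or an error naming the first crate that
/// disagrees. Hypothetical helper for illustration only.
fn workspace_version(crates: &[(&str, &str)]) -> Result<String, String> {
    let (first_name, first_version) = match crates.first() {
        Some(entry) => *entry,
        None => return Err("workspace contains no crates".to_string()),
    };
    for &(name, version) in crates {
        if version != first_version {
            return Err(format!(
                "{} is at {} but {} is at {}",
                name, version, first_name, first_version
            ));
        }
    }
    Ok(first_version.to_string())
}
```

With awesome-core, awesome-lib and awesome-cli all at 1.2.4-dev this returns Ok("1.2.4-dev"); if any one of them drifts, it returns an error instead.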

Versioning and Publishing in master

The code in master is never assumed to be suitable for release. The master version of the crates always has a pre-release version dev. Every successful build on master is published to Cloudsmith with this pre-release modifier and also with build metadata containing the git commit hash and the commit date, in the form $hash.YYYY-MM-DD. The full version therefore looks like 1.2.4-dev+$hash.YYYY-MM-DD.
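As a sketch of what such a version string looks like when assembled, assuming a base version, a commit hash and a commit date are already known. The function names are hypothetical, the hash length is an assumption, and the real logic lives in xtask-core. The second function checks the one constraint semver puts on build metadata: dot-separated identifiers made of ASCII letters, digits and hyphens.

```rust
/// Assemble a master version string of the form
/// `<major.minor.patch>-dev+<hash>.<YYYY-MM-DD>`.
fn master_version(base: &str, commit_hash: &str, commit_date: &str) -> String {
    format!("{}-dev+{}.{}", base, commit_hash, commit_date)
}

/// Semver build metadata is a dot-separated list of non-empty identifiers,
/// each made up only of ASCII alphanumerics and hyphens.
fn is_valid_build_metadata(metadata: &str) -> bool {
    metadata.split('.').all(|id| {
        !id.is_empty() && id.chars().all(|c| c.is_ascii_alphanumeric() || c == '-')
    })
}
```

For example, a base of 1.2.4 with the made-up hash 0a1b2c3 and date 2021-06-15 gives 1.2.4-dev+0a1b2c3.2021-06-15. The hyphens in the date are fine inside build metadata, whereas a date written with slashes would not be.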

Note that due to the aforementioned cargo bug (#7180), we only keep the most recent version of master with a given major.minor.patch version published on Cloudsmith. At publish time the previous versions are automatically deleted. In practice this doesn't matter anyway because build metadata is never used to resolve dependencies, so even if you tried to depend on some older version with different build metadata, cargo would ignore that metadata when it resolves the dependency.

So, you might ask, why bother embedding the build metadata at all? Because it's nonetheless exposed as the crate version, which means we can use the native Cargo crate version as a log or telemetry label or for other diagnostic purposes and it always includes information about the date and commit that code came from. This is helpful information and costs us nothing to maintain since it's built in to the version structure used by cargo.
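Here is a rough sketch of how a crate might turn its own version into diagnostic labels. Cargo sets the CARGO_PKG_VERSION environment variable at compile time; the usual way to read it is env!("CARGO_PKG_VERSION"), and option_env! is used below only so the snippet also compiles outside of cargo. The VersionLabel struct and parse_label function are hypothetical, and the parsing assumes the $hash.YYYY-MM-DD metadata layout described above.

```rust
/// The crate's own version as cargo saw it at compile time, if any.
const CRATE_VERSION: Option<&str> = option_env!("CARGO_PKG_VERSION");

/// The pieces of a version that are useful as log or telemetry labels.
#[derive(Debug, PartialEq)]
struct VersionLabel<'a> {
    /// major.minor.patch plus any pre-release, e.g. "1.2.4-dev"
    release: &'a str,
    /// commit hash from the build metadata, if present
    commit: Option<&'a str>,
    /// commit date from the build metadata, if present
    date: Option<&'a str>,
}

/// Split a version string into its release part and the hash/date carried in
/// the build metadata. Assumes metadata of the form `<hash>.<YYYY-MM-DD>`.
fn parse_label(version: &str) -> VersionLabel<'_> {
    match version.split_once('+') {
        Some((release, metadata)) => match metadata.split_once('.') {
            Some((commit, date)) => VersionLabel {
                release,
                commit: Some(commit),
                date: Some(date),
            },
            // Metadata without a dot: treat the whole thing as the commit.
            None => VersionLabel { release, commit: Some(metadata), date: None },
        },
        None => VersionLabel { release: version, commit: None, date: None },
    }
}

/// Labels for the running crate, or None when not built by cargo.
fn current_label() -> Option<VersionLabel<'static>> {
    CRATE_VERSION.map(parse_label)
}
```

A proper release such as 1.2.3 simply has no commit or date label, while a master build carries both.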

Versioning and Publishing of Releases

While it's possible for you to write code that depends on a -dev version of some other crate, you should never make any assumptions about the stability of a -dev version. We publish dev versions to Cloudsmith because sometimes it's useful to be able to do experimental work on two crates in two repos at the same time and still be able to refer to the crate in the other repo, but you should never release a crate that depends on a -dev version of another crate (not to be confused with cargo's dev-dependencies, which are a different thing) because it's almost guaranteed to end badly.

Thus, after you've made some changes and landed them to master and it's time for other crates in the ecosystem to be able to use this new version, you need to make a proper release. You do this by manually running a GitHub Actions workflow, typically called release. Whether you start it from the GitHub UI or with the gh CLI, you will be asked to provide a value for a parameter bumpLevel. This is a string, either major, minor, or patch, and tells the release process which component of the version should be incremented. Think carefully about this, and bear in mind the Semver rules.

Whichever bump level you pick, the following will happen:

  • The current master is taken as the starting point
  • If there is a prerelease version like dev it's removed
  • If the bumpLevel is patch and there was a prerelease version, the numbers are left unchanged (in effect it goes from 1.2.3-dev to 1.2.3); if there was no prerelease version, the patch number is incremented
  • If the bumpLevel is something else, either the minor or major version is incremented in accordance with Semver
  • All crates are packaged and published to Cloudsmith
  • The changes with the new versions are committed to master with a comment about preparing for release
  • A tag vX.Y.Z is created, where X.Y.Z is the released version (never with prerelease version or build metadata)
  • A new dev version is made by incrementing the patch level of the released version and adding a dev prerelease. Eg, if we just released 1.2.3, master is modified so all crates now have a version 1.2.4-dev.
  • That change is also committed to master with a comment about preparing the next dev release
  • All changes to git are pushed to GitHub
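The version arithmetic in the steps above can be sketched as follows. This is not the xtask-core implementation; the names BumpLevel, release_version and next_dev_version are made up for illustration, and resetting the lower components to zero on a minor or major bump is our reading of "in accordance with Semver".

```rust
/// Mirrors the bumpLevel workflow parameter.
#[derive(Clone, Copy)]
enum BumpLevel {
    Major,
    Minor,
    Patch,
}

/// Parse "x.y.z", optionally followed by "-pre" and/or "+build", into
/// (major, minor, patch, has_prerelease). Returns None on malformed input.
fn parse(version: &str) -> Option<(u64, u64, u64, bool)> {
    // Build metadata plays no part in any of this, so drop it first.
    let without_build = version.split('+').next()?;
    let (core, has_pre) = match without_build.split_once('-') {
        Some((core, _)) => (core, true),
        None => (without_build, false),
    };
    let mut parts = core.split('.').map(|p| p.parse::<u64>().ok());
    let major = parts.next()??;
    let minor = parts.next()??;
    let patch = parts.next()??;
    if parts.next().is_some() {
        return None; // more than three numeric components
    }
    Some((major, minor, patch, has_pre))
}

/// The version that gets released from the current master version.
fn release_version(current: &str, level: BumpLevel) -> Option<String> {
    let (major, minor, patch, has_pre) = parse(current)?;
    let (major, minor, patch) = match level {
        // 1.2.3-dev => 1.2.3: dropping the prerelease is the whole bump.
        BumpLevel::Patch if has_pre => (major, minor, patch),
        BumpLevel::Patch => (major, minor, patch + 1),
        // Assumption: lower components reset to zero.
        BumpLevel::Minor => (major, minor + 1, 0),
        BumpLevel::Major => (major + 1, 0, 0),
    };
    Some(format!("{}.{}.{}", major, minor, patch))
}

/// The dev version master moves to right after a release.
fn next_dev_version(released: &str) -> Option<String> {
    let (major, minor, patch, _) = parse(released)?;
    Some(format!("{}.{}.{}-dev", major, minor, patch + 1))
}
```

Starting from a master at 1.2.3-dev, a patch release publishes 1.2.3 and leaves master at 1.2.4-dev; a minor release from the same starting point would publish 1.3.0 and leave master at 1.3.1-dev.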

Deviations from the Standard

This section describes the standard process which is implemented in xtask-core and cargo-xelastio. Not all repos will use this exact process; if you're using a repo that has a different publishing approach you probably know about it already. If you're not sure, ask your team lead. If you are the team lead, ask @anelson. If you are @anelson then God help us!