Guide to Commonly Used Crates

Cheburashka

Cheburashka is the observability framework used by our other crates. It provides standardized logging, metrics, labeling and (in the future) tracing capabilities, mostly by re-exporting third-party crates. See the Cheburashka README and the Rust docs for more information.

Matroskin

Matroskin is the Rust gRPC framework which we use to generate Rust code from protobuf specifications. Internally it mostly re-uses tonic and prost, with some additional customizations specific to our use cases. See the README in matroskin/ for more details.

Yozhik

Yozhik provides a high-level Rust abstraction over our Windows and Linux change tracking drivers. In both cases, these drivers allow us to take point-in-time snapshots of block devices and quickly determine which blocks on the device have changed since our last snapshot. This is fundamental to our ability to back up systems in cases where agentless snapshots are not possible.
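
As a rough illustration of the change tracking idea (not Yozhik's actual API), the sketch below models a block device in memory. TrackedDevice, Snapshot and their methods are hypothetical names invented for this example, and the real drivers do this work in the kernel rather than in a Vec. Each write marks its block as dirty, and taking a snapshot returns the dirty set and resets it.

```rust
use std::collections::BTreeSet;

/// Hypothetical in-memory model of a block device with change tracking.
pub struct TrackedDevice {
    /// Tiny 4-byte "blocks" keep the example small.
    blocks: Vec<[u8; 4]>,
    /// Indices of blocks written since the last snapshot.
    dirty: BTreeSet<usize>,
}

/// Point-in-time copy of the device plus the blocks changed since the previous snapshot.
pub struct Snapshot {
    pub blocks: Vec<[u8; 4]>,
    pub changed: Vec<usize>,
}

impl TrackedDevice {
    pub fn new(block_count: usize) -> Self {
        Self {
            blocks: vec![[0u8; 4]; block_count],
            dirty: BTreeSet::new(),
        }
    }

    /// Every write records the block index, so no full scan is needed later.
    pub fn write(&mut self, index: usize, data: [u8; 4]) {
        self.blocks[index] = data;
        self.dirty.insert(index);
    }

    /// Takes a snapshot and resets the dirty set for the next interval.
    pub fn snapshot(&mut self) -> Snapshot {
        let changed = std::mem::take(&mut self.dirty).into_iter().collect();
        Snapshot {
            blocks: self.blocks.clone(),
            changed,
        }
    }
}
```

With this model, a backup after the first snapshot only needs to read the blocks listed in changed instead of the whole device.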

Elastio-agentless

Elastio-agentless is the equivalent of Yozhik for systems that provide APIs to take snapshots of disks and read changed blocks from them without running any code on the system being snapshotted. This includes Amazon EBS volumes, Azure Page Blobs, and VMware ESXi disks.

Binary-Ids

Binary-ids is a library that provides macros and interfaces for the creation of ID types that consist of opaque fixed-length byte sequences. This was originally conceived as a way to work with SHA hashes more conveniently, but now also supports ULIDs.

The Rust docs provide much more information about how this works.
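
To give a feel for the pattern, here is a minimal sketch of a macro that declares ID types backed by fixed-length byte arrays. It is not the Binary-ids API: define_binary_id, Sha256Id, UlidId and the method names are all assumptions made up for this example, so check the Rust docs for the real macros and traits.

```rust
/// Hypothetical macro: declares a newtype wrapping a fixed-length byte array.
macro_rules! define_binary_id {
    ($name:ident, $len:expr) => {
        #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
        pub struct $name([u8; $len]);

        impl $name {
            /// Length of the ID in bytes.
            pub const LEN: usize = $len;

            pub fn from_bytes(bytes: [u8; $len]) -> Self {
                Self(bytes)
            }

            /// Returns None when the slice is not exactly LEN bytes long.
            pub fn from_slice(bytes: &[u8]) -> Option<Self> {
                <[u8; $len] as ::std::convert::TryFrom<&[u8]>>::try_from(bytes)
                    .ok()
                    .map(Self)
            }

            pub fn as_bytes(&self) -> &[u8; $len] {
                &self.0
            }
        }

        /// Renders the ID as lowercase hex.
        impl ::std::fmt::Display for $name {
            fn fmt(&self, f: &mut ::std::fmt::Formatter<'_>) -> ::std::fmt::Result {
                for byte in &self.0 {
                    write!(f, "{:02x}", byte)?;
                }
                Ok(())
            }
        }
    };
}

// Two distinct ID types: the compiler will not let one be passed where the other is expected.
define_binary_id!(Sha256Id, 32);
define_binary_id!(UlidId, 16);
```

The point of the newtype is that a 32-byte hash and a 16-byte ULID become different types with a checked length, instead of interchangeable byte vectors.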

Kolobok

Kolobok is the crate that implements file and block device ingest. That is, it takes input from either Yozhik or Elastio-agentless, does a lot of computation, and in the end ingests the data on the source files/disks into Scalez-Stor.

See the Rust docs for more.
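
One way to picture "input from either Yozhik or Elastio-agentless" is a shared trait that both kinds of source could implement, with the ingest code written against the trait only. The sketch below is an assumption about the shape of such a design, not Kolobok's real interface: ChangedBlockSource, FakeSource and ingest are hypothetical names, and a BTreeMap stands in for Scalez-Stor.

```rust
use std::collections::BTreeMap;

/// Hypothetical trait that a driver-based or an agentless source could both implement.
pub trait ChangedBlockSource {
    /// Indices of the blocks that changed since the previous snapshot.
    fn changed_blocks(&self) -> Vec<u64>;
    /// Contents of one block as of the current snapshot.
    fn read_block(&self, index: u64) -> Vec<u8>;
}

/// Stand-in source used only for this example.
pub struct FakeSource {
    pub changed: BTreeMap<u64, Vec<u8>>,
}

impl ChangedBlockSource for FakeSource {
    fn changed_blocks(&self) -> Vec<u64> {
        self.changed.keys().copied().collect()
    }

    fn read_block(&self, index: u64) -> Vec<u8> {
        self.changed.get(&index).cloned().unwrap_or_default()
    }
}

/// Copies every changed block into the store and returns the number of bytes ingested.
/// The store is a plain map here; the real target is Scalez-Stor.
pub fn ingest(source: &dyn ChangedBlockSource, store: &mut BTreeMap<u64, Vec<u8>>) -> usize {
    let mut bytes = 0;
    for index in source.changed_blocks() {
        let data = source.read_block(index);
        bytes += data.len();
        store.insert(index, data);
    }
    bytes
}
```

Because ingest only sees the trait, it does not need to know whether the blocks came from a kernel driver or from a cloud snapshot API.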

Elastio-rocks

We use RocksDB internally for metadata storage in ScaleZ. Elastio forked the existing rust-rocksdb crate and modified it slightly to make some internal fields public. On top of that fork we build elastio-rocks, which adds a lot of functionality missing from the original rust-rocksdb: an idiomatic Rust interface, metrics, logging, async I/O, etc. It's extensively documented in the Rust code.

Scalez-KV

Scalez-KV is a Rust-native key/value store abstraction layered on top of Elastio-rocks. It lets service developers define a storage layer in terms of tables, where each table is a Rust struct. It also defines traits that allow arbitrary Rust types to be used as keys or values, with high-performance serialization.
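
The "table as a Rust struct" idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the Scalez-KV API: the Codec and Table traits, the VolumeLabels table and MemStore are hypothetical, and an in-memory BTreeMap of byte vectors stands in for Elastio-rocks.

```rust
use std::collections::BTreeMap;
use std::convert::TryFrom;

/// Hypothetical trait: a type that can be encoded to and decoded from raw bytes.
pub trait Codec: Sized {
    fn encode(&self) -> Vec<u8>;
    fn decode(bytes: &[u8]) -> Option<Self>;
}

impl Codec for u64 {
    // Big-endian keeps byte-wise key ordering consistent with numeric ordering.
    fn encode(&self) -> Vec<u8> {
        self.to_be_bytes().to_vec()
    }

    fn decode(bytes: &[u8]) -> Option<Self> {
        let arr = <[u8; 8]>::try_from(bytes).ok()?;
        Some(u64::from_be_bytes(arr))
    }
}

impl Codec for String {
    fn encode(&self) -> Vec<u8> {
        self.as_bytes().to_vec()
    }

    fn decode(bytes: &[u8]) -> Option<Self> {
        String::from_utf8(bytes.to_vec()).ok()
    }
}

/// Hypothetical trait: a table is a marker struct that names its key and value types.
pub trait Table {
    const NAME: &'static str;
    type Key: Codec;
    type Value: Codec;
}

/// Example table mapping a numeric volume ID to a label.
pub struct VolumeLabels;

impl Table for VolumeLabels {
    const NAME: &'static str = "volume_labels";
    type Key = u64;
    type Value = String;
}

/// In-memory stand-in for the byte-oriented store underneath.
#[derive(Default)]
pub struct MemStore {
    data: BTreeMap<(&'static str, Vec<u8>), Vec<u8>>,
}

impl MemStore {
    /// The table type parameter fixes which key and value types are accepted.
    pub fn put<T: Table>(&mut self, key: &T::Key, value: &T::Value) {
        self.data.insert((T::NAME, key.encode()), value.encode());
    }

    pub fn get<T: Table>(&self, key: &T::Key) -> Option<T::Value> {
        let bytes = self.data.get(&(T::NAME, key.encode()))?;
        T::Value::decode(bytes)
    }
}
```

A call such as store.put::<VolumeLabels>(&7, &"root".to_string()) is checked at compile time, so passing a key or value of the wrong type for the table does not build.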

Scalez-Stor

Scalez-Stor is the storage service and API responsible for storing all backup system metadata in RocksDB. It's built on top of Scalez-KV and provides an even higher-level interface, as well as a gRPC API which other components of the system use to talk to it.

Scalez-Stor-cli

The executable is called s0. This is primarily an internal testing tool that lets us run ingest benchmarks and start the scalez-stor server whenever we need it.

Elastio-cli

The official command line interface to all Elastio code. The executable is called elastio on Linux, and elastio.exe on Windows. With this we can invoke yozhik, elastio-agentless, scalez-stor, and more.

Xtask

This is a dev CLI that is used only during development and is not intended for production distribution. You can invoke it via the cargo alias. Use it to run codegen, install the git pre-commit hook for code formatting and generation, etc. Run cargo xtask to see documentation on the available scripts.