A tiny MVCC

I have always been fascinated by concurrent programming. It poses problems similar to those in multi-node distributed systems: ordering of operations, synchronization, and so on.

The best way to learn it is to design a concurrent library yourself. I had a break from day-to-day work and decided to give myself more work by inflicting a problem on myself: can I design a data structure with wait-free reads? A good opportunity to learn the area. Such a data structure has many applications, I think - the write-rarely, read-heavily pattern is quite common in hot paths.

Versions to the rescue

This is not new. MVCC is used in databases, but for some reason it usually stays scoped to databases. The approach itself is solid, so why not get "inspired" by it? Here's the thing: we could keep multiple versions of the data in memory and let readers always access the latest version. Every write creates a new version, and after a successful write, readers gradually transition to reading that new version. As a purely conceptual model, it should work.

The memory layout could look like an object pool, where each slot is a distinct version of the data. Good enough. The tricky bit is actually organizing access to that data within the constraints given by the CPU.
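
Conceptually, something like the sketch below - purely illustrative, not the actual layout in the library; the names and the fixed slot count are made up for the example.

```rust
// An illustrative layout only; names and the fixed slot count are made up.
struct VersionPool<T, const N: usize> {
    slots: [T; N],   // each slot holds one distinct version of the data
    current: usize,  // which version readers should currently see
}

fn main() {
    // Two versions of a value: readers look at slot 0; the next write would
    // fill slot 1 and then flip `current` over to it.
    let pool = VersionPool { slots: [10u32, 0], current: 0 };
    println!("current value: {}", pool.slots[pool.current]);
}
```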

Balancing atomics

The fastest we can get is atomics. To keep things safe, we have to stick to one or a couple of them. Constraints specified. What's next? With multiple versions, we need to track the current version somewhere. That would be atomic number 1. Then we need to somehow ensure that readers have finished reading a given version, so it can be safely overwritten - which means we first need to track read access to the data. That could be atomic number 2, but with bit-packing we can fit both the version offset in the object pool and the read counter into one. Why not? There is no strong reason it should not work. So far so good.
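
As a rough illustration of the bit-packing idea - the layout, bit widths and names here are made up for the example, not taken from vlock - a single 64-bit word can hold the slot index in its top bits and a "reads started" counter in the rest:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Made-up layout: top bits = index of the current slot in the pool,
// low bits = number of reads started against that slot.
const SLOT_SHIFT: u32 = 48;
const COUNT_MASK: u64 = (1 << SLOT_SHIFT) - 1;

fn pack(slot: u64, count: u64) -> u64 {
    (slot << SLOT_SHIFT) | (count & COUNT_MASK)
}

fn unpack(word: u64) -> (u64, u64) {
    (word >> SLOT_SHIFT, word & COUNT_MASK)
}

fn main() {
    // Say version 1 in the pool is current and nobody is reading yet.
    let state = AtomicU64::new(pack(1, 0));
    // A reader registers itself and learns the current slot in one atomic
    // operation: bump the counter and decode the value it replaced.
    let before = state.fetch_add(1, Ordering::Acquire);
    let (slot, reads_so_far) = unpack(before);
    println!("reading slot {slot}, {reads_so_far} reads started before this one");
}
```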

However, I struggled a little with the following problem: tracking the start of a read is one thing, but how do you make sure the read has finished? Subtract from the same atomic? That could work, but I can spot a problem: when the new version is written, what exactly will that counter reflect? A total mess, I believe.

Looks like we need another atomic here. Perhaps one per version of the data, stored alongside the data being accessed. What would it reflect? The end of a read? Probably, since the start of the read has already been recorded elsewhere. And then I realized - why not just add these two atomics together? One atomic tracks the start of access to the data, the other tracks the end; when they sum to zero, there are no more "active" reads.
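
To make the arithmetic concrete, here is a minimal two-slot sketch of that counting scheme. To be clear, this is my own illustration, not code from vlock: the names, the two-slot limit, the single-writer assumption and the memory orderings are all assumptions of the sketch, and the orderings in particular deserve far more scrutiny than this.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicI64, AtomicU64, Ordering};

const SLOT_SHIFT: u32 = 48;
const COUNT_MASK: u64 = (1 << SLOT_SHIFT) - 1;

// `state` packs the current slot index with a "reads started" counter.
// Each slot has its own "reads finished" counter, decremented whenever a
// read ends, so that started + finished == 0 means nobody is reading it.
struct Versioned<T> {
    state: AtomicU64,
    finished: [AtomicI64; 2],
    slots: [UnsafeCell<T>; 2],
}

// Sharing is sound only because of the counting protocol below; this sketch
// asserts that rather than proving it.
unsafe impl<T: Send + Sync> Sync for Versioned<T> {}

impl<T: Copy> Versioned<T> {
    fn new(value: T) -> Self {
        Versioned {
            state: AtomicU64::new(0),
            finished: [AtomicI64::new(0), AtomicI64::new(0)],
            slots: [UnsafeCell::new(value), UnsafeCell::new(value)],
        }
    }

    // Wait-free read: one atomic RMW to record the start, one to record the
    // end. (A real implementation also has to worry about the counter
    // overflowing into the slot bits; ignored here.)
    fn read(&self) -> T {
        let word = self.state.fetch_add(1, Ordering::Acquire);
        let slot = (word >> SLOT_SHIFT) as usize;
        let value = unsafe { *self.slots[slot].get() };
        self.finished[slot].fetch_sub(1, Ordering::Release);
        value
    }

    // Single-writer update: fill the spare slot, flip readers over to it,
    // then wait until every read that started on the old slot has ended.
    fn write(&self, value: T) {
        let old_slot = (self.state.load(Ordering::Acquire) >> SLOT_SHIFT) as usize;
        let new_slot = 1 - old_slot;
        // The spare slot was fully drained before the previous write
        // returned, so it is safe to overwrite it now.
        unsafe { *self.slots[new_slot].get() = value };
        // Publish the new slot with a zeroed "started" counter and capture
        // how many reads had started against the old one.
        let prev = self.state.swap((new_slot as u64) << SLOT_SHIFT, Ordering::AcqRel);
        let started = (prev & COUNT_MASK) as i64;
        // Fold the started count into the old slot's finished counter; once
        // their sum reaches zero, no reader is still inside the old slot.
        self.finished[old_slot].fetch_add(started, Ordering::AcqRel);
        while self.finished[old_slot].load(Ordering::Acquire) != 0 {
            std::hint::spin_loop();
        }
    }
}

fn main() {
    let v = Versioned::new(0u64);
    std::thread::scope(|s| {
        s.spawn(|| (1..=1000u64).for_each(|i| v.write(i)));
        s.spawn(|| (0..1000).for_each(|_| { v.read(); }));
    });
    assert_eq!(v.read(), 1000);
}
```

Note that in this sketch the reader never waits, which is the whole point; all the waiting is pushed onto the writer.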

The end result

The implementation is called vlock - a versioned lock. It is indeed tiny: 320 lines of code, not counting the comments thrown here and there, and whitespace. The documentation says 200 lines, but it lies - the library has changed since then.

Is it production ready? Depends on how you define the term. It hasn't been battle-tested with billions of operations, but I did some load testing in my current project on an HTTP request serving path, and it performed just fine. Guess I will be the one to run that experiment in production. Risky, but someone has to do it. It's unlikely anyone else uses the crate anyway - I did not market it either.

Any benefits to all this? The first implementation was noticeably faster than the alternatives for reads, although I have since changed the atomic orderings after pointing Miri at it. I might still not have them right, and I haven't tested the library thoroughly on ARM, so bear that in mind if you actually want to use the thing.