git_cache_http_server.into_rust(): preamble

17 July 2020. Estimated reading time: 3 min.

I originally wrote the git-cache-http-server in 2015, as a proof of concept to ensure the viability of a CI system my company was about to pitch. The system would require cloning fairly large repositories very frequently, and the use of a simple shared local directory was undesirable.

The system would be exposed to the public, and we wanted each job to clone the repository transparently, in isolated containers, keeping the shared cached data out of reach of broken (intentionally or not) workers.

So I had this idea of a caching proxy server for git: to the clients it would behave just as any normal git HTTP server but, internally, it would forward and cache traffic to the actual remotes as needed.
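To make the idea a bit more concrete, here is a minimal sketch in Rust of the first decision such a proxy has to make: given an incoming request path, which upstream remote does it refer to, and where would the cached copy live locally? The URL layout (`/<host>/<path-to-repo>/...`), the use of HTTPS for the upstream, and the names `CacheTarget` and `resolve` are assumptions made for illustration, not necessarily what the actual server does. The only protocol detail relied upon is that a "Smart" HTTP fetch uses the `info/refs` and `git-upload-pack` endpoints.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical description of where a request should be served from.
#[derive(Debug, PartialEq)]
struct CacheTarget {
    /// Remote that the proxy would fetch from on a cache miss (assumed HTTPS).
    upstream_url: String,
    /// Local bare repository holding the cached objects.
    local_path: PathBuf,
}

/// Maps a request path such as `/example.com/user/repo.git/info/refs`
/// to an upstream URL and a directory under `cache_dir`.
/// Returns `None` for paths that are not Smart HTTP fetch endpoints,
/// or that try to escape the cache directory.
fn resolve(cache_dir: &Path, request_path: &str) -> Option<CacheTarget> {
    let trimmed = request_path.trim_start_matches('/');

    // Only the two endpoints used when fetching over Smart HTTP are handled.
    let repo = trimmed
        .strip_suffix("/info/refs")
        .or_else(|| trimmed.strip_suffix("/git-upload-pack"))?;

    // Reject empty, `.` and `..` segments so a client cannot reach
    // outside of the cache directory.
    if repo
        .split('/')
        .any(|seg| seg.is_empty() || seg == "." || seg == "..")
    {
        return None;
    }

    Some(CacheTarget {
        upstream_url: format!("https://{}", repo),
        local_path: cache_dir.join(repo),
    })
}
```

With that in place, a request handler would first update the local copy from `upstream_url` (cloning it if it does not exist yet) and then answer the client from `local_path`, so the client never touches the cache directly.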

At first I quickly wrote it as a Node server in JavaScript, but it was very fragile, so I immediately rewrote it in Haxe. We already used Haxe for most things, it's a great strongly typed language, and it can generate sound JavaScript for Node.

And that was it. Git-lfs started to gain traction, we eventually adopted it, and that solved most of the predicted performance problems. On the other hand, the files turned out to be much larger than anticipated, which created other problems in the pipeline; you don't want a 1 GB PDF file! And while there would be a small benefit from caching the files in git-lfs, writing an LFS server was non-trivial. In the end we decided to skip the cache altogether.

Fast forward almost five years... Surprisingly, the project has found a small user base. I made one or two improvements, and also merged a couple of pull requests. There are a few known bugs and missing features, but it mostly works fine. Some moderately sized companies appear to use it.

But it isn't the easiest project to maintain. It requires a reasonable understanding of the git "Smart" protocol over HTTP, and I can't remember anything about that anymore. It's also written in a pretty vanilla (for Node) style, full of callbacks. I don't like working on that code.

So maybe it's time for another rewrite, and why not do it in Rust? It's just a little over 200 SLOC, so it's a very small and simple program, great for experimenting with Rust. The domain is also familiar to me, and I have a reference implementation. Lastly, it motivates me to give some attention to this project and, hopefully, fix the open bugs.

Continues in part 2...

Last updated: 08 June 2021.