One of the most visible signs that GNOME’s infrastructure has grown over the years is the amount of CI traffic that flows through gitlab.gnome.org on any given day. Hundreds of pipelines run in parallel, most of them starting with a git clone or git fetch of the same repository, often at the same commit. All that traffic was landing directly on GitLab’s webservice pods, generating redundant load for work that was essentially identical.
GNOME’s infrastructure runs on AWS, which generously provides credits to the project. Even so, data transfer is one of the largest cost drivers we face, and we have to operate within a defined budget regardless of those credits. The bandwidth costs associated with this Git traffic became significant enough that, for a period of time, we redirected unauthenticated HTTPS Git pulls to our GitHub mirrors as a short-term cost mitigation. That measure bought us some breathing room, but it was never meant to be permanent: sending users to a third-party platform for what is essentially a core infrastructure operation is not a position we wanted to stay in. The goal was always a proper solution on our own infrastructure.
This post documents the caching layer we built to address that problem. The solution sits between the client and GitLab, intercepts Git fetch traffic, and routes it through Fastly’s CDN so that repeated fetches of the same content are served from cache rather than generating a fresh pack every time.
The problem

The Git smart HTTP protocol uses two endpoints: info/refs for capability advertisement and ref discovery, and git-upload-pack for the actual pack generation. The second one is the expensive one. When a CI job runs git fetch origin main, GitLab has to compute and send the entire pack for that fetch negotiation. If ten jobs run the same fetch within a short window, GitLab does that work ten times.
The tricky part is that git-upload-pack is a POST request with a binary body that encodes what the client already has (have lines) and what it wants (want lines). Traditional HTTP caches ignore POST bodies entirely. Building a cache that actually understands those bodies and deduplicates identical fetches requires some work at the edge.
For a fresh clone the body contains only want lines — one per ref the client is requesting:
```
0032want 7d20e995c3c98644eb1c58a136628b12e9f00a78
0032want 93e944c9f728a4b9da506e622592e4e3688a805c
0032want ef2cbad5843a607236b45e5f50fa4318e0580e04
...
```

For an incremental fetch the body is a mix of want lines (what the client needs) and have lines (commits the client already has locally), which the server uses to compute the smallest possible packfile delta:
```
00a4want 51a117587524cbdd59e43567e6cbd5a76e6a39ff
0000
0032have 8282cff4b31dce12e100d4d6c78d30b1f4689dd3
0032have be83e3dae8265fdc4c91f11d5778b20ceb4e2479
0032have 7d46abdf9c5a3f119f645c8de6d87efffe3889b8
...
```

The leading four hex characters on each line are the pkt-line length prefix. The server walks back through history from the wanted commits until it finds a common ancestor with the have set, then packages everything in between into a packfile. Two CI jobs running the same pipeline at the same commit will produce byte-for-byte identical request bodies and therefore identical responses — exactly the property a cache can help with.
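To make the framing concrete, here is a small Python sketch (hypothetical, not part of GNOME's stack) that splits a pkt-line stream using the four-hex-digit length prefix described above:

```python
def parse_pkt_lines(data: bytes) -> list[bytes]:
    """Split a Git pkt-line stream into payloads.

    Each pkt-line starts with a 4-hex-digit length that includes the
    prefix itself; "0000" is a flush packet with no payload.
    """
    lines = []
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 4], 16)
        if length == 0:  # flush-pkt: marks a section boundary
            lines.append(b"")
            pos += 4
            continue
        lines.append(data[pos + 4:pos + length])
        pos += length
    return lines

pkts = parse_pkt_lines(
    b"0032want 7d20e995c3c98644eb1c58a136628b12e9f00a78\n0000")
assert pkts[0].startswith(b"want ") and pkts[-1] == b""
```

Since the prefix counts its own four bytes, `0032` (50) frames a 46-byte payload; identical want/have sets therefore serialize to identical byte streams, which is what makes the body a usable cache key.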
Architecture overview

The overall setup involves four components:
The request path for a public or internal repository looks like this:
The Nginx configuration exposes two relevant locations. The first is the internal one used for the CDN proxy leg:
```nginx
location ^~ /cdn-origin/ {
    internal;
    rewrite ^/cdn-origin(/.*)$ $1 break;

    proxy_pass $cdn_upstream;
    proxy_ssl_server_name on;
    proxy_ssl_name <cdn-hostname>;
    proxy_set_header Host <cdn-hostname>;
    proxy_set_header Accept-Encoding "";
    proxy_http_version 1.1;
    proxy_buffering on;
    proxy_request_buffering on;
    proxy_connect_timeout 10s;
    proxy_send_timeout 60s;
    proxy_read_timeout 60s;

    header_filter_by_lua_block {
        ngx.header["X-Git-Cache-Key"] = ngx.req.get_headers()["X-Git-Cache-Key"]
        ngx.header["X-Git-Body-Hash"] = ngx.req.get_headers()["X-Git-Body-Hash"]

        local xcache = ngx.header["X-Cache"] or ""
        if xcache:find("HIT") then
            ngx.header["X-Git-Cache-Status"] = "HIT"
        else
            ngx.header["X-Git-Cache-Status"] = "MISS"
        end
    }
}
```

The header_filter_by_lua_block here is doing something specific: it reads X-Cache from the response Fastly returns and translates it into a clean X-Git-Cache-Status header for observability. The X-Git-Cache-Key and X-Git-Body-Hash are also passed through so that callers can see what cache entry was involved.
The second location is git-upload-pack itself, which delegates all the logic to a Lua file:
```nginx
location ~ /git-upload-pack$ {
    client_body_buffer_size 5m;
    client_max_body_size 5m;

    access_by_lua_file /etc/nginx/lua/git_upload_pack.lua;

    header_filter_by_lua_block {
        local key = ngx.req.get_headers()["X-Git-Cache-Key"]
        if key then
            ngx.header["X-Git-Cache-Key"] = key
        end
    }

    proxy_pass http://gitlab-webservice;
    proxy_http_version 1.1;
    proxy_set_header Host gitlab.gnome.org;
    proxy_set_header X-Real-IP $http_fastly_client_ip;
    proxy_set_header X-Forwarded-For $http_fastly_client_ip;
    proxy_set_header X-Forwarded-Proto https;
    proxy_set_header X-Forwarded-Port 443;
    proxy_set_header X-Forwarded-Ssl on;
    proxy_set_header Connection "";
    proxy_buffering on;
    proxy_request_buffering on;
    proxy_connect_timeout 10s;
    proxy_send_timeout 60s;
    proxy_read_timeout 60s;
}
```

The access_by_lua_file directive runs before the request is proxied. If the Lua script calls ngx.exec("/cdn-origin" .. uri), Nginx performs an internal redirect to the CDN location and the proxy_pass to GitLab is never reached. If the script returns normally (for private repos or non-fetch commands), the request falls through to the proxy_pass.
Building the cache key

The full Lua script that runs in access_by_lua_file handles both passes of the request. The first pass (client → nginx) does the heavy lifting:
```lua
local resty_sha256 = require("resty.sha256")
local resty_str = require("resty.string")
local redis_helper = require("redis_helper")

local redis_host = os.getenv("REDIS_HOST") or "localhost"
local redis_port = os.getenv("REDIS_PORT") or "6379"

-- Second pass: request arriving from CDN origin fetch.
-- Decode the original POST body from the header and restore the method.
if ngx.req.get_headers()["X-Git-Cache-Internal"] then
    local encoded_body = ngx.req.get_headers()["X-Git-Original-Body"]
    if encoded_body then
        ngx.req.read_body()
        local body = ngx.decode_base64(encoded_body)
        ngx.req.set_method(ngx.HTTP_POST)
        ngx.req.set_body_data(body)
        ngx.req.set_header("Content-Length", tostring(#body))
        ngx.req.clear_header("X-Git-Original-Body")
    end
    return
end
```

The second-pass guard is at the top of the script. When Fastly’s origin fetch arrives, it will carry X-Git-Cache-Internal: 1. The script detects that, reconstructs the POST body from the base64-encoded header, restores the POST method, and returns — allowing Nginx to proxy the real request to GitLab.
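The two-pass body transport is easier to see in isolation. This is a minimal Python sketch of the idea (the function names are mine, purely for illustration): the first pass stashes the POST body in a base64 header, and the second pass decodes it back, so the body survives the trip through the CDN unchanged:

```python
import base64

def first_pass_headers(body: bytes) -> dict:
    # Pass 1: stash the POST body in a header so the request can
    # travel through the CDN as a bodyless GET.
    return {
        "X-Git-Cache-Internal": "1",
        "X-Git-Original-Body": base64.b64encode(body).decode("ascii"),
    }

def second_pass_body(headers: dict) -> bytes:
    # Pass 2: the origin fetch comes back from Fastly; rebuild the
    # original body before proxying to GitLab.
    return base64.b64decode(headers["X-Git-Original-Body"])

original = b"0011command=fetch0001..."
assert second_pass_body(first_pass_headers(original)) == original
```

Base64 keeps the binary pkt-line body header-safe, at the cost of roughly a 33% size increase, which is fine for fetch negotiation bodies capped at 5 MB by the Nginx config.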
For the first pass, the script parses the repo path from the URI, reads and buffers the full request body, and computes a SHA256 over it:
```lua
-- Only cache "fetch" commands; ls-refs responses are small, fast, and
-- become stale on every push (the body hash is constant so a long TTL
-- would serve outdated ref listings).
if not body:find("command=fetch", 1, true) then
    ngx.header["X-Git-Cache-Status"] = "BYPASS"
    return
end

-- Hash the body
local sha256 = resty_sha256:new()
sha256:update(body)
local body_hash = resty_str.to_hex(sha256:final())

-- Build cache key: cache_versioning + repo path + body hash
local cache_key = "v2:" .. repo_path .. ":" .. body_hash
```

A few things are worth noting here. The ls-refs command is explicitly excluded from caching. The reason is that ls-refs is used to list references and its request body is essentially static (just a capability advertisement). If we cached it with a 30-day TTL, a push to the repository would not invalidate the cache — the key would be the same — and clients would get stale ref listings. Fetch bodies, on the other hand, encode exactly the SHAs the client wants and already has. The same set of want/have lines always maps to the same pack, which makes them safe to cache for a long time.
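The key derivation itself is simple enough to restate in a few lines of Python (a sketch mirroring the Lua above; the repo path and body are made up):

```python
import hashlib

CACHE_VERSION = "v2"

def cache_key(repo_path: str, body: bytes) -> str:
    # Identical want/have bodies hash to the same digest, so identical
    # fetches from parallel CI jobs collapse onto one cache entry.
    body_hash = hashlib.sha256(body).hexdigest()
    return f"{CACHE_VERSION}:{repo_path}:{body_hash}"

a = cache_key("GNOME/glib", b"0011command=fetch...")
b = cache_key("GNOME/glib", b"0011command=fetch...")
assert a == b and a.startswith("v2:GNOME/glib:")
```

Note that the repo path is part of the key, so two repositories that happen to receive byte-identical fetch bodies still get separate cache entries.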
The v2: prefix is a cache version string. It makes it straightforward to invalidate all existing cache entries if we ever need to change the key scheme, without touching Fastly’s purge API.
The POST-to-GET conversion

This is probably the most unusual part of the design:
```lua
-- Carry the POST body as a base64 header and convert to GET so that
-- Fastly's intra-POP consistent hashing routes identical cache keys
-- to the same server (Fastly only does this for GET, not POST).
ngx.req.set_header("X-Git-Original-Body", ngx.encode_base64(body))
ngx.req.set_method(ngx.HTTP_GET)
ngx.req.set_body_data("")

return ngx.exec("/cdn-origin" .. uri)
```

Fastly’s shield feature routes cache misses through a designated intra-POP “shield” node before going to origin. When two different edge nodes both get a MISS for the same cache key simultaneously, the shield node collapses them into a single origin request. This is important for us because without it, a burst of CI jobs fetching the same commit would all miss, all go to origin in parallel, and GitLab would end up generating the same pack multiple times anyway.
The catch is that Fastly’s consistent hashing and shield routing only work for GET requests. POST requests always go straight to origin. Fastly does provide a way to force POST responses into the cache — by returning pass in vcl_recv and setting beresp.cacheable in vcl_fetch — but it is a blunt instrument: there is no consistent hashing, no shield collapsing, and no guarantee that two nodes in the same POP will ever share the cached result. By converting the POST to a GET and encoding the body in a header, we get consistent hashing and shield-level request collapsing for free.
The VCL on the Fastly side uses the X-Git-Cache-Key header (not the URL or method) as the cache key, so the GET conversion is invisible to the caching logic.
Protecting private repositories

We cannot route private repository traffic through an external CDN — that would mean sending authenticated git content to a third-party cache. The way we prevent this is a denylist stored in Valkey. Before doing anything else, the Lua script checks whether the repository is listed there:
```lua
local denied, err = redis_helper.is_denied(redis_host, redis_port, repo_path)
if err then
    ngx.log(ngx.ERR, "git-cache: Redis error for ", repo_path, ": ", err,
            " — cannot verify project visibility, bypassing CDN")
    ngx.header["X-Git-Cache-Status"] = "BYPASS"
    return
end

if denied then
    ngx.header["X-Git-Cache-Status"] = "BYPASS"
    ngx.header["X-Git-Body-Hash"] = body_hash:sub(1, 12)
    return
end

-- Public/internal repo: strip credentials before routing through CDN
ngx.req.clear_header("Authorization")
```

If Valkey is unreachable, the script logs an error and bypasses the CDN entirely, treating the repository as if it were private. This is the safe default: the cost of a Redis failure is slightly increased load on GitLab, not the risk of routing private repository content through an external cache. In practice, Valkey runs alongside Nginx on the same node, so true availability failures are uncommon.
The denylist is maintained by gitlab-git-cache-webhook, a small FastAPI service. It listens for GitLab system hooks on project_create and project_update events:
```python
HANDLED_EVENTS = {"project_create", "project_update"}

@router.post("/webhook")
async def webhook(request: Request, ...) -> Response:
    ...
    event = body.get("event_name", "")
    if event not in HANDLED_EVENTS:
        return Response(status_code=204)

    project = body.get("project", {})
    path = project.get("path_with_namespace", "")
    visibility_level = project.get("visibility_level")

    if visibility_level == 0:
        await deny_repo(path)
    else:
        removed = await allow_repo(path)

    return Response(status_code=204)
```

GitLab’s visibility_level is 0 for private, 10 for internal, and 20 for public. Internal repositories are intentionally treated the same as public ones here: they are accessible to any authenticated user on the instance, so routing them through the CDN is acceptable. Only truly private repositories go into the denylist.
The key format in Valkey is git:deny:<path_with_namespace>. The Lua redis_helper module does an EXISTS check on that key. The webhook service also ships a reconciliation command (python -m app.reconcile) that does a full resync of all private repositories via the GitLab API, which is useful to run on first deployment or after any extended Valkey downtime.
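The bookkeeping the webhook performs reduces to two pure decisions, sketched here in Python (the helper names are hypothetical, chosen for illustration):

```python
def deny_key(path_with_namespace: str) -> str:
    # The key format the Lua redis_helper probes with EXISTS.
    return f"git:deny:{path_with_namespace}"

def desired_state(visibility_level: int) -> str:
    # GitLab visibility levels: 0 = private, 10 = internal, 20 = public.
    # Only truly private projects are kept out of the CDN path.
    return "deny" if visibility_level == 0 else "allow"

assert deny_key("GNOME/some-private-repo") == "git:deny:GNOME/some-private-repo"
assert desired_state(0) == "deny"
assert desired_state(10) == "allow"
```

Keeping the check down to a single EXISTS on a well-known key shape is what makes the per-request overhead in the Lua hot path negligible.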
The Fastly VCL

On the Fastly side, three VCL subroutines carry the relevant logic. In vcl_recv:
```vcl
if (req.url ~ "/info/refs") {
    return(pass);
}

if (req.http.X-Git-Cache-Key) {
    set req.backend = F_Host_1;
    if (req.restarts == 0) {
        set req.backend = fastly.try_select_shield(ssl_shield_iad_va_us, F_Host_1);
    }
    return(lookup);
}
```

/info/refs is always passed through uncached — it is the capability advertisement step and caching it would cause problems with protocol negotiation. Requests carrying X-Git-Cache-Key get an explicit lookup directive and are routed through the shield. Everything else falls through to Fastly’s default behaviour.
In vcl_hash, the cache key overrides the default URL-based key:
```vcl
if (req.http.X-Git-Cache-Key) {
    set req.hash += req.http.X-Git-Cache-Key;
    return(hash);
}
```

And in vcl_fetch, responses are marked cacheable when they come back with a 200 and a non-empty body:
```vcl
if (req.http.X-Git-Cache-Key && beresp.status == 200) {
    if (beresp.http.Content-Length == "0") {
        set beresp.ttl = 0s;
        set beresp.cacheable = false;
        return(deliver);
    }

    set beresp.cacheable = true;
    set beresp.ttl = 30d;
    set beresp.http.X-Git-Cache-Key = req.http.X-Git-Cache-Key;

    unset beresp.http.Cache-Control;
    unset beresp.http.Pragma;
    unset beresp.http.Expires;
    unset beresp.http.Set-Cookie;

    return(deliver);
}
```

The 30-day TTL is deliberately long. Git pack data is content-addressed: a pack for a given set of want/have lines will always be the same. As long as the objects exist in the repository, the cached pack is valid. The only case where a cached pack could be wrong is if objects were deleted (force-push that drops history, for instance), which is rare and, on GNOME’s GitLab, made even rarer by the Gitaly custom hooks we run to prevent force-pushes and history rewrites on protected namespaces. In those cases the cache version prefix would force a key change rather than relying on TTL expiry.
Empty responses (Content-Length: 0) are explicitly not cached. GitLab can return an empty body in edge cases and caching that would break all subsequent fetches for that key.
Conclusions

The system has been running in production for a few days now, and the cache hit rate on fetch traffic has been consistently high (over 80%). If something goes wrong with the cache layer, the worst case is that requests fall back to BYPASS and GitLab handles them directly, which is how things worked before. This also means we no longer redirect any traffic to github.com.
That should be all for today, stay tuned!
In our previous episode we wrote a merge sort implementation that runs a bit faster than the one in libstdc++. The question then becomes: could it be made even faster? If you go through the relevant literature, one potential improvement is to do a multiway merge. That is, instead of merging two arrays into one, you merge four into one using, for example, a priority queue.
This seems like a slam dunk for performance. In practice, though, the four-way version turned out not to be any faster.

Why is this so? Maybe there are bugs that cause it to do extra work? Assuming that is not the case, what is actually going on? Measuring seems to indicate that a notable fraction of the runtime is spent in the priority queue code. Beyond that I learned very little.
The best hypothesis I could come up with has to do with the number of comparisons made. A classical merge sort does two if statements per output element: one to determine which of the two lists has the smaller element at the front, and one to see whether removing that element exhausted its list. The former is basically random and the latter is always false except when the last element is processed. This amounts to roughly 0.5 mispredicted branches per element per round.
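As a rough illustration (a Python sketch for counting, not the benchmarked C++ implementation), those two per-element branches look like this:

```python
def merge_counting(a, b):
    """Two-way merge that counts the value comparisons and the
    list-exhaustion checks made while producing the output."""
    out, i, j = [], 0, 0
    value_cmps = exhaustion_checks = 0
    while i < len(a) and j < len(b):
        value_cmps += 1          # which front element is smaller? (~random)
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
        exhaustion_checks += 1   # did that removal empty a list? (almost always no)
    out += a[i:] + b[j:]         # one side is exhausted; copy the rest
    return out, value_cmps, exhaustion_checks

merged, cmps, checks = merge_counting([1, 3, 5], [2, 4, 6])
assert merged == [1, 2, 3, 4, 5, 6]
```

The exhaustion check is nearly always false, so the branch predictor handles it for free; only the value comparison is effectively a coin flip.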
A priority queue has to do a bunch more work to preserve the heap property. The first iteration of the sift-down needs to check the root and its two children: that's comparisons on the values plus two checks for whether the children actually exist. Those are much less predictable than the comparisons in merge sort. Computers are really efficient at doing simple things, so it may be that the additional bookkeeping is expensive enough to negate the advantage of fewer rounds.
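For comparison, here is a sketch of the four-way variant in Python, with heapq standing in for the C++ priority queue (the function name is mine); every pop is followed by a sift-down, which is where the extra comparisons come from:

```python
import heapq

def merge4(runs):
    """Merge four (or any number of) sorted runs through a binary
    heap. Each pop triggers a sift-down comparing the new root
    against its children; each refill push triggers a sift-up."""
    # Seed the heap with the head of every non-empty run; the index
    # breaks ties so values never need to be comparable to each other twice.
    heap = [(run[0], idx, 0) for idx, run in enumerate(runs) if run]
    heapq.heapify(heap)
    out = []
    while heap:
        value, idx, pos = heapq.heappop(heap)
        out.append(value)
        if pos + 1 < len(runs[idx]):  # refill from the run we popped from
            heapq.heappush(heap, (runs[idx][pos + 1], idx, pos + 1))
    return out

assert merge4([[1, 5], [2, 6], [3, 7], [4, 8]]) == [1, 2, 3, 4, 5, 6, 7, 8]
```

With four runs the heap has at most four entries, so each pop/push pair costs up to two levels of sifting. That's more comparisons per emitted element than the single comparison of a two-way merge, traded against half as many merge rounds.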
Or maybe it's something else. Who's to say? Certainly not me. If someone wants to play with the code, the implementation is here. I'll probably delete it at some point, as it does not really have any advantage over the regular merge sort.
I post more and more content on my website. What was visible at a glance is now harder to find. I wanted to implement search, but it is a static website: everything is built once, then published somewhere as final, immutable pages. I can't send a request for a search and get results in return.
Or that's what I thought! Pagefind is a neat JavaScript library that does two things: at build time it indexes the generated pages into static index files, and in the browser it loads just the index fragments needed to answer a query.
The pagefind-modal component looks up the index as the user types a query. The index is a static file, so there is no need for a backend that processes queries. Of course this only works for basic queries, but it's a great tool already!
Pagefind is also easy to customize via a list of CSS variables. Adding it to this website was very straightforward.