You are here

Planet GNOME

Subscribe to Feed Planet GNOME
Planet GNOME - https://planet.gnome.org/
Përditësimi: 1 ditë 16 orë më parë

Bilal Elmoussaoui: goblin: A Linter for GObject C Code

Sht, 11/04/2026 - 2:00pd

Over the past week, I’ve been building goblin, a linter specifically designed for GObject-based C codebases.

If you know Rust’s clippy or Go’s go vet, think of goblin as the same thing for GObject/GLib.

Why this exists

A large part of the Linux desktop stack (GTK, Mutter, Pango, NetworkManager) is built on GObject. These projects have evolved over decades and carry a lot of patterns that predate newer GLib helpers, are easy to misuse, or encode subtle lifecycle invariants that nothing verifies.

This leads to issues like missing dispose/finalize/constructed chain-ups (memory leaks or undefined behavior), incorrect property definitions, uninitialized GError* variables, or function declarations with no implementation.

These aren’t theoretical. This GTK merge request recently fixed several missing chain-ups in example code.

Despite this, the C ecosystem lacks a linter that understands GObject semantics. goblin exists to close that gap.

What goblin checks

goblin ships with 35 rules across different categories:

  • Correctness: Real bugs like non-canonical property names, uninitialized GError*, missing PROP_0
  • Suspicious: Likely mistakes like missing implementations or redundant NULL checks
  • Style: Idiomatic GLib usage (g_strcmp0, g_str_equal())
  • Complexity: Suggests modern helpers (g_autoptr, g_clear_*, g_set_str())
  • Performance: Optimizations like G_PARAM_STATIC_STRINGS or g_object_notify_by_pspec()
  • Pedantic: Consistency checks (macro semicolons, matching declare/define pairs)

23 out of 35 rules are auto-fixable. You should apply fixes one rule at a time to review the changes:

goblin --fix --only use_g_strcmp0 goblin --fix --only use_clear_functions CI/CD Integration

goblin fits into existing pipelines.

GitHub Actions - name: Run goblin run: goblin --format sarif > goblin.sarif - name: Upload SARIF results uses: github/codeql-action/upload-sarif@v3 with: sarif_file: goblin.sarif

Results show up in the Security tab under "Code scanning" and inline on pull requests.

GitLab CI goblin: image: ghcr.io/bilelmoussaoui/goblin:latest script: - goblin --format sarif > goblin.sarif artifacts: reports: sast: goblin.sarif

Results appear inline in merge requests.

Configuration

Rules default to warn, and can be tuned via goblin.toml:

min_glib_version = "2.40" # Auto-disable rules for newer versions [rules] g_param_spec_static_name_canonical = "error" # Make critical use_g_strcmp0 = "warn" # Keep as warning use_g_autoptr_inline_cleanup = "ignore" # Disable # Per-rule ignore patterns missing_implementation = { level = "error", ignore = ["src/backends/**"] }

You can adopt it gradually without fixing everything at once.

Try it # Run via container podman run --rm -v "$PWD:/workspace:Z" ghcr.io/bilelmoussaoui/goblin:latest # Install locally cargo install --git https://github.com/bilelmoussaoui/goblin goblin # Usage goblin # Lint current directory goblin --fix # Apply automatic fixes goblin --list-rules # Inspect available rules

The project is early, so feedback is especially valuable (false positives, missing checks, workflow issues, etc.).

Thibault Martin: TIL that Kubernetes can give you a shell into a crashing container

Pre, 10/04/2026 - 10:00pd

When a container crashes, it can be for several reasons. Sometimes the log won't tell you much about why the container crashed, and you can't get a shell into that container because... it has already crashed. It turns out that kubectl debug can let you do exactly that.

I was trying to ship Helfertool on our Kubernetes cluster. The firs step was to get it to work locally in my Minikube. The container I was deploying kept crashing, with an error message that put me on the right track: Cannot write to log directory. Exiting.

The container expected me to mount a volume on /log so it could write logs, which I did. I wanted to run a quick test from within the container to see if I could create a file in that directory. But when your container has already crashed you can't get a shell into it.

My better informed colleague Quentin told me about kubectl debug, a command that lets me create a copy of the crashing container but with a different COMMAND.

So instead of running its normal program, I can ask the container to run sh with the following command

$ kubectl debug mypod -it \ --copy-to=mypod-debug \ --container=my-pods-image \ -- sh

And just like that I have shell inside a similar container. Using this trick I could confirm that I can't touch a file in that /log directory because it belongs to root while my container is running unprivileged.

That's a great trick to troubleshoot from within a crashing container!

This Week in GNOME: #244 Recognizing Hieroglyphs

Pre, 10/04/2026 - 2:00pd

Update on what happened across the GNOME project in the week from April 03 to April 10.

GNOME Core Apps and Libraries Blueprint

A markup language for app developers to create GTK user interfaces.

James Westman reports

blueprint-compiler is now available on PyPI. You can install it with pip install blueprint-compiler.

GNOME Circle Apps and Libraries Hieroglyphic

Find LaTeX symbols

FineFindus reports

Hieroglyphic 2.3 is out now. Thanks to the exciting work done by Bnyro, Hieroglyphic can now also recognize Typst symbols (a modern alternative to LaTeX). Hardware-acceleration will now be preferred, when available, reducing power-consumption.

Download the latest version from FlatHub.

Amberol

Plays music, and nothing else.

Emmanuele Bassi says

Amberol 2026.1 is out, using the GNOME 50 run time! This new release fixes a few issues when it comes to loading music, and has some small quality of life improvements in the UI, like: a more consistent visibility of the playlist panel when adding songs or searching; using the shortcuts dialog from libadwaita; and being able to open the file manager in the folder containing the current song. You can get Amberol on Flathub.

Third Party Projects

Alexander Vanhee says

A new version of Bazaar is out now. It features the ability to filter search results via a new popover and reworks the add-ons dialog to include a page that shows more information about a specific entry. If you try to open an add-on via the AppStream scheme, it will now display this page, which is useful when you want to redirect users to install an add-on from within your app.

Also, please take a look at the statistics dialog — it now features a cool gradient.

Check it out on Flathub

dabrain34 reports

GstPipelineStudio 0.5.1 is out now. It’s a great pleasure to announce this new version allowing to deal with DOT files directly. Check the project web page for more information or the following blog post for more details about the release.

Anton Isaiev announces

RustConn (connection manager for SSH, RDP, VNC, SPICE, Telnet, Serial, Kubernetes, MOSH, and Zero Trust protocols)

Versions 0.10.9–0.10.14 landed with a solid round of usability, security, and performance work.

Staying connected got easier. If an SSH session drops unexpectedly, RustConn now polls the host and reconnects on its own as soon as it’s back. Wake-on-LAN works the same way: send the magic packet and RustConn connects automatically once the machine boots. You can also right-click any connection to check if the host is online, and a new “Connect All” option opens every connection in a folder at once. For RDP there’s a Mouse Jiggler that keeps idle sessions alive.

Terminal Activity Monitor is a new per-session feature that watches for output activity or silence, which is handy for long-running jobs. You get notifications as tab icons, toasts, and desktop alerts when the window is in the background.

Security got a lot of attention. RDP now defaults to trust-on-first-use certificate validation instead of blindly accepting everything. Credentials for Bitwarden and 1Password are no longer visible in the process list. VNC passwords are zeroized on drop. Export files are written with owner-only permissions. Dangerous custom arguments are blocked for both VNC and FreeRDP viewers.

Hoop.dev joins as the 11th Zero Trust provider. There’s also a new custom SSH agent socket setting that lets Flatpak users connect through KeePassXC, Bitwarden, or GPG-based SSH agents, something the Flatpak sandbox previously made difficult.

Smoother on HiDPI and 4K. RDP frame rendering skips a 33 MB per-frame copy when the data is already in the right format. Highlight rules, search, and log sanitization patterns are compiled once instead of on every keystroke or terminal line.

GNOME HIG polish. Success notifications now use non-blocking toasts instead of modal dialogs. Sidebar context menus are native PopoverMenus with keyboard navigation and screen reader support. Translations completed for all 15 languages.

Project: https://github.com/totoshko88/RustConn Flatpak: https://flathub.org/en/apps/io.github.totoshko88.RustConn

Phosh

A pure wayland shell for mobile devices.

Guido announces

Phosh 0.54 is out:

There’s now a notification when an app fails to start, the status bar can be extended via plugins, and the location quick toggle has a status page to set the maximum allowed accuracy.

On the compositor side we improved X11 support, making docked mode (aka convergence) with applications like emacs or ardour more fun to use.

The on screen keyboard Stevia now supports Japanese and Chinese input via UIM, has a new us+workman layout and automatic space handling can be disabled.

There’s more - see the full details here.

Documentation

Emmanuele Bassi announces

The GNOME User documentation project has been ported to use Meson for its configuration, build, and installation. The User documentation contains the desktop help and the system administration guide, and gets published on the user help website, as well as being available locally through the Help browser. The switch to Meson improved build times, and moved the tests and validation in the build system. There’s a whole new contribution guideline as well. If you want to help writing the GNOME documentation, join us in the Docs room on Matrix!

Shell Extensions Weather O’Clock

Display the current weather inside the pill next to the clock.

Cleo Menezes Jr. reports

Weather O’Clock 50 released with fluffier animations: smooth fades between loading, weather and offline states; instant temperature updates; first-fetch spinner; offline indicator; GNOME Shell 45–50 support; and various bug fixes.

Get it on GNOME Extensions

Follow development

That’s all for this week!

See you next week, and be sure to stop by #thisweek:gnome.org with updates on your own projects!

Jakub Steiner: Moving to Zola

Pre, 10/04/2026 - 2:00pd

I've finally gotten around to porting this blog over to Zola. I've been running on Jekyll for years now, after originally conceiving this blog in Middleman (and PHP initially). But time catches up with everything, and the friction of maintaining Ruby dependencies eventually got to me.

The Speed

I can't stress this enough — Zola is fast. Not "for a static site generator" fast. Just fast. My old Jekyll setup needed a good few seconds to rebuild after a change. Zola builds in milliseconds. The entire site rebuilds almost before I can release the key. It's not critical for a site that gets updated 5 times a year, but it's still impressive.

No Dependencies

This is the big one. Every time you leave a project alone for a few months and come back, you know it's not just going to magically work. The gem versions drift, Bundler gets confused, and suddenly you're down a rabbit hole of version conflicts. The only reason all our Jekyll projects were reasonably easy to work with was locking onto Ruby 3.1.2 using rvm. But at some point the layers of backwardism catch up with you.

Zola is a single binary. That's it. No bundle install, no Gemfile, no "works on my machine" prayers. Download, run, done. It even embeds everything — syntax highlighting, image processing, Sass compilation (if you haven't embraced the modern CSS light yet) — all built-in. The site builds the same on any machine with zero setup.

The Heritage

Zola started life as Gutenberg in 2015/2016, a learning project for Rust by Vincent Prouillet. He was using Hugo before, but hated the Go template engine. That spawned Tera, the Jinja2-inspired template engine that Zola uses.

The project got renamed to Zola in 2018 when the name conflicts with Project Gutenberg got too annoying. It's pure Rust, which means it's fast, memory-safe, and ships as a tiny static binary.

Asset Colocation

One thing I've always focused on for this blog architecture wise is the structure — images and media live right alongside the post, not stuffed into some shared /images/ folder somewhere like most Jekyll sites seem to do. Zola calls this "asset colocation," and it's a first-class feature. No plugins needed. Just put your images in the same folder as your index.md, reference them directly, and Zola handles the rest.

This is how I'd already been running things with Jekyll, so the port was refreshingly painless on that front.

The Templating

The main work was porting the templates. It was the main shostopper when Bilal suggested Zola a couple of years ago. I was hoping something with liquid to pop up, but it seems like people running their own blogs is not a Tik Tok trend. Zola uses Tera instead of Liquid. The syntax is similar enough to get by, but there's enough branches in your path to stumble on. The error messages actually make sense though and point you at the problem, which is a refreshing change from debugging broken Liquid includes.

The Improvements

Beyond speed, I've been cleaning up things the old theme dragged along:

  • Dark mode without JavaScript: The original Klise theme injected a script to toggle themes. The new setup uses CSS-only theming via custom properties, no flash of wrong theme, no JS required.
  • Legibility: I'm getting older, and apparently so are my readers. Font sizes bumped up, contrast dialled in. What looked crisp at 30 looks muddy at 50.

The site's cleaner now, light by default, faster to build, and I don't need to invoke Ruby just to write a blog post. The experience was so damn good, it motivated me to jump at a much larger project I'm hopefully going to post about next.

Previously.

Michael Meeks: 2026-04-09 Thursday

Enj, 09/04/2026 - 11:00md

Andy Wingo: wastrel milestone: full hoot support, with generational gc as a treat

Enj, 09/04/2026 - 3:48md

Hear ye, hear ye: Wastrel and Hoot means REPL!

Which is to say, Wastrel can now make native binaries out of WebAssembly files as produced by the Hoot Scheme toolchain, up to and including a full read-eval-print loop. Like the REPL on the Hoot web page, but instead of requiring a browser, you can just run it on your console. Amazing stuff!

try it at home

First, we need the latest Hoot. Build it from source, then compile a simple REPL:

echo '(import (hoot repl)) (spawn-repl)' > repl.scm ./pre-inst-env hoot compile -fruntime-modules -o repl.wasm repl.scm

This takes about a minute. The resulting wasm file has a pretty full standard library including a full macro expander and evaluator.

Normally Hoot would do some aggressive tree-shaking to discard any definitions not used by the program, but with a REPL we don’t know what we might need. So, we pass -fruntime-modules to instruct Hoot to record all modules and their bindings in a central registry, so they can be looked up at run-time. This results in a 6.6 MB Wasm file; with tree-shaking we would have been at 1.2 MB.

Next, build Wastrel from source, and compile our new repl.wasm:

wastrel compile -o repl repl.wasm

This takes about 5 minutes on my machine: about 3 minutes to generate all the C, about 6.6MLOC all in all, split into a couple hundred files of about 30KLOC each, and then 2 minutes to compile with GCC and link-time optimization (parallelised over 32 cores in my case). I have some ideas to golf the first part down a bit, but the the GCC side will resist improvements.

Finally, the moment of truth:

$ ./repl Hoot 0.8.0 Enter `,help' for help. (hoot user)> "hello, world!" => "hello, world!" (hoot user)> statics

When I first got the REPL working last week, I gasped out loud: it’s alive, it’s alive!!! Now that some days have passed, I am finally able to look a bit more dispassionately at where we’re at.

Firstly, let’s look at the compiled binary itself. By default, Wastrel passes the -g flag to GCC, which results in binaries with embedded debug information. Which is to say, my ./repl is chonky: 180 MB!! Stripped, it’s “just” 33 MB. 92% of that is in the .text (code) section. I would like a smaller binary, but it’s what we got for now: each byte in the Wasm file corresponds to around 5 bytes in the x86-64 instruction stream.

As for dependencies, this is a pretty minimal binary, though dynamically linked to libc:

linux-vdso.so.1 (0x00007f6c19fb0000) libm.so.6 => /gnu/store/…-glibc-2.41/lib/libm.so.6 (0x00007f6c19eba000) libgcc_s.so.1 => /gnu/store/…-gcc-15.2.0-lib/lib/libgcc_s.so.1 (0x00007f6c19e8d000) libc.so.6 => /gnu/store/…-glibc-2.41/lib/libc.so.6 (0x00007f6c19c9f000) /gnu/store/…-glibc-2.41/lib/ld-linux-x86-64.so.2 (0x00007f6c19fb2000)

Our compiled ./repl includes a garbage collector from Whippet, about which, more in a minute. For now, we just note that our use of Whippet introduces no run-time dependencies.

dynamics

Just running the REPL with WASTREL_PRINT_STATS=1 in the environment, it seems that the REPL has a peak live data size of 4MB or so, but for some reason uses 15 MB total. It takes about 17 ms to start up and then exit.

These numbers I give are consistent over a choice of particular garbage collector implementations: the default --gc=stack-conservative-parallel-generational-mmc, or the non-generational stack-conservative-parallel-mmc, or the Boehm-Demers-Weiser bdw. Benchmarking collectors is a bit gnarly because the dynamic heap growth heuristics aren’t the same between the various collectors; by default, the heap grows to 15 MB or so with all collectors, but whether it chooses to collect or expand the heap in response to allocation affects startup timing. I get the above startup numbers by setting GC_OPTIONS=heap-size=15m,heap-size-policy=fixed in the environment.

Hoot implements Guile Scheme, so we can also benchmark Hoot against Guile. Given the following test program that sums the leaf values for ten thousand quad trees of height 5:

(define (quads depth) (if (zero? depth) 1 (vector (quads (- depth 1)) (quads (- depth 1)) (quads (- depth 1)) (quads (- depth 1))))) (define (sum-quad q) (if (vector? q) (+ (sum-quad (vector-ref q 0)) (sum-quad (vector-ref q 1)) (sum-quad (vector-ref q 2)) (sum-quad (vector-ref q 3))) q)) (define (sum-of-sums n depth) (let lp ((n n) (sum 0)) (if (zero? n) sum (lp (- n 1) (+ sum (sum-quad (quads depth))))))) (sum-of-sums #e1e4 5)

We can cat it to our repl to see how we do:

Hoot 0.8.0 Enter `,help' for help. (hoot user)> => 10240000 (hoot user)> Completed 3 major collections (281 minor). 4445.267 ms total time (84.214 stopped); 4556.235 ms CPU time (189.188 stopped). 0.256 ms median pause time, 0.272 p95, 7.168 max. Heap size is 28.269 MB (max 28.269 MB); peak live data 9.388 MB.

That is to say, 4.44s, of which 0.084s was spent in garbage collection pauses. The default collector configuration is generational, which can result in some odd heap growth patterns; as it happens, this workload runs fine in a 15MB heap. Pause time as a percentage of total run-time is very low, so all the various GCs perform the same, more or less; we seem to be benchmarking eval more than the GC itself.

Is our Wastrel-compiled repl performance good? Well, we can evaluate it in two ways. Firstly, against Chrome or Firefox, which can run the same program; if I paste in the above program in the REPL over at the Hoot web site, it takes about 5 or 6 times as long to complete, respectively. Wastrel wins!

I can also try this program under Guile itself: if I eval it in Guile, it takes about 3.5s. Granted, Guile’s implementation of the same source language is different, and it benefits from a number of representational tricks, for example using just two words for a pair instead of four on Hoot+Wastrel. But these numbers are in the same ballpark, which is heartening. Compiling the test program instead of interpreting is about 10× faster with both Wastrel and Guile, with a similar relative ratio.

Finally, I should note that Hoot’s binaries are pretty well optimized in many ways, but not in all the ways. Notably, they use too many locals, and the post-pass to fix this is unimplemented, and last time I checked (a long time ago!), wasm-opt didn’t work on our binaries. I should take another look some time.

generational?

This week I dotted all the t’s and crossed all the i’s to emit write barriers when we mutate the value of a field to store a new GC-managed data type, allowing me to enable the sticky mark-bit variant of the Immix-inspired mostly-marking collector. It seems to work fine, though this kind of generational collector still baffles me sometimes.

With all of this, Wastrel’s GC-using binaries use a stack-conservative, parallel, generational collector that can compact the heap as needed. This collector supports multiple concurrent mutator threads, though Wastrel doesn’t do threading yet. Other collectors can be chosen at compile-time, though always-moving collectors are off the table due to not emitting stack maps.

The neat thing is that any language that compiles to Wasm can have any of these collectors! And when the Whippet GC library gets another collector or another mode on an existing collector, you can have that too.

missing pieces

The biggest missing piece for Wastrel and Hoot is some kind of asynchrony, similar to JavaScript Promise Integration (JSPI), and somewhat related to stack switching. You want Wasm programs to be able to wait on external events, and Wastrel doesn’t support that yet.

Other than that, it would be lovely to experiment with Wasm shared-everything threads at some point.

what’s next

So I have an ahead-of-time Wasm compiler. It does GC and lots of neat things. Its performance is state-of-the-art. It implements a few standard libraries, including WASI 0.1 and Hoot. It can make a pretty good standalone Guile REPL. But what the hell is it for?

Friends, I... I don’t know! It’s really cool, but I don’t yet know who needs it. I have a few purposes of my own (pushing Wasm standards, performance work on Whippet, etc), but you or someone you know needs a wastrel, do let me know at wingo@igalia.com: I would love to be able to spend more time hacking in this area.

Until next time, happy compiling to all!

Thibault Martin: TIL that Helix and Typst are a match made in heaven

Enj, 09/04/2026 - 10:00pd

I love Markdown with all my heart. It's a markup language so simple to understand that even people who are not software engineers can use it in a few minutes.

The flip side of that coin if that Markdown is limited. It can let you create various title levels, bold, italics, strikethrough, tables, links, and a bit more, but not so much.

When it comes to more complex documents, most people resort to full fledged office suite like Microsoft Office or LibreOffice. Both have their merits, but office file formats are brittle and heavy.

The alternative is to use another more complex markup language. Academics used to be into LaTeX but it's often tedious to use. Typst emerged more recently as a simpler yet useful markup language to create well formatted documents.

Tinymist is a language server for Typst. It provides the usual services a Language server provides, like semantic highlighting, code actions, formatting, etc.

But it really stands out by providing a live preview feature that keeps your cursor in sync in Helix when you are clicking around in the live preview!

I only had to install it with

$ brew install tinymist

I then configured Helix to use tinymist for Typst documents, enabling live preview along the way. This happens of course in ~/.config/helix/languages.toml

[language-server.tinymist] command = "tinymist" config = { preview.background.enabled = true, preview.background.args = ["--data-plane-host=127.0.0.1:23635", "--invert-colors=never", "--open"] } [[language]] name = "typst" language-servers = ["tinymist"]

A warm thank you to my lovely friend Felix for showing me the live preview mode of tinymist!

Michael Meeks: 2026-04-08 Wednesday

Mër, 08/04/2026 - 11:00md
  • To work early, writing and planning, team call. Dug up an old mail of mine on inflated claims around tenders which seem to become a topic.
  • Encouraging partner call, sync with Chris and then Philippe.
  • Published the next strip: A Community Organizer - "aligning the community"
  • Had the first Collabora Office Technical Committee call - missing Miklos' careful preparation, next week will be better no doubt.
  • Interesting chat with an old FLOSS hand.
  • Band practice in the evening.

Michael Meeks: 2026-04-07 Tuesday

Mar, 07/04/2026 - 11:00md
  • Up early, mail chew, planning call, sync with Julie & Thorsten. Lunch, catch-up with Anna & Andras.
  • Poked at an update in response to TDF's blog with Chris.
  • TDF board call, Staff + Board + MC with no community questions - surreal, Dennis arrived but too late for the questions I suppose.
  • Read Die L%C3%B6sung which someone pointed me to.

Thibault Martin: TIL that Git can locally ignore files

Mar, 07/04/2026 - 7:35md

When editing markdown, I love using Helix (best editor in the world). I rely on three language servers to help me do it:

  • rumdl to check markdown syntax and enforce the rules decided by a project
  • marksman to get assistance when creating links
  • harper-ls to check for spelling or grammar mistakes

All of these are configured in my ~/.config/helix/languages.toml configuration file, so it applies globally to all the markdown I edit. But when I edit This Week In Matrix at work, things are different.

To edit those posts, we let our community report their progress in a Matrix room, we collect them into a markdown file that we then editorialized. This is a perfect fit for Helix (best editor in the world) and its language servers.

Helix has two features that make it a particularly good fit for the job

  1. The diagnostics view
  2. Jumping to next error with ]d

It is possible to filter out pickers, but it becomes tedious to do so. For this project specifically, I want to disable harper-ls entirely. Helix supports per-project configuration by creating a .helix/languages.toml file at the project's root.

It's a good solution to override my default config, but now I have an extra .helix directory that git wants to track. I could add it to the .gitignore, but that would also add it to everyone else's .gitignore, even if they don't use Helix (best editor in the world) yet.

It turns out that there is a local-only equivalent to .gitignore, and it's .git/info/exclude. The syntax is the same as .gitignore but it's not committed.

Update: several people reached out to point out that there are global options to locally ignore files, if you don't need to do it per-project. Those options are:

  • The global ~/.config/git/ignore file, with the same syntax as .gitignore
  • The configuration variable core.excludesFile to specify a file that contains which patterns to ignore, like a .gitignore

I can't believe I didn't need this earlier in my life.

Thibault Martin: TIL that You can filter Helix pickers

Mar, 07/04/2026 - 7:20md

Helix has a system of pickers. It's a pop up window to open files, or open diagnostics coming from a Language Server.

The diagnostics picker displays data in columns:

  • severity
  • source
  • code
  • message

Sometimes it can get very crowded, especially when you have plenty of hints but few actual errors. I didn't know it, but Helix supports filtering in pickers!

By typing %severity WARN I only get warnings. I can even shorten it to %se (and not %s, since source also starts with an s). The full syntax is well documented in the pickers documentation.

Andy Wingo: the value of a performance oracle

Mar, 07/04/2026 - 2:49md

Over on his excellent blog, Matt Keeter posts some results from having ported a bytecode virtual machine to tail-calling style. He finds that his tail-calling interpreter written in Rust beats his switch-based interpreter, and even beats hand-coded assembly on some platforms.

He also compares tail-calling versus switch-based interpreters on WebAssembly, and concludes that performance of tail-calling interpreters in Wasm is terrible:

1.2× slower on Firefox, 3.7× slower on Chrome, and 4.6× slower in wasmtime. I guess patterns which generate good assembly don't map well to the WASM stack machine, and the JITs aren't smart enough to lower it to optimal machine code.

In this article, I would like to argue the opposite: patterns that generate good assembly map just fine to the Wasm stack machine, and the underperformance of V8, SpiderMonkey, and Wasmtime is an accident.

some numbers

I re-ran Matt’s experiment locally on my x86-64 machine (AMD Ryzen Threadripper PRO 5955WX). I tested three toolchains:

  • Compiled natively via cargo / rustc

  • Compiled to WebAssembly, then run with Wasmtime

  • Compiled to WebAssembly, then run with Wastrel

For each of these toolchains, I tested Raven as implemented in Rust in both “switch-based” and “tail-calling” modes. Additionally, Matt has a Raven implementation written directly in assembly; I test this as well, for the native toolchain. All results use nightly/git toolchains from 7 April 2026.

My results confirm Matt’s for the native and wasmtime toolchains, but wastrel puts them in context:

We can read this chart from left to right: a switch-based interpreter written in Rust is 1.5× slower than a tail-calling interpreter, and the tail-calling interpreter just about reaches the speed of hand-written assembler. (Testing on AArch64, Matt even sees the tail-calling interpreter beating his hand-written assembler.)

Then moving to WebAssembly run using Wasmtime, we see that Wasmtime takes 4.3× as much time to run the switch-based interpreter, compared to the fastest run from the hand-written assembler, and worse, actually shows 6.5× overhead for the tail-calling interpreter. Hence Matt’s conclusions: there must be something wrong with WebAssembly.

But if we compare to Wastrel, we see a different story: Wastrel runs the basic interpreter with 2.4× overhead, and the tail-calling interpreter improves on this marginally with a 2.3x overhead. Now, granted, two-point-whatever-x is not one; Matt’s Raven VM still runs slower in Wasm than when compiled natively. Still, a tail-calling interpreter is inherently a pretty good idea.

where does the time go

When I think about it, there’s no reason that the switch-based interpreter should be slower when compiled via Wastrel than when compiled via rustc. Memory accesses via Wasm should actually be cheaper due to 32-bit pointers, and all the rest of it should be pretty much the same. I looked at the assembly that Wastrel produces and I see most of the patterns that I would expect.

I do see, however, that Wastrel repeatedly reloads a struct memory value, containing the address (and size) of main memory. I need to figure out a way to keep this value in registers. I don’t know what’s up with the other Wasm implementations here; for Wastrel, I get 98% of time spent in the single interpreter function, and surely this is bread-and-butter for an optimizing compiler such as Cranelift. I tried pre-compilation in Wasmtime but it didn’t help. It could be that there is a different Wasmtime configuration that allows for higher performance.

Things are more nuanced for the tail-calling VM. When compiling natively, Matt is careful to use a preserve_none calling convention for the opcode-implementing functions, which allows LLVM to allocate more registers to function parameters; this is just as well, as it seems that his opcodes have around 9 parameters. Wastrel currently uses GCC’s default calling convention, which only has 6 registers for non-floating-point arguments on x86-64, leaving three values to be passed via global variables (described here); this obviously will be slower than the native build. Perhaps Wastrel should add the equivalent annotation to tail-calling functions.

On the one hand, Cranelift (and V8) are a bit more constrained than Wastrel by their function-at-a-time compilation model that privileges latency over throughput; and as they allow Wasm modules to be instantiated at run-time, functions are effectively closures, in which the “instance” is an additional hidden dynamic parameter. On the other hand, these compilers get to choose an ABI; last I looked into it, SpiderMonkey used the equivalent of preserve_none, which would allow it to allocate more registers to function parameters. But it doesn’t: you only get 6 register arguments on x86-64, and only 8 on AArch64. Something to fix, perhaps, in the Wasm engines, but also something to keep in mind when making tail-calling virtual machines: there are only so many registers available for VM state.

the value of time

Well friends, you know us compiler types: we walk a line between collegial and catty. In that regard, I won’t deny that I was delighted when I saw the Wastrel numbers coming in better than Wasmtime! Of course, most of the credit goes to GCC; Wastrel is a relatively small wrapper on top.

But my message is not about the relative worth of different Wasm implementations. Rather, it is that performance oracles are a public good: a fast implementation of a particular algorithm is of use to everyone who uses that algorithm, whether they use that implementation or not.

This happens in two ways. Firstly, faster implementations advance the state of the art, and through competition-driven convergence will in time result in better performance for all implementations. Someone in Google will see these benchmarks, turn them into an OKR, and golf their way to a faster web and also hopefully a bonus.

Secondly, there is a dialectic between the state of the art and our collective imagination of what is possible, and advancing one will eventually ratchet the other forward. We can forgive the conclusion that “patterns which generate good assembly don’t map well to the WASM stack machine” as long as Wasm implementations fall short; but having shown that good performance is possible, our toolkit of applicable patterns in source languages also expands to new horizons.

Well, that is all for today. Until next time, happy hacking!

Michael Meeks: 2026-04-06 Monday

Hën, 06/04/2026 - 11:00md
  • Up early, out for a run with J. poked at mail, and web bits. hmm. Contemplated the latest barrage of oddity from TDF.
  • Set too cutting out cardboard pieces to position tools in the workshop with H. spent qiute some time re-arranging the workshop to try to fit everything in. Binned un-needed junk, moved things around happily. Mended an old light - a wedding present from a friend.
  • Relaxed with J. and watched Upload in the evening, tired.

Jussi Pakkanen: Sorting performance rabbit hole

Hën, 06/04/2026 - 5:42md

In an earlier blog post we found out that Pystd's simple sorting algorithm implementations were 5-10% slower than their stdlibc++ counterparts. The obvious follow up nerd snipe is to ask "can we make the Pystd implementation faster than stdlibc++?"

For all tests below the data set used was 10 million consecutive 64 bit integers shuffled in a random order. The order was the same for all algorithms.

Stable sort

It turns out that the answer for stable sorting is "yes, surprisingly easily". I made a few obvious tweaks (whose details I don't even remember any more) and got the runtime down to 0.86 seconds. This is approximately 5% faster than std::stable_sort. Done. Onwards to unstable sort.

Unstable sort

This one was not, as they say, a picnic. I suspect that stdlib developers have spent more time optimizing std::sort than std::stable_sort simply because it is used a lot more.

After all the improvements I could think of were done, Pystd's implementation was consistently 5-10% percent slower. At this point I started cheating and examined how stdlibc++'s implementation worked to see if there are any optimization ideas to steal. Indeed there were, but they did not help.

Pystd's insertion sort moves elements by pairwise swaps. Stdlibc++ does it by moving the last item to a temporary, shifting the array elements onwards and then moving the stored item to its final location. I implemented that. It made things slower.

Stdlibc++'s moves use memmove instead of copying (at least according to code comments).  I implemented that. It made things slower.

Then I implemented shell sort to see if it made things faster. It didn't. It made them a lot slower.

Then I reworked the way pivot selection is done and realized that if you do it in a specific way, some elements move to their correct partitions as a side effect of median selection. I implemented that and it did not make things faster. It did not make them slower, either, but the end result should be more resistant against bad pivot selection so I left it in.

At some point the implementation grew a bug which only appeared with very large data sets. For debugging purposes I reduce the limit where introsort switches from qsort to insertion sort from 16 to 8. I got the bug fixed but the change made sorting a lot slower. As it should.

But this raises a question, namely would increasing the limit from 16 to 32 make things faster? It turns out that it did. A lot. Out of all perf improvements I implemented, this was the one that yielded the biggest improvement. By a lot. Going to 64 elements made it even faster, but that made other algorithms using insertion sort slower, so 32 it is. For now at least.

After a few final tweaks I managed to finally beat stdlibc++. By how much you ask? Pystd's best observed time was 0.754 seconds while stdlibc++'s was 0.755 seconds. And it happened only once. But that's enough for me.

Michael Meeks: 2026-04-05 Sunday

Dje, 05/04/2026 - 11:00md
  • Slept badly; All Saints in the morning - Easter day. Lovely to play with H. and Jenny - fun - lots of people there.
  • Back for a big roast-lamb lunch with visiting family, Tried to sleep somewhat, prepped for the evening service - preaching at the last minute too; ran that, back.
  • Caught up with S&C&boys a bit; bid them 'bye, A. staying; rested somewhat, headed to bed.

Jakub Steiner: Japan

Dje, 05/04/2026 - 2:00pd

Last year we went to Japan to finally visit friends after two decades of planning to. Because they live in Fukuoka, we only ended up visiting Hiroshima, Kyoto and Osaka afterwards. We loved it there and as soon as cheap flights became available, booked another one for Tokio, to be legally allowed to cross off Japan as visited.

Now if I were to book the trip today, I probably wouldn't. It's quite a gamble given the geopolitical situation and Asia running out of oil. But making it back, it's been as good as the first one. Visiting only Tokio with a short trip to Kawaguchiko in the Sakura blooming season worked out great.

At the start of the year I promised myself to shoot my Fuji more. And I don't mean the volcano, I mean the my X-T20. I haven't kept the promise at all, always relying on the iphone. Luckily for the trip I didn't chicken out carrying the extra weight and I think it paid off. I did only take my 35mm, as the desire to carry gear has really faded away with the years. As we walked over 120km in the few days my back didn't feel very young even with the little gear I did have.

While the difference in quality isn't quite visible on Pixelfed or my photo website (I don't post to Instagram anymore), working through the set on a 4K display has been a pleasure. Bigger sensor is a bigger sensor.

Check out more photos on photo.jimmac.eu -- use arrow keys of swipe to navigate the set.

Weeklybeats #13.

I also managed to get both of my weeklybeats tracks done on the flight so that's a bonus too!

Japan is probably quite difficult to live in, but as a tourist you get so much to feast your eyes on. It's like another planet. I hope to find more time to draw some of the awesome little cars and signs and white tiles and electric cables everywhere.

Michael Meeks: 2026-04-04 Saturday

Sht, 04/04/2026 - 11:00md
  • Up earlyish, poked at some work. We all drove to Aldeburgh with the family to begin the sad task of sorting through Bruce's things with Anne, S&C& boys.
  • A somewhat draining day; death is such a sad thing, but good wider family spirit.
  • Picked up fish & chips in Aldeburgh on the way back; tragically helped at the aftermath of an extremely grisly, run-over pedestrian in Aldeburgh high-street, even sadder.