Planet GNOME
https://planet.gnome.org/

Thibault Martin: TIL that Git can locally ignore files

Tue, 07/04/2026 - 7:35 PM

When editing markdown, I love using Helix (best editor in the world). I rely on three language servers to help me do it:

  • rumdl to check markdown syntax and enforce the rules decided by a project
  • marksman to get assistance when creating links
  • harper-ls to check for spelling or grammar mistakes

All of these are configured in my ~/.config/helix/languages.toml configuration file, so they apply globally to all the markdown I edit. But when I edit This Week In Matrix at work, things are different.

To produce those posts, we let our community report their progress in a Matrix room, collect the reports into a markdown file, and then editorialize it. This is a perfect fit for Helix (best editor in the world) and its language servers.

Helix has two features that make it a particularly good fit for the job:

  1. The diagnostics view
  2. Jumping to next error with ]d

It is possible to filter pickers, but doing so every time becomes tedious. For this project specifically, I want to disable harper-ls entirely. Helix supports per-project configuration via a .helix/languages.toml file at the project's root, as sketched below.
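A minimal sketch of that override, assuming harper-ls, rumdl and marksman are the language server keys used in my global config:

```toml
# .helix/languages.toml — per-project override
[[language]]
name = "markdown"
# harper-ls is left out, so only these two run in this project:
language-servers = ["rumdl", "marksman"]
```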

It's a good solution to override my default config, but now I have an extra .helix directory that git wants to track. I could add it to the project's .gitignore, but that would push my editor's configuration onto everyone else's clone, even if they don't use Helix (best editor in the world) yet.

It turns out that there is a local-only equivalent to .gitignore, and it's .git/info/exclude. The syntax is the same as .gitignore but it's not committed.
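For example, to ignore the new .helix directory in this clone only:

```
$ echo ".helix/" >> .git/info/exclude
$ git status   # .helix/ no longer shows up as untracked
```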

Update: several people reached out to point out that there are global options to locally ignore files, if you don't need to do it per-project. Those options are:

  • The global ~/.config/git/ignore file, with the same syntax as .gitignore
  • The configuration variable core.excludesFile to specify a file containing the patterns to ignore, like a .gitignore (as shown below)
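For example:

```
$ git config --global core.excludesFile ~/.config/git/ignore
```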

I can't believe I didn't need this earlier in my life.

Thibault Martin: TIL that you can filter Helix pickers

Tue, 07/04/2026 - 7:20 PM

Helix has a system of pickers: pop-up windows for opening files or browsing diagnostics coming from a language server.

The diagnostics picker displays data in columns:

  • severity
  • source
  • code
  • message

Sometimes it can get very crowded, especially when you have plenty of hints but few actual errors. I didn't know it, but Helix supports filtering in pickers!

By typing %severity WARN I only get warnings. I can even shorten it to %se (and not %s, since source also starts with an s). The full syntax is well documented in the pickers documentation.
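For example, in the diagnostics picker, either of these keeps only the warnings:

```
%severity WARN
%se WARN
```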

Andy Wingo: the value of a performance oracle

Tue, 07/04/2026 - 2:49 PM

Over on his excellent blog, Matt Keeter posts some results from having ported a bytecode virtual machine to tail-calling style. He finds that his tail-calling interpreter written in Rust beats his switch-based interpreter, and even beats hand-coded assembly on some platforms.

He also compares tail-calling versus switch-based interpreters on WebAssembly, and concludes that performance of tail-calling interpreters in Wasm is terrible:

1.2× slower on Firefox, 3.7× slower on Chrome, and 4.6× slower in wasmtime. I guess patterns which generate good assembly don't map well to the WASM stack machine, and the JITs aren't smart enough to lower it to optimal machine code.

In this article, I would like to argue the opposite: patterns that generate good assembly map just fine to the Wasm stack machine, and the underperformance of V8, SpiderMonkey, and Wasmtime is an accident.

some numbers

I re-ran Matt’s experiment locally on my x86-64 machine (AMD Ryzen Threadripper PRO 5955WX). I tested three toolchains:

  • Compiled natively via cargo / rustc

  • Compiled to WebAssembly, then run with Wasmtime

  • Compiled to WebAssembly, then run with Wastrel

For each of these toolchains, I tested Raven as implemented in Rust in both “switch-based” and “tail-calling” modes. Additionally, Matt has a Raven implementation written directly in assembly; I test this as well, for the native toolchain. All results use nightly/git toolchains from 7 April 2026.

My results confirm Matt’s for the native and wasmtime toolchains, but wastrel puts them in context:

We can read this chart from left to right: a switch-based interpreter written in Rust is 1.5× slower than a tail-calling interpreter, and the tail-calling interpreter just about reaches the speed of hand-written assembler. (Testing on AArch64, Matt even sees the tail-calling interpreter beating his hand-written assembler.)

Then moving to WebAssembly run using Wasmtime, we see that Wasmtime takes 4.3× as much time to run the switch-based interpreter, compared to the fastest run from the hand-written assembler, and worse, actually shows 6.5× overhead for the tail-calling interpreter. Hence Matt’s conclusions: there must be something wrong with WebAssembly.

But if we compare to Wastrel, we see a different story: Wastrel runs the basic interpreter with 2.4× overhead, and the tail-calling interpreter improves on this marginally with a 2.3× overhead. Now, granted, two-point-whatever-x is not one; Matt’s Raven VM still runs slower in Wasm than when compiled natively. Still, a tail-calling interpreter is inherently a pretty good idea.

where does the time go

When I think about it, there’s no reason that the switch-based interpreter should be slower when compiled via Wastrel than when compiled via rustc. Memory accesses via Wasm should actually be cheaper due to 32-bit pointers, and all the rest of it should be pretty much the same. I looked at the assembly that Wastrel produces and I see most of the patterns that I would expect.

I do see, however, that Wastrel repeatedly reloads a struct field containing the address (and size) of main memory. I need to figure out a way to keep this value in registers. I don’t know what’s up with the other Wasm implementations here; for Wastrel, I get 98% of time spent in the single interpreter function, and surely this is bread-and-butter for an optimizing compiler such as Cranelift. I tried pre-compilation in Wasmtime but it didn’t help. It could be that there is a different Wasmtime configuration that allows for higher performance.

Things are more nuanced for the tail-calling VM. When compiling natively, Matt is careful to use a preserve_none calling convention for the opcode-implementing functions, which allows LLVM to allocate more registers to function parameters; this is just as well, as it seems that his opcodes have around 9 parameters. Wastrel currently uses GCC’s default calling convention, which only has 6 registers for non-floating-point arguments on x86-64, leaving three values to be passed via global variables (described here); this obviously will be slower than the native build. Perhaps Wastrel should add the equivalent annotation to tail-calling functions.
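As a sketch of the pattern under discussion (not Matt’s Raven code nor Wastrel’s output), a tail-calling opcode handler in C looks roughly like this; preserve_none is a clang attribute, and the musttail return attribute needs a recent clang or GCC:

```c
#include <stdint.h>

struct vm;  /* interpreter state elided */

#if defined(__clang__)
#define VMCC __attribute__((preserve_none))  /* free up argument registers */
#else
#define VMCC
#endif

/* Every opcode handler shares one signature and tail-calls the next
   handler through a dispatch table, so VM state stays in registers. */
typedef void (VMCC *opcode_fn)(struct vm *vm, const uint8_t *pc,
                               uint64_t *regs);

extern const opcode_fn dispatch[256];

static VMCC void op_add(struct vm *vm, const uint8_t *pc, uint64_t *regs)
{
  regs[pc[1]] = regs[pc[2]] + regs[pc[3]];  /* r[a] = r[b] + r[c] */
  pc += 4;
  /* The guaranteed tail call turns dispatch into a chain of jumps, as
     in hand-written assembly; the stack never grows. */
  __attribute__((musttail)) return dispatch[pc[0]](vm, pc, regs);
}
```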

On the one hand, Cranelift (and V8) are a bit more constrained than Wastrel by their function-at-a-time compilation model that privileges latency over throughput; and as they allow Wasm modules to be instantiated at run-time, functions are effectively closures, in which the “instance” is an additional hidden dynamic parameter. On the other hand, these compilers get to choose an ABI; last I looked into it, SpiderMonkey used the equivalent of preserve_none, which would allow it to allocate more registers to function parameters. But it doesn’t: you only get 6 register arguments on x86-64, and only 8 on AArch64. Something to fix, perhaps, in the Wasm engines, but also something to keep in mind when making tail-calling virtual machines: there are only so many registers available for VM state.

the value of time

Well friends, you know us compiler types: we walk a line between collegial and catty. In that regard, I won’t deny that I was delighted when I saw the Wastrel numbers coming in better than Wasmtime! Of course, most of the credit goes to GCC; Wastrel is a relatively small wrapper on top.

But my message is not about the relative worth of different Wasm implementations. Rather, it is that performance oracles are a public good: a fast implementation of a particular algorithm is of use to everyone who uses that algorithm, whether they use that implementation or not.

This happens in two ways. Firstly, faster implementations advance the state of the art, and through competition-driven convergence will in time result in better performance for all implementations. Someone in Google will see these benchmarks, turn them into an OKR, and golf their way to a faster web and also hopefully a bonus.

Secondly, there is a dialectic between the state of the art and our collective imagination of what is possible, and advancing one will eventually ratchet the other forward. We can forgive the conclusion that “patterns which generate good assembly don’t map well to the WASM stack machine” as long as Wasm implementations fall short; but having shown that good performance is possible, our toolkit of applicable patterns in source languages also expands to new horizons.

Well, that is all for today. Until next time, happy hacking!

Michael Meeks: 2026-04-06 Monday

Mon, 06/04/2026 - 11:00 PM
  • Up early, out for a run with J.; poked at mail and web bits. Hmm. Contemplated the latest barrage of oddity from TDF.
  • Set to cutting out cardboard pieces to position tools in the workshop with H.; spent quite some time re-arranging the workshop to try to fit everything in. Binned un-needed junk, moved things around happily. Mended an old light - a wedding present from a friend.
  • Relaxed with J. and watched Upload in the evening, tired.

Jussi Pakkanen: Sorting performance rabbit hole

Mon, 06/04/2026 - 5:42 PM

In an earlier blog post we found out that Pystd's simple sorting algorithm implementations were 5-10% slower than their stdlibc++ counterparts. The obvious follow up nerd snipe is to ask "can we make the Pystd implementation faster than stdlibc++?"

For all tests below, the data set used was 10 million consecutive 64-bit integers shuffled into a random order. The order was the same for all algorithms.

Stable sort

It turns out that the answer for stable sorting is "yes, surprisingly easily". I made a few obvious tweaks (whose details I don't even remember any more) and got the runtime down to 0.86 seconds. This is approximately 5% faster than std::stable_sort. Done. Onwards to unstable sort.

Unstable sort

This one was not, as they say, a picnic. I suspect that stdlib developers have spent more time optimizing std::sort than std::stable_sort simply because it is used a lot more.

After all the improvements I could think of were done, Pystd's implementation was consistently 5-10% slower. At this point I started cheating and examined how stdlibc++'s implementation worked to see if there were any optimization ideas to steal. Indeed there were, but they did not help.

Pystd's insertion sort moves elements by pairwise swaps. Stdlibc++ does it by moving the last item to a temporary, shifting the array elements onwards and then moving the stored item to its final location. I implemented that. It made things slower.
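For concreteness, a sketch of the two move strategies on plain 64-bit integers (illustrative, not Pystd's actual code):

```c
#include <stddef.h>
#include <stdint.h>

/* Pairwise swaps: every step writes two elements. */
void insert_by_swaps(int64_t *a, size_t i) {
  while (i > 0 && a[i] < a[i - 1]) {
    int64_t tmp = a[i];
    a[i] = a[i - 1];
    a[i - 1] = tmp;
    --i;
  }
}

/* The stdlibc++ strategy: hold the item in a temporary, shift the
   larger elements one slot right, then store the item once. */
void insert_by_shift(int64_t *a, size_t i) {
  int64_t v = a[i];
  while (i > 0 && v < a[i - 1]) {
    a[i] = a[i - 1];
    --i;
  }
  a[i] = v;
}
```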

Stdlibc++'s moves use memmove instead of copying (at least according to code comments). I implemented that. It made things slower.

Then I implemented shell sort to see if it made things faster. It didn't. It made them a lot slower.

Then I reworked the way pivot selection is done and realized that if you do it in a specific way, some elements move to their correct partitions as a side effect of median selection. I implemented that and it did not make things faster. It did not make them slower, either, but the end result should be more resistant against bad pivot selection so I left it in.
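The general idea is something like a median-of-three that sorts its probes in place (a sketch of the technique, not Pystd's code):

```c
#include <stddef.h>
#include <stdint.h>

static void swap64(int64_t *x, int64_t *y) {
  int64_t t = *x; *x = *y; *y = t;
}

/* Median-of-three that sorts the three probes in place: afterwards
   a[lo] <= a[mid] <= a[hi], so the low and high probes already sit on
   the correct side of the pivot and never need to cross the partition
   boundary again. */
size_t median_of_three(int64_t *a, size_t lo, size_t mid, size_t hi) {
  if (a[mid] < a[lo]) swap64(&a[mid], &a[lo]);
  if (a[hi] < a[mid]) {
    swap64(&a[hi], &a[mid]);
    if (a[mid] < a[lo]) swap64(&a[mid], &a[lo]);
  }
  return mid;  /* pivot index */
}
```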

At some point the implementation grew a bug which only appeared with very large data sets. For debugging purposes I reduced the limit where introsort switches from quicksort to insertion sort from 16 to 8. I got the bug fixed, but the change made sorting a lot slower. As it should.

But this raises a question, namely would increasing the limit from 16 to 32 make things faster? It turns out that it did. A lot. Out of all perf improvements I implemented, this was the one that yielded the biggest improvement. By a lot. Going to 64 elements made it even faster, but that made other algorithms using insertion sort slower, so 32 it is. For now at least.
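In code, the knob in question is just the cutoff below which introsort stops recursing (a sketch reusing insert_by_shift from above; the recursive body is elided):

```c
#include <stddef.h>
#include <stdint.h>

void insert_by_shift(int64_t *a, size_t i);  /* as sketched earlier */

/* Ranges at or below this size skip the quicksort machinery and are
   finished with insertion sort. 32 is the value that won here; 16 was
   the old one. */
enum { INSERTION_CUTOFF = 32 };

static void small_sort(int64_t *a, size_t n) {
  for (size_t i = 1; i < n; i++)
    insert_by_shift(a, i);
}

void introsort(int64_t *a, size_t n, int depth_left) {
  if (n <= INSERTION_CUTOFF) {
    small_sort(a, n);
    return;
  }
  /* ... pick a pivot, partition, recurse on both halves, and fall back
     to heapsort when depth_left reaches zero ... */
}
```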

After a few final tweaks I managed to finally beat stdlibc++. By how much you ask? Pystd's best observed time was 0.754 seconds while stdlibc++'s was 0.755 seconds. And it happened only once. But that's enough for me.

Michael Meeks: 2026-04-05 Sunday

Sun, 05/04/2026 - 11:00 PM
  • Slept badly; All Saints in the morning - Easter day. Lovely to play with H. and Jenny - fun - lots of people there.
  • Back for a big roast-lamb lunch with visiting family; tried to sleep somewhat, prepped for the evening service - preaching at the last minute too; ran that, back.
  • Caught up with S&C&boys a bit; bid them 'bye, A. staying; rested somewhat, headed to bed.

Jakub Steiner: Japan

Sun, 05/04/2026 - 2:00 AM

Last year we went to Japan to finally visit friends after two decades of planning to. Because they live in Fukuoka, we only ended up visiting Hiroshima, Kyoto and Osaka afterwards. We loved it there and, as soon as cheap flights became available, booked another one for Tokyo, to be legally allowed to cross off Japan as visited.

Now if I were to book the trip today, I probably wouldn't. It's quite a gamble given the geopolitical situation and Asia running out of oil. But having made it back, it's been as good as the first one. Visiting only Tokyo, with a short trip to Kawaguchiko, in the Sakura blooming season worked out great.

At the start of the year I promised myself to shoot my Fuji more. And I don't mean the volcano, I mean my X-T20. I haven't kept the promise at all, always relying on the iPhone. Luckily for the trip I didn't chicken out of carrying the extra weight, and I think it paid off. I only took my 35mm, as the desire to carry gear has really faded away with the years. As we walked over 120km in the few days, my back didn't feel very young even with the little gear I did have.

While the difference in quality isn't quite visible on Pixelfed or my photo website (I don't post to Instagram anymore), working through the set on a 4K display has been a pleasure. Bigger sensor is a bigger sensor.

Check out more photos on photo.jimmac.eu -- use arrow keys or swipe to navigate the set.

Weeklybeats #13.

I also managed to get both of my weeklybeats tracks done on the flight so that's a bonus too!

Japan is probably quite difficult to live in, but as a tourist you get so much to feast your eyes on. It's like another planet. I hope to find more time to draw some of the awesome little cars and signs and white tiles and electric cables everywhere.

Michael Meeks: 2026-04-04 Saturday

Sat, 04/04/2026 - 11:00 PM
  • Up earlyish, poked at some work. We all drove to Aldeburgh with the family to begin the sad task of sorting through Bruce's things with Anne, S&C& boys.
  • A somewhat draining day; death is such a sad thing, but good wider family spirit.
  • Picked up fish & chips in Aldeburgh on the way back; tragically helped at the aftermath of an extremely grisly accident - a pedestrian run over in Aldeburgh high-street - even sadder.

Michael Meeks: TDF ejects its core developers

Thu, 02/04/2026 - 8:52 PM

For a rather more polished write-up, complete with pretty pictures, please see TDF ejects its core developers. Here is a more personal take. My feeling is that this action has been planned by the TDF rump board's majority for many months, if not for some years. While we have tried to avoid this outcome, it has eventually been forced on us.

Trends in TDF board membership

There are many great ways to contribute to FLOSS projects and coding is only one of them - let me underline that. However - coding is the primary production that drives many of the other valuable contributions: translation, marketing, etc. We have been blessed to have many excellent developers around LibreOffice, but coders' representation on the board has been declining. This means losing a valuable part of the board's perspective on the complexity of the problem. The elected board is (typically) ten people - seven full board members, and three deputies as spares if needed. Here is how that looks:

Another way of looking at board composition is to look at board members' affiliations over time. Of course affiliations change - sometimes during a board term, but this is the same graph broken down by (rough) affiliation:

What is easy to see is the huge drop in corporate affiliation - along with all the business experience that brings. The added '2026' is for the current 'rump' board, which continues to over-stay its term and has also lost several of its most popular developer members - including Eike, recently of RedHat (because of a personal tragedy), and Bjoern. In 'Interested' I include those with a business interest in LibreOffice who are not part of a larger corporation: Laszlo is the last coder on the board.

One of the major surprises of the 2024 election was the 'TDF' chunk, in which I bucket paid TDF staff and those closely related to them. The current chair of the TDF board (Eliane), who manages the Executive Director (ED), is curiously related to a staff member who is managed by the ED - arguably an extremely poor governance practice. Having three TDF-affiliated directors is also in contradiction of the statutes.

It is also worth noting that for over two years, no Collaboran or any of our partners have been on the TDF board. It was hoped that this would give ample time and space to address any of the issues left from previous boards.

Meritocracy - do-ers decide

TDF is defined as a meritocracy in its statutes. Why is that? The experience we had from the OpenOffice project was that often those who were doing the work were excluded from decision making. That made it hard to get teams to scale and to make quick decisions, stopped leaders growing in their areas, and removed an incentive to contribute more, among many other problems.

Some claim that the sole manifestation of the statute's requirement for meritocracy is a flat entry / membership criteria (as every other organization has). This seems to me to be near to the root of the problems here. Those used to functioning FLOSS projects find it hard to understand why you wouldn't at least listen to those who are working hardest to improve things in whatever area. These days some at TDF seem to emphasize equality instead.

It is interesting then to see the (controversially appointed) Membership Committee overturning the last election - ejecting people, without any thanks or apology, who have contributed so very much over so many years. We built a quick tool to count those contributions. This excludes the long-departed Sun Microsystems release engineers who committed many other people's patches for them - and it struggles to 'see' into the CVS era where branches were flattened; but ... as far as git and gitdm-alias files can easily tell, this is the picture of the top committers to the largest 'core' code repository over all time.

| Name | Commits | Last commit | Affiliation |
| --- | --- | --- | --- |
| Caolán McNamara | 37,556 | | Collabora |
| Stephan Bergmann | 21,732 | | Collabora |
| Noel Grandin | 20,851 | | Collabora |
| Miklos Vajna | 10,466 | | Collabora |
| Tor Lillqvist | 9,233 | | Collabora |
| Michael Stahl | 8,742 | | Collabora |
| Kohei Yoshida | 5,655 | | Collabora |
| Eike Rathke | 5,398 | | Volunteer/RedHat |
| Markus Mohrhard | 5,230 | | Volunteer |
| Frank Schönheit | 5,025 | 2011 | Sun/Oracle |
| Michael Weghorn | 4,956 | | TDF |
| Mike Kaganski | 4,864 | | Collabora |
| Andrea Gelmini | 4,582 | | Volunteer |
| Xisco Fauli | 4,215 | | TDF |
| Julien Nabet | 4,031 | | Volunteer |
| Tomaž Vajngerl | 3,797 | | Collabora |
| David Tardon | 3,648 | 2021 | RedHat |
| Luboš Luňák | 3,201 | | Collabora |
| Hans-Joachim Lankenau | 3,007 | 2011 | Sun/Oracle |
| Ocke Janssen | 2,852 | 2011 | Sun/Oracle |
| Oliver Specht | 2,699 | | Sun/Oracle |
| Jan Holesovsky | 2,689 | | Collabora |
| Mathias Bauer | 2,580 | 2011 | Sun/Oracle |
| Olivier Hallot | 2,561 | | TDF |
| Michael Meeks | 2,553 | | Collabora |
| Bjoern Michaelsen | 2,503 | | Volunteer/Canonical |
| Norbert Thiebaud | 2,176 | 2017 | Volunteer |
| Thomas Arnhold | 2,176 | 2014 | Volunteer |
| Andras Timar | 2,099 | | Collabora |
| Philipp Lohmann | 2,096 | 2011 | Sun/Oracle |

It is a humbling privilege for me to serve in such a dedicated team of people who have contributed so much. Take just one example - Caolán has worked from StarDivision to Sun to Oracle to RedHat to Collabora; 37,000 commits in ~25 years - ~four per day sustained, every day. By grokking his commits quickly you can see that this is far more than a job - over 6,000 of those were at the weekend, and of course commits don't show the reviews, mentoring, love and care and more. That is just one contributor - but the passion scales across the rest of the team.

Why remove individuals?

While writing this, a response from TDF showed up. While there are things to welcome, it seems that this speculative concern about individual contributors is at its core:

"people made decisions in the interest of their employers rather than in the interest of The Document Foundation."

Really!? The primary privilege that members of TDF have is voting for their representatives in elections, and this right is earned only by contribution. Elections are secret ballots. So it seems the most plausible reason for disenfranchising so many is an unhealthy fear of the electorate. Is it possible that the board majority want to avoid accountability for their actions at the next election (which is already delayed without adequate explanation), like this:

I have no idea how our staff voted in past elections - but I have to assume they did so with integrity, for the best for TDF as they saw it at the time. It seems that a more plausible reason to remove such long-term contributors is electoral gerrymandering.

Some thank yous

After 15+ years of service with LibreOffice, it is unfortunate to be ejected. It is possible to imagine a counter-factual world where this might actually be necessary. But even in this case - to do so with no thank-you or apology is unconscionable. It is great to see the team making up for that by publicly thanking their colleagues as they are kicked out. I found it deeply encouraging to remember and celebrate all the fantastic work that has been contributed; let me add my own big thank you to everyone!

Where we are now

Well, much more can be said; perhaps I'll update this later with more details as they emerge, but for now we're re-focusing on making Collabora Office great, getting our gerrit and CI humming smoothly, and starting to dung-out bits we are not using in the code-base. If you're interested in getting involved, have a wave in #cool-dev:matrix.org and join in - we welcome anyone to join us. Thanks for reading and trying to understand this tangled topic!

GNOME Shell and Mutter Development: What is new in GNOME Kiosk 50

Wed, 01/04/2026 - 11:06 AM

GNOME Kiosk, the lightweight, specialized compositor, continues to evolve in GNOME 50, adding new configuration options and improving accessibility.

Window configuration

User configuration file monitoring

The user configuration file gets reloaded when it changes on disk, so that it is not necessary to restart the session.

New placement options

New configuration options to constrain windows to monitors or regions on screen have been added:

  • lock-on-monitor: lock a window to a monitor.
  • lock-on-monitor-area: lock to an area relative to a monitor.
  • lock-on-area: lock to an absolute area.

These options are intended to replicate the legacy "Zaphod" mode from X11, where windows could be tied to a specific monitor. It even goes further than that, as it allows locking windows to a specific area of the screen.

The window/monitor association also holds when a monitor is disconnected. Take for example a multi-monitor setup where each monitor shows a different timetable. If one of the monitors is disconnected (for whatever reason), the timetable showing on that monitor should not be moved to a remaining monitor. The lock-on-monitor option prevents that.

Initial map behavior was tightened

Clients can resize or change their state before the window is mapped, so the size, position, and fullscreen state set from the configuration could be skipped. Kiosk now makes sure to apply the configured size, position, and fullscreen state on first map in cases where the initial configuration was not applied reliably.

Auto-fullscreen heuristics were adjusted
  • Only normal windows are considered when checking whether another window already covers the monitor (avoids false positives from e.g. xwaylandvideobridge).
  • The current window is excluded when scanning “other” fullscreen sized windows (fixes Firefox restoring monitor-sized geometry).
  • Maximized or fullscreen windows are no longer treated as non-resizable so toggling fullscreen still works when the client had already maximized.
Compositor behavior and command-line options

New command line options have been added:

  • --no-cursor: hides the pointer.
  • --force-animations: forces animations to be enabled.
  • --enable-vt-switch: restores VT switching with the keyboard.

The --no-cursor option can be used to hide the pointer cursor entirely for setups where user input does not involve a pointing device (it is similar to the -nocursor option in Xorg).

Animations can now be disabled using the desktop settings, and will also be automatically disabled, for performance reasons, when the backend reports no hardware-accelerated rendering. The option --force-animations can be used to forcibly enable animations in that case, similar to GNOME Shell.

The native keybindings, which include the VT-switching keyboard shortcuts, are now disabled by default for kiosk hardening. Applications that rely on the user being able to switch to another console VT on Linux, such as Anaconda, will need to explicitly re-enable VT switching using --enable-vt-switch in their session.

These options need to be passed on the command line starting gnome-kiosk, which implies updating the systemd unit files, or better, creating a custom one (modeled on the ones provided with the GNOME Kiosk sessions).
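For example, a systemd drop-in could do it; the unit name here is hypothetical, check the session definitions shipped with GNOME Kiosk for the real one:

```
# ~/.config/systemd/user/gnome-kiosk.service.d/override.conf
# Hypothetical unit name; adjust to the unit your kiosk session uses.
[Service]
ExecStart=
ExecStart=/usr/bin/gnome-kiosk --no-cursor
```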

Accessibility

Accessibility panel

An example of an accessibility panel is now included, to control the platform accessibility settings with a GUI. It is a simple Python application using GTK4.

(The gsettings options are also documented in the CONFIG.md file.)

Screen magnifier

Desktop magnification is now implemented, using the same settings as the rest of the GNOME desktop (namely screen-magnifier-enabled and mag-factor; see the CONFIG.md file for details).

It can be enabled from the accessibility panel or from the keyboard shortcuts through the gnome-settings-daemon's "mediakeys" plugin.

Accessibility settings

The default systemd session units now start the gnome-settings-daemon accessibility plugin so that Orca (the screen reader) can be enabled through the dedicated keyboard shortcut.

Notifications
  • A new, optional notification daemon implements org.freedesktop.Notifications and org.gtk.Notifications using GTK 4 and libadwaita.
  • A small utility to send notifications via org.gtk.Notifications is also provided.
Input sources

GNOME Kiosk was ported to Mutter's new keymap API, which allows remote desktop servers to mirror the keyboard layout used on the client side.

Session files and systemd
    • X-GDM-SessionRegister is now set to false in kiosk sessions as GNOME Kiosk does not register the session itself (unlike GNOME Shell). That fixes a hang when terminating the session.
    • Script session: systemd is no longer instructed to restart the session when the script exits, so that users can log out of the script session when the script terminates.

Matthew Garrett: Self hosting as much of my online presence as practical

Wed, 01/04/2026 - 4:35 AM

Because I am bad at giving up on things, I’ve been running my own email server for over 20 years. Some of that time it’s been a PC at the end of a DSL line, some of that time it’s been a Mac Mini in a data centre, and some of that time it’s been a hosted VM. Last year I decided to bring it in house, and since then I’ve been gradually consolidating as much of the rest of my online presence as possible on it. I mentioned this on Mastodon and a couple of people asked for more details, so here we are.

First: my ISP doesn’t guarantee a static IPv4 unless I’m on a business plan, and that seems like it’d cost a bunch more, so I’m doing what I described here: running a Wireguard link between a box that sits in a cupboard in my living room and the smallest OVH instance I could get, with an additional IP address allocated to the VM and NATted over the VPN link. The practical outcome of this is that my home IP address is irrelevant and can change as much as it wants - my DNS points at the OVH IP, and traffic to that all ends up hitting my server.
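A sketch of the OVH-side plumbing for that setup; every name, key, and address here is invented for illustration:

```
# /etc/wireguard/wg0.conf on the OVH VM
[Interface]
Address = 10.8.0.1/24
ListenPort = 51820
PrivateKey = <vm-private-key>
# Send traffic arriving for the additional IP down the tunnel,
# and masquerade the replies:
PostUp = iptables -t nat -A PREROUTING -d 198.51.100.7 -j DNAT --to-destination 10.8.0.2
PostUp = iptables -t nat -A POSTROUTING -o wg0 -j MASQUERADE

[Peer]
# The box in the living-room cupboard; its home IP can change freely.
PublicKey = <home-public-key>
AllowedIPs = 10.8.0.2/32
```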

The server itself is pretty uninteresting. It’s a refurbished HP EliteDesk which idles at 10W or so, along with 2TB of NVMe and 32GB of RAM that I found under a pile of laptops in my office. We’re not talking rackmount Xeon levels of performance, but it’s entirely adequate for everything I’m doing here.

So. Let’s talk about the services I’m hosting.

Web

This one’s trivial. I’m not really hosting much of a website right now, but what there is is served via Apache with a Let’s Encrypt certificate. Nothing interesting at all here, other than the proxying that’s going to be relevant later.

Email

Inbound email is easy enough. I’m running Postfix with a pretty stock configuration, and my MX records point at me. The same Let’s Encrypt certificate is there for TLS delivery. I’m using Dovecot as an IMAP server (again with the same cert). You can find plenty of guides on setting this up.

Outbound email? That’s harder. I’m on a residential IP address, so if I send email directly nobody’s going to deliver it. Going via my OVH address isn’t going to be a lot better. I have a Google Workspace, so in the end I just made use of Google’s SMTP relay service. There are various commercial alternatives available; I just chose this one because it didn’t cost me anything more than I’m already paying.

Blog

My blog is largely static content generated by Hugo. Comments are Remark42 running in a Docker container. If you don’t want to handle even that level of dynamic content you can use a third party comment provider like Disqus.

Mastodon

I’m deploying Mastodon pretty much along the lines of the upstream compose file. Apache is proxying /api/v1/streaming to the websocket provided by the streaming container, and / to the actual Mastodon service. The only thing I tripped over for a while was the need to set the “X-Forwarded-Proto” header, since otherwise you get stuck in a redirect loop: Mastodon receives a request over http (because TLS termination is being done by the Apache proxy) and redirects to https, except that’s where we just came from.
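Roughly, the relevant Apache configuration looks like this (a sketch with the compose file’s default ports, not my exact config):

```
<VirtualHost *:443>
    ServerName mastodon.example.org
    # TLS terminates here, so tell Mastodon the original scheme to
    # avoid the http -> https redirect loop described above:
    RequestHeader set X-Forwarded-Proto "https"
    ProxyPreserveHost On
    ProxyPass /api/v1/streaming ws://127.0.0.1:4000/api/v1/streaming
    ProxyPass / http://127.0.0.1:3000/
    ProxyPassReverse / http://127.0.0.1:3000/
</VirtualHost>
```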

Mastodon is easily the heaviest part of all of this, using around 5GB of RAM and 60GB of disk for an instance with 3 users. This is more a point of principle than an especially good idea.

Bluesky

I’m arguably cheating here. Bluesky’s federation model is quite different to Mastodon - while running a Mastodon service implies running the webview and other infrastructure associated with it, Bluesky has split that into multiple parts. User data is stored on Personal Data Servers, then aggregated from those by Relays, and then displayed on Appviews. Third parties can run any of these, but a user’s actual posts are stored on a PDS. There are various reasons to run the others, for instance to implement alternative moderation policies, but if all you want is to ensure that you have control over your data, running a PDS is sufficient. I followed these instructions, other than using Apache as the frontend proxy rather than nginx, and it’s all been working fine since then. In terms of ensuring that my data remains under my control, it’s sufficient.

Backups

I’m using borgmatic, backing up to a local Synology NAS and also to my parents’ home (where I have another HP EliteDesk set up with an equivalent OVH IPv4 fronting setup). At some point I’ll check that I’m actually able to restore them.

Conclusion

Most of what I post is now stored on a system that’s happily living under a TV, but is available to the rest of the world just as visibly as if I used a hosted provider. Is this necessary? No. Does it improve my life? In no practical way. Does it generate additional complexity? Absolutely. Should you do it? Oh good heavens no. But you can, and once it’s working it largely just keeps working, and there’s a certain sense of comfort in knowing that my online presence is carefully contained in a small box making a gentle whirring noise.

Andy Wingo: wastrelly wabbits

Tue, 31/03/2026 - 10:34 PM

Good day! Today (tonight), some notes on the last couple months of Wastrel, my ahead-of-time WebAssembly compiler.

Back in the beginning of February, I showed Wastrel running programs that use garbage collection, using an embedded copy of the Whippet collector, specialized to the types present in the Wasm program. But, the two synthetic GC-using programs I tested on were just ported microbenchmarks, and didn’t reflect the output of any real toolchain.

In this cycle I worked on compiling the output from the Hoot Scheme-to-Wasm compiler. There were some interesting challenges!

bignums

When I originally wrote the Hoot compiler, it targeted the browser, which already has a bignum implementation in the form of BigInt, which I worked on back in the day. Hoot-generated Wasm files use host bigints via externref (though wrapped in structs to allow for hashing and identity).

In Wastrel, then, I implemented the imports that implement bignum operations: addition, multiplication, and so on. I did so using mini-gmp, a stripped-down implementation of the workhorse GNU multi-precision library. At some point if bignums become important, this gives me the option to link to the full GMP instead.

Bignums were the first managed data type in Wastrel that wasn’t defined as part of the Wasm module itself, instead hiding behind externref, so I had to add a facility to allocate type codes to these “host” data types. More types will come in time: weak maps, ephemerons, and so on.

I think bignums would be a great proposal for the Wasm standard, similar to stringref ideally (sniff!), possibly in an attenuated form.

exception handling

Hoot used to emit a pre-standardization form of exception handling, and hadn’t gotten around to updating to the newer version that was standardized last July. I updated Hoot to emit the newer kind of exceptions, as it was easier to implement them in Wastrel that way.

Some of the problems Chris Fallin contended with in Wasmtime don’t apply in the Wastrel case: since the set of instances is known at compile-time, we can statically allocate type codes for exception tags. Also, I didn’t really have to do the back-end: I can just use setjmp and longjmp.

This whole paragraph was meant to be a bit of an aside in which I briefly mentioned why just using setjmp was fine. Indeed, because Wastrel never re-uses a temporary, relying entirely on GCC to “re-use” the register / stack slot on our behalf, I had thought that I didn’t need to worry about the “volatile problem”. From the C99 specification:

[...] values of objects of automatic storage duration that are local to the function containing the invocation of the corresponding setjmp macro that do not have volatile-qualified type and have been changed between the setjmp invocation and longjmp call are indeterminate.

My thought was, though I might set a value between setjmp and longjmp, that would only be the case for values whose lifetime did not reach the longjmp (i.e., whose last possible use was before the jump). Wastrel didn’t introduce any such cases, so I was good.

However, I forgot about local.set: mutations of locals (ahem, objects of automatic storage duration) in the source Wasm file could run afoul of this rule. So, because of writing this blog post, I went back and did an analysis pass on each function to determine the set of locals which are mutated inside a try_block. Thank you, rubber duck readers!
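A minimal C illustration of the hazard (hypothetical, not Wastrel output):

```c
#include <setjmp.h>

static jmp_buf env;

static void throw_exn(void) { longjmp(env, 1); }

int example(void) {
  int x = 0;            /* not volatile */
  if (setjmp(env) == 0) {
    x = 42;             /* mutated between setjmp and longjmp... */
    throw_exn();        /* ...so when longjmp brings us back, */
  }
  return x;             /* ...x is indeterminate: maybe 42, maybe 0 */
}
```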

bugs

Oh my goodness there were many bugs. Lacunae, if we are being generous; things not implemented quite right, which resulted in errors either when generating C or when compiling the C. The type-preserving translation strategy does seem to have borne fruit, in that I have spent very little time in GDB: once things compile, they work.

coevolution

Sometimes Hoot would use a browser facility where it was convenient, but for which in a better world we would just do our own thing. Such was the case for the number->string operation on floating-point numbers: we did something awful but expedient.

I didn’t have this facility in Wastrel, so instead we moved to do float-to-string conversions in Scheme. This turns out to have been a good test for bignums too; the algorithm we use is a bit dated and relies on bignums to do its thing. The move to Scheme also allows for printing floating-point numbers in other radices.

There are a few more Hoot patches that were inspired by Wastrel, about which more later; it has been good for both to work on the two at the same time.

tail calls

My plan for Wasm’s return_call and friends was to use the new musttail annotation for calls, which has been in clang for a while and was recently added to GCC. I was careful to limit the number of function parameters such that no call should require stack allocation, and therefore a compiler should have no reason to reject any particular tail call.

However, there were bugs. Funny ones, at first: attributes applying to a preceding label instead of the following call, or the need to insert if (1) before the tail call. More dire ones, in which tail callers inlined into their callees would cause the tail calls to fail, worked around with judicious application of noinline. Thanks to GCC’s Andrew Pinski for help debugging these and other issues; with GCC things are fine now.

I did have to change the code I emitted to return “top types only”: if you have a function returning type T, you can tail-call a function returning U if U is a subtype of T, but there is no nice way to encode this into the C type system. Instead, we return the top type of T (or U, it’s the same), e.g. anyref, and insert downcasts at call sites to recover the precise types. Not so nice, but it’s what we got.

Trying tail calls on clang, I ran into a funny restriction: clang not only requires that return types match, but requires that tail caller and tail callee have the same parameters as well. I can see why they did this (it requires no stack shuffling and thus such a tail call is always possible, even with 500 arguments), but it’s not the design point that I need. Fortunately there are discussions about moving to a different constraint.

scale

I spent way more time than I had planned to on improving the speed of Wastrel itself. My initial idea was to just emit one big C file, and that would provide the maximum possibility for GCC to just go and do its thing: it can see everything, everything is static, there are loads of always_inline helpers that should compile away to single instructions, that sort of thing. But, this doesn’t scale, in a few ways.

In the first obvious way, consider whitequark’s llvm.wasm. This is all of LLVM in one 70 megabyte Wasm file. Wastrel made a huuuuuuge C file, then GCC chugged on it forever; 80 minutes at -O1, and I wasn’t aiming for -O1.

I realized that in many ways, GCC wasn’t designed to be a compiler target. The shape of code that one might emit from a Wasm-to-C compiler like Wastrel is different from what one would write by hand. I even ran into a segfault compiling with -Wall, because GCC accidentally recursed instead of iterated in the -Winfinite-recursion pass.

So, I dealt with this in a few ways. After many hours spent pleading and bargaining with different -O options, I bit the bullet and made Wastrel emit multiple C files. It will compute a DAG forest of all the functions in a module, where edges are direct calls, and go through that forest, greedily consuming (and possibly splitting) subtrees until we have “enough” code to split out a partition, as measured by number of Wasm instructions. They say that -flto makes this a fine approach, but one never knows when a translation unit boundary will turn out to be important. I compute needed symbol visibilities as much as I can so as to declare functions that don’t escape their compilation unit as static; who knows if this is of value. Anyway, this partitioning introduced no performance regression in my limited tests so far, and compiles are much much much faster.

scale, bis

A brief observation: Wastrel used to emit indented code, because it could, and what does it matter, anyway. However, consider Wasm’s br_table: it takes an array of n labels and an integer operand, and will branch to the nth label, or the last if the operand is out of range. To set up a label in Wasm, you make a block, of which there are a handful of kinds; the label is visible in the block, and for n labels, the br_table will be the most nested expression in the n nested blocks.

Now consider that block indentation is proportional to n. This means, the file size of an indented C file is quadratic in the number of branch targets of the br_table.

Yes, this actually bit me; there are br_table instances with tens of thousands of targets. No, wastrel does not indent any more.

scale, ter

Right now, the long pole in Wastrel is the compile-to-C phase; the C-to-native phase parallelises very well and is less of an issue. So, one might think: OK, you have partitioned the functions in this Wasm module into a number of files, why not emit the files in parallel?

I gave this a go. It did not speed up C generation. From my cursory investigations, I think this is because the bottleneck is garbage collection in Wastrel itself; Wastrel is written in Guile, and Guile still uses the Boehm-Demers-Weiser collector, which does not parallelize well for multiple mutators. It’s terrible but I ripped out parallelization and things are fine. Someone on Mastodon suggested fork; they’re not wrong, but also not Right either. I’ll just keep this as a nice test case for the Guile-on-Whippet branch I want to poke later this year.

scale, quator

Finally, I had another realization: GCC was having trouble compiling the C that Wastrel emitted, because Hoot had emitted bad WebAssembly. Not bad as in “invalid”; rather, “not good”.

There were two cases in which Hoot emitted ginormous (technical term) functions. One, for an odd debugging feature: Hoot does a CPS transform on its code, and allocates return continuations on a stack. This is a gnarly technique but it gets us delimited continuations and all that goodness even before stack switching has landed, so it’s here for now. It also gives us a reified return stack of funcref values, which lets us print Scheme-level backtraces.

Or it would, if we could associate data with a funcref. Unfortunately func is not a subtype of eq, so we can’t. Unless... we pass the funcref out to the embedder (e.g. JavaScript), and the embedder checks the funcref for equality (e.g. using ===); then we can map a funcref to an index, and use that index to map to other properties.

How to pass that funcref/index map to the host? When I initially wrote Hoot, I didn’t want to just, you know, put the funcrefs of interest into a table and let the index of a function’s slot be the value in the key-value mapping; that would be useless memory usage. Instead, we emitted functions that took an integer, and which would return a funcref. Yes, these used br_table, and yes, there could be tens of thousands of cases, depending on what you were compiling.

Then to map the integer index to, say, a function name, likewise I didn’t want a table; that would force eager allocation of all strings. Instead I emitted a function with a br_table whose branches would return string.const values.

Except, of course, stringref didn’t become a thing, and so instead we would end up lowering to allocate string constants as globals.

Except, of course, Wasm’s idea of what a “constant” is is quite restricted, so we have a pass that moves non-constant global initializers to the “start” function. This results in an enormous start function. The straightforward solution was to partition global initializations into separate functions, called by the start function.

For the funcref debugging, the solution was more intricate: firstly, we represent the funcref-to-index mapping just as a table. It’s fine. Then for the side table mapping indices to function names and sources, we emit DWARF, and attach a special attribute to each “introspectable” function. In this way, reading the DWARF sequentially, we reconstruct a mapping from index to DWARF entry, and thus to a byte range in the Wasm code section, and thus to source information in the .debug_line section. It sounds gnarly but Guile already used DWARF as its own debugging representation; switching to emit it in Hoot was not a huge deal, and as we only need to consume the DWARF that we emit, we only needed some 400 lines of JS for the web/node run-time support code.

This switch to data instead of code removed the last really long pole from the GCC part of Wastrel’s pipeline. What’s more, Wastrel can now implement the code_name and code_source imports for Hoot programs ahead of time: it can parse the DWARF at compile-time, and generate functions that look up functions by address in a sorted array to return their names and source locations. As of today, this works!

fin

There are still a few things that Hoot wants from a host that Wastrel has stubbed out: weak refs and so on. I’ll get to this soon; my goal is a proper Scheme REPL. Today’s note is a waypoint on the journey. Until next time, happy hacking!

Thibault Martin: TIL that Sveltia is a good CMS for Astro

Tue, 31/03/2026 - 11:00 AM

This website is built with the static site generator Astro. All my content is written in markdown and uploaded to a git repository. Once the content is merged into the main branch, Cloudflare deploys it publicly. The process to publish involves:

  1. Creating a new markdown file.
  2. Filling it with thoughts.
  3. Pushing it to a new branch.
  4. Waiting for CI to check my content respects some rules.
  5. Pressing the merge button.

This is pretty involved and of course requires access to a computer. It goes directly against the goal I’ve set for myself of reducing the friction to publish.

I wanted a simple solution to write and publish short posts directly from mobile, without hosting an additional service.

Such an app is called a git-based headless CMS. Decap CMS is the most frequently cited solution for git-based content management, but it has two show-stoppers for me:

  1. It’s not mobile friendly (an issue open since 2017), although there are community workarounds.
  2. It’s not entirely client-side. You need to host a serverless script e.g. on a Cloudflare Worker to complete authentication.

Because my website is completely static, it’s easy to take it off GitHub and Cloudflare and move it elsewhere. I want the CMS solution I choose to be purely client-side, so it doesn’t get in the way of moving elsewhere.

It turns out that Sveltia, an API-compatible and self-proclaimed successor to Decap, is a good fit for this job, with a few caveats.

Sveltia is a mobile-friendly Progressive Web App (PWA) that doesn’t require a backend. It's a static app that can be added to my static website. It has a simple configuration file to describe what fields each post expects (title, publication date, body, etc).

Once the configuration and authentication are done, I have access to a lightweight PWA that lets me create new posts.

The authentication is straightforward for technical people. I need to paste a GitHub Personal Access Token (PAT) in the login page, and that's it. Sveltia will fetch the existing content and display it.

The PWA itself is also easy to deploy: I need to add a page served under the /admin route that imports the app. I could just import it from a third-party CDN, but there’s also an npm package for it. That allows me to serve the JavaScript as a first party instead, all while easily staying up to date.

I installed it with

$ pnpm add @sveltia/cms

I then created an Astro page under src/pages/admin/index.astro with the following content:

title="src/pages/admin/index.astro" <!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> <title>Content Manager – ergaster.org</title> <script> import { init } from "@sveltia/cms"; init(); </script> </head> <body></body> </html>

I also created the config file under public/admin/config.yml with Sveltia Entry Collections matching my Astro content collections. The setup is straightforward and well documented.
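For reference, a minimal sketch of what such a config can look like (Decap-compatible syntax; the repo and fields here are illustrative, not my exact setup):

```yaml
# public/admin/config.yml
backend:
  name: github
  repo: user/website
  branch: drafts            # push to a side branch, see the caveats below
media_folder: public/images
collections:
  - name: notes
    label: Notes
    folder: src/content/notes
    create: true
    fields:
      - { name: title, label: Title, widget: string }
      - { name: pubDate, label: Publication date, widget: datetime }
      - { name: body, label: Body, widget: markdown }
```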

Sveltia has a few caveats though:

  1. It can only work on a single branch, and not create a new branch per post. According to the maintainer, it should be possible to create new branches with “Editorial Workflow” by Q2 or Q3 this year.
  2. It pushes content directly to its target branch, including drafts. I still want to run CI checks before merging my content, so I’ve created a drafts branch and configured Sveltia to push content there. Once the CI checks have passed I merge the branch manually from the GitHub mobile app.
  3. Having a single target branch also means I can only have one draft coming from Sveltia at a time. If I edited two drafts concurrently on the drafts branch, they would both be published the next time I merged drafts into main.
  4. It’s clunky to rename a picture uploaded via Sveltia.

Those are not deal breakers for me. The maintainer seems responsive, and the Editorial Workflow feature coming in Q2 or Q3 will fix the remaining clunkiness.