Read more of this story at Slashdot.
Read more of this story at Slashdot.
Read more of this story at Slashdot.
Read more of this story at Slashdot.
Read more of this story at Slashdot.
Read more of this story at Slashdot.
It has been a productive, prosperous, and career-building few months—from contemplating whether to apply for the contribution stage, to submitting my application at the last minute, to not finding a Go project, then sprinting through a Rust course after five days of deliberation. Eventually, I began actively contributing to librsvg in Rust, updated a documentation section, closed a couple of issues, and was ultimately selected for the Outreachy December 2024 – March 2025 cohort as an intern for the GNOME Foundation.
It has been a glorious journey, and I thank God for His love and grace throughout the application process up to this moment as I write this blog. I would love to delve into my journey to getting accepted into Outreachy, but since this blog is about reflecting on the experience as it wraps up, let’s get to it.
Overcoming Fear and DoubtYou might think my fears began when I got accepted into the internship, but they actually started much earlier. Before even applying, I was hesitant. Then, when I got in for the contribution phase, I realized that the language I was most familiar with, Go, was not listed.I felt like I was stepping into a battlefield with thousands of applicants, and my current arsenal was irrelevant. I believed I would absolutely dominate with Go, but now I couldn’t even find a project using it!
This fear lingered even after I got accepted. I kept wondering if I was going to mess things up terribly.
It takes time to master a programming language, and even more time to contribute to a large project. I worried about whether I could make meaningful contributions and whether I would ultimately fail.
And guess what? I did not fail. I’m still here, actively contributing to librsvg, and I plan to continue working on other GNOME projects. I’m now comfortable writing Rust, and most importantly, I’ve made huge progress on my project tasks. So how did I push past my fear? I initially didn’t want to apply at all, but a lyric from Dave’s song Survivor’s Guilt stuck with me: “When you feel like givin’ up, know you’re close.” Another saying that resonated with me was, “You never know if you’ll fail or win if you don’t try.” I stopped seeing the application as a competition with others and instead embraced an open mindset: “I’ve always wanted to learn Rust, and this is a great opportunity.” “I’m not the best at communication, but maybe I can grow in that area.” Shifting my mindset from fear to opportunity helped me stay the course, and my fear of failing never materialized.
My Growth and Learning ProcessFor quite some time, I had been working exclusively with a single programming language, primarily building backend applications. However, my Outreachy internship experience opened me up to a whole new world of possibilities. Now, I program in Rust, and I have learned a lot about SVGs, the XML tree, text rendering, and much more.
My mentor has been incredibly supportive, and thanks to him, I believe I will be an excellent mentor when I find myself in a position to guide others. His approach to communication, active listening, and problem-solving has left a strong impression on me, and I’ve found myself subconsciously adopting his methods. I also picked up some useful Git tricks from him and improved my ability to visualize and break down complex problems.
I have grown in technical knowledge, soft skills, and networking—my connections within the open-source community have expanded significantly!
Project Progress and Next StepsThe project’s core algorithms are now in place, including text-gathering, whitespace handling, text formatting, attribute collection, shaping, and more. The next step is to integrate these components to implement the full SVG2 text layout algorithm.
As my Outreachy internship with GNOME comes to an end today, I want to reflect on this incredible journey and express my gratitude to everyone who made it such a rewarding experience.
I am deeply grateful to God, the Outreachy organizers, my family, my mentor Federico (GNOME co-founder), Felipe Borges, and everyone who contributed to making this journey truly special. Thank you all for an unforgettable experience.
Due to my circumstances, I might be perhaps interested in dogfooding a larger number of GNOME system/session components on a daily basis than the average.
So far, I have been using jhbuild to help me with this deed, mostly in the form of jhbuild make to selectively build projects out of their git tree. See, there’s a point in life where writing long-winded CLI commands stop making you feel smart and work the opposite way, jhbuild had a few advantages I liked:
This, combined with my habit to use Fedora Rawhide also meant I did not require to rebuild the world to get up-to-date dependencies, keeping the number of miscellaneous modules built to a minimum.
This was all true even after Silverblue came around, and Florian unlocked the “run GNOME as built from toolbox” achievement. I adopted this methodology, but still using jhbuild to build things inside that toolbox, for the sake of convenience.
Enter sysext-utilsMeanwhile, systemd sysexts came around as a way to install “extensions” to the base install, even over atomic distributions, paving a way for development of system components to happen in these distributions. More recently Martín Abente brought an excellent set of utilities to ease building such sysexts.
This is a great step in the direction towards sysexts as a developer testing method. However, there is a big drawback for users of atomic distributions: to build these sysexts you must have all necessary build dependencies in your host system. Basically, desecrating your small and lean atomic install with tens to hundreds of packages. While for GNOME OS it may be true that it comes “with batteries included”, feels like a very big margin to miss the point with Silverblue, where the base install is minimal and you are supposed to carry development with toolbox, install apps with flatpak, etc etc.
What is necessaryIdeally, in these systems, we’d want:
The most natural way to achieve both last points is building things so they install in /usr/local, as this will allow us to also mount this location from the host inside the toolbox, in order to build things that depend on our own sysexts.
And last, I want an easy way to manage these projects that does not get in the middle of things, is fast to type, etc.
Introducing ggSo I’ve made a small script to help myself on these tasks. It can be installed at ~/.local/bin along with sysext-utils, and be used in a host shell to generate, install and generally manage a number of sysexts.
Sysexts-utils is almost there for this, I however needed some local hacks to help me get by:
– Since I have these are installed at ~/.local, but they will be run with pkexec to do things as root, the python library lookup paths had to be altered in the executable scripts (sysext-utils#10).
– They are ATM somewhat implicitly prepared to always install things at /usr, I had to alter paths in code to e.g. generate GSettings schemas at the right location (sysext-utils#11).
Hopefully these will be eventually sorted out. But with this I got 1) a pristine atomic setup and 2) My tooling in ~/.local 3) all the development environment in my home dir, 4) a simple and fast way to manage a number of projects. Just most I ever wanted from jhbuild.
This tool is a hack to put things together, done mainly so it’s intuitive and easy to myself. So far been using it for a week with few regrets except the frequent password prompts. If you think it’s useful for you too, you’re welcome.
Earlier this week I took an inventory of how Guile uses the Boehm-Demers-Weiser (BDW) garbage collector, with the goal of making sure that I had replacements for all uses lined up in Whippet. I categorized the uses into seven broad categories, and I was mostly satisfied that I have replacements for all except the last: I didn’t know what to do with untagged allocations: those that contain arbitrary data, possibly full of pointers to other objects, and which don’t have a header that we can use to inspect on their type.
But now I do! Today’s note is about how we can support untagged allocations of a few different kinds in Whippet’s mostly-marking collector.
inside and outsideWhy bother supporting untagged allocations at all? Well, if I had my way, I wouldn’t; I would just slog through Guile and fix all uses to be tagged. There are only a finite number of use sites and I could get to them all in a month or so.
The problem comes for uses of scm_gc_malloc from outside libguile itself, in C extensions and embedding programs. These users are loathe to adapt to any kind of change, and garbage-collection-related changes are the worst. So, somehow, we need to support these users if we are not to break the Guile community.
on intentThe problem with scm_gc_malloc, though, is that it is missing an expression of intent, notably as regards tagging. You can use it to allocate an object that has a tag and thus can be traced precisely, or you can use it to allocate, well, anything else. I think we will have to add an API for the tagged case and assume that anything that goes through scm_gc_malloc is requesting an untagged, conservatively-scanned block of memory. Similarly for scm_gc_malloc_pointerless: you could be allocating a tagged object that happens to not contain pointers, or you could be allocating an untagged array of whatever. A new API is needed there too for pointerless untagged allocations.
on dataRecall that the mostly-marking collector can be built in a number of different ways: it can support conservative and/or precise roots, it can trace the heap precisely or conservatively, it can be generational or not, and the collector can use multiple threads during pauses or not. Consider a basic configuration with precise roots. You can make tagged pointerless allocations just fine: the trace function for that tag is just trivial. You would like to extend the collector with the ability to make untagged pointerless allocations, for raw data. How to do this?
Consider first that when the collector goes to trace an object, it can’t use bits inside the object to discriminate between the tagged and untagged cases. Fortunately though the main space of the mostly-marking collector has one metadata byte for each 16 bytes of payload. Of those 8 bits, 3 are used for the mark (five different states, allowing for future concurrent tracing), two for the precise field-logging write barrier, one to indicate whether the object is pinned or not, and one to indicate the end of the object, so that we can determine object bounds just by scanning the metadata byte array. That leaves 1 bit, and we can use it to indicate untagged pointerless allocations. Hooray!
However there is a wrinkle: when Whippet decides the it should evacuate an object, it tracks the evacuation state in the object itself; the embedder has to provide an implementation of a little state machine, allowing the collector to detect whether an object is forwarded or not, to claim an object for forwarding, to commit a forwarding pointer, and so on. We can’t do that for raw data, because all bit states belong to the object, not the collector or the embedder. So, we have to set the “pinned” bit on the object, indicating that these objects can’t move.
We could in theory manage the forwarding state in the metadata byte, but we don’t have the bits to do that currently; maybe some day. For now, untagged pointerless allocations are pinned.
on slopYou might also want to support untagged allocations that contain pointers to other GC-managed objects. In this case you would want these untagged allocations to be scanned conservatively. We can do this, but if we do, it will pin all objects.
Thing is, conservative stack roots is a kind of a sweet spot in language run-time design. You get to avoid constraining your compiler, you avoid a class of bugs related to rooting, but you can still support compaction of the heap.
How is this, you ask? Well, consider that you can move any object for which we can precisely enumerate the incoming references. This is trivially the case for precise roots and precise tracing. For conservative roots, we don’t know whether a given edge is really an object reference or not, so we have to conservatively avoid moving those objects. But once you are done tracing conservative edges, any live object that hasn’t yet been traced is fair game for evacuation, because none of its predecessors have yet been visited.
But once you add conservatively-traced objects back into the mix, you don’t know when you are done tracing conservative edges; you could always discover another conservatively-traced object later in the trace, so you have to pin everything.
The good news, though, is that we have gained an easier migration path. I can now shove Whippet into Guile and get it running even before I have removed untagged allocations. Once I have done so, I will be able to allow for compaction / evacuation; things only get better from here.
Also as a side benefit, the mostly-marking collector’s heap-conservative configurations are now faster, because we have metadata attached to objects which allows tracing to skip known-pointerless objects. This regains an optimization that BDW has long had via its GC_malloc_atomic, used in Guile since time out of mind.
finWith support for untagged allocations, I think I am finally ready to start getting Whippet into Guile itself. Happy hacking, and see you on the other side!
Salutations, populations. Today’s note is more of a work-in-progress than usual; I have been finally starting to look at getting Whippet into Guile, and there are some open questions.
inventoryI started by taking a look at how Guile uses the Boehm-Demers-Weiser collector‘s API, to make sure I had all my bases covered for an eventual switch to something that was not BDW. I think I have a good overview now, and have divided the parts of BDW-GC used by Guile into seven categories.
implicit usesFirstly there are the ways in which Guile’s run-time and compiler depend on BDW-GC’s behavior, without actually using BDW-GC’s API. By this I mean principally that we assume that any reference to a GC-managed object from any thread’s stack will keep that object alive. The same goes for references originating in global variables, or static data segments more generally. Additionally, we rely on GC objects not to move: references to GC-managed objects in registers or stacks are valid across a GC boundary, even if those references are outside the GC-traced graph: all objects are pinned.
Some of these “uses” are internal to Guile’s implementation itself, and thus amenable to being changed, albeit with some effort. However some escape into the wild via Guile’s API, or, as in this case, as implicit behaviors; these are hard to change or evolve, which is why I am putting my hopes on Whippet’s mostly-marking collector, which allows for conservative roots.
defensive usesThen there are the uses of BDW-GC’s API, not to accomplish a task, but to protect the mutator from the collector: GC_call_with_alloc_lock, explicitly enabling or disabling GC, calls to sigmask that take BDW-GC’s use of POSIX signals into account, and so on. BDW-GC can stop any thread at any time, between any two instructions; for most users is anodyne, but if ever you use weak references, things start to get really gnarly.
Of course a new collector would have its own constraints, but switching to cooperative instead of pre-emptive safepoints would be a welcome relief from this mess. On the other hand, we will require client code to explicitly mark their threads as inactive during calls in more cases, to ensure that all threads can promptly reach safepoints at all times. Swings and roundabouts?
precise tracingDid you know that the Boehm collector allows for precise tracing? It does! It’s slow and truly gnarly, but when you need precision, precise tracing nice to have. (This is the GC_new_kind interface.) Guile uses it to mark Scheme stacks, allowing it to avoid treating unboxed locals as roots. When it loads compiled files, Guile also adds some sliced of the mapped files to the root set. These interfaces will need to change a bit in a switch to Whippet but are ultimately internal, so that’s fine.
What is not fine is that Guile allows C users to hook into precise tracing, notably via scm_smob_set_mark. This is not only the wrong interface, not allowing for copying collection, but these functions are just truly gnarly. I don’t know know what to do with them yet; are our external users ready to forgo this interface entirely? We have been working on them over time, but I am not sure.
reachabilityWeak references, weak maps of various kinds: the implementation of these in terms of BDW’s API is incredibly gnarly and ultimately unsatisfying. We will be able to replace all of these with ephemerons and tables of ephemerons, which are natively supported by Whippet. The same goes with finalizers.
The same goes for constructs built on top of finalizers, such as guardians; we’ll get to reimplement these on top of nice Whippet-supplied primitives. Whippet allows for resuscitation of finalized objects, so all is good here.
miscThere is a long list of miscellanea: the interfaces to explicitly trigger GC, to get statistics, to control the number of marker threads, to initialize the GC; these will change, but all uses are internal, making it not a terribly big deal.
I should mention one API concern, which is that BDW’s state is all implicit. For example, when you go to allocate, you don’t pass the API a handle which you have obtained for your thread, and which might hold some thread-local freelists; BDW will instead load thread-local variables in its API. That’s not as efficient as it could be and Whippet goes the explicit route, so there is some additional plumbing to do.
Finally I should mention the true miscellaneous BDW-GC function: GC_free. Guile exposes it via an API, scm_gc_free. It was already vestigial and we should just remove it, as it has no sensible semantics or implementation.
allocationThat brings me to what I wanted to write about today, but am going to have to finish tomorrow: the actual allocation routines. BDW-GC provides two, essentially: GC_malloc and GC_malloc_atomic. The difference is that “atomic” allocations don’t refer to other GC-managed objects, and as such are well-suited to raw data. Otherwise you can think of atomic allocations as a pure optimization, given that BDW-GC mostly traces conservatively anyway.
From the perspective of a user of BDW-GC looking to switch away, there are two broad categories of allocations, tagged and untagged.
Tagged objects have attached metadata bits allowing their type to be inspected by the user later on. This is the happy path! We’ll be able to write a gc_trace_object function that takes any object, does a switch on, say, some bits in the first word, dispatching to type-specific tracing code. As long as the object is sufficiently initialized by the time the next safepoint comes around, we’re good, and given cooperative safepoints, the compiler should be able to ensure this invariant.
Then there are untagged allocations. Generally speaking, these are of two kinds: temporary and auxiliary. An example of a temporary allocation would be growable storage used by a C run-time routine, perhaps as an unbounded-sized alternative to alloca. Guile uses these a fair amount, as they compose well with non-local control flow as occurring for example in exception handling.
An auxiliary allocation on the other hand might be a data structure only referred to by the internals of a tagged object, but which itself never escapes to Scheme, so you never need to inquire about its type; it’s convenient to have the lifetimes of these values managed by the GC, and when desired to have the GC automatically trace their contents. Some of these should just be folded into the allocations of the tagged objects themselves, to avoid pointer-chasing. Others are harder to change, notably for mutable objects. And the trouble is that for external users of scm_gc_malloc, I fear that we won’t be able to migrate them over, as we don’t know whether they are making tagged mallocs or not.
what is to be done?One conventional way to handle untagged allocations is to manage to fit your data into other tagged data structures; V8 does this in many places with instances of FixedArray, for example, and Guile should do more of this. Otherwise, you make new tagged data types. In either case, all auxiliary data should be tagged.
I think there may be an alternative, which would be just to support the equivalent of untagged GC_malloc and GC_malloc_atomic; but for that, I am out of time today, so type at y’all tomorrow. Happy hacking!
So often I come across the need to avoid my system to block forever, or until a process finishes, I can’t recall how did I came across systemd inhibit, but here’s my approach and a bit of motivation
MotivationI noticed that the Gnome Settings, come with Rygel
After some fiddling (not much really), it starts directly once I login and I will be using it instead of a fully fledged plex or the like, I just want to stream some videos from time to time from my home pc over my ipad :D using VLC.
The Hack systemd-inhibit --who=foursixnine --why="maybe there be dragons" --mode block \ bash -c 'while $(systemctl --user is-active -q rygel.service); do sleep 1h; done'One can also use waitpid and more.
Thank you for comming to my ted talk.
My current playlist is this diorama of Lulu the Piggy channeling Tupac Shakur in a toy vending machine in the basement of New World Mall in Flushing Chinatown.
A concerned nutritional epidemiologist in Tokyo realizes that if you are what you eat, that means…
It’s a similar situation in Seoul, albeit with less oil and more confidence.
Last week I was bitten by a interesting C feature. The following terminate function was expected to exit if okay was zero (false) however it exited when zero was passed to it. The reason is the missing semicolon after the return function.
The interesting part this that is compiles fine because the void function terminate is allowed to return the void return value, in this case the void return from exit().