Planet Debian - https://planet.debian.org/

Kai-Chung Yan: Attending FOSDEM 2016: First Touch in Open Source Community

Fri, 23/08/2019 - 8:05 am

FOSDEM 2016 happened at the end of January, but I have been too busy until now to write about my first trip to an open source event.

FOSDEM takes place in Belgium, which is almost ten thousand kilometers from my home. Luckily, Google kindly offered sponsorship covering travel to Belgium and lodging for former GSoC mentors and students in Debian, which made my trip possible without giving my dad headaches. Thank you Google!

Open source meetings are really fun. Imagine you have been working hard on an exciting project with several colleagues around the world who have never met you, and now you have a chance to meet them and make friends with them. Cool! However, I am not involved very deeply with any project, so I didn't have high expectations. Still, I was excited when I first saw my mentor Hans-Christoph Steiner! It's a pity that we forgot to take a picture, as I'm not the kind of person who takes selfies every day.

One of the most interesting projects I saw during FOSDEM is Ring. Ring is distributed communication software without central servers. Every Ring client in the world is connected to several others and finds a particular user using a distributed hash table. A Ring client is a key pair, whose public key serves as the ID. Thus, Ring resists censorship and eavesdropping, which is great for Chinese citizens and feared by the Chinese government. After I got home I learned of a similar but older project, Tox, which seems more feature-rich than Ring but still not mature enough to promote widely. Both projects share one huge disadvantage: high battery drain on Android. I hope they will improve it someday.

At the end of FOSDEM I joined the volunteers for the cleanup. We cleaned all the buildings, restored the rooms and finally shared dinner in the hall of the K Building. I'm not European, so I didn't talk much with them, but it was a truly unforgettable experience. I hope I can join the next FOSDEM soon.

Kai-Chung Yan: Introducing Gradle 1.12 in Debian

Fri, 23/08/2019 - 8:05 am

After 5 weeks of work, my colleague Komal Sukhani and I succeeded in bringing Gradle 1.12 with other packages into Debian. Here is a brief note of what we’ve done:

Note that both Gradle and Groovy are in the experimental distribution because Groovy build-depends on Gradle, and Gradle build-depends on bnd 2.1.0, which is in experimental as well.

Updating these packages took us an entire month because my summer vacation did not start until the day we uploaded Gradle and Groovy, which means we were doing the job in our spare time (Sukhani had finished her semester at the beginning, though).

The next step is to update Gradle to 2.4 as soon as possible, because Sukhani has started her work on the Java part of the Android SDK, which requires Gradle 2.2 or above. Before updating Gradle I need to package the Java SDK for AWS, which enables Gradle to access S3 resources. I also need to turn gradle-1.12 into a separate package and use it to build gradle_2.4-1.

After that, I will start my work on the C/C++ part of Android SDK, which is far more complicated and messy than I had expected. Yet I enjoy the summer coding. Happy coding, all open source developers!

Finally, feel free to check out my weekly reports on Debian's mailing list.

Kai-Chung Yan: Google Summer of Code Started: Packaging Android SDK for Debian

Fri, 23/08/2019 - 8:04 am

And here it is: I was accepted as a GSoC 2015 student! Actually, it has been a while since the results came out at the end of April. When I was applying for this GSoC, I never expected I could be accepted.

So what is Google Summer of Code, in case someone hasn't heard about it at all? Google Summer of Code is an annual program hosted by Google which gathers college students around the world to contribute to open source software. Every year hundreds of open source organizations join GSoC to provide project ideas and mentors, and thousands of students apply, choose a project to work on during the summer, and get well paid by Google if they manage to finish the task. This year 1051 students were accepted, 49 of them from China and 2 from Taiwan. You can read more details in this post.

Although my geography textbooks and my geography teacher said so, I had never believed that India is a software giant, until I saw that India has the most students accepted and that my partner on this project is a girl from India!

Project Details

The project we will work on this summer is packaging the Android SDK for Debian. In addition, we will also update the existing packages that are essential to Android development, e.g. Gradle. Although some may say this project is not that complicated, it still has lots of work to do, which makes it a rather large project with two students working on it and a co-mentor. My primary mentor is Hans-Christoph Steiner from The Guardian Project, and he also wrote a post about the project.

Why do we need to do this? There are reasons of security, convenience and ideals, but the biggest one for me is that if you use Linux and you write Android apps, or perhaps you are just about to flash CyanogenMod onto your device, there will be no better way than to just type sudo aptitude install adb. More information on this project can be found on Debian's Android Tools Team page.

Problems We Are Facing

Currently (mid May) the official beginning of the coding phase has not yet arrived, but we have held a meeting on IRC and confirmed the largest problems we have so far.

The first problem is the packaging of Gradle. Gradle is a rather new and innovative build automation system, with which most Android apps and the Android SDK tools written in Java are built. Since it is a build system, it is unsurprisingly built with itself, which makes updating Gradle much harder. Currently Gradle is at version 2.4 but the one in Debian is 1.5. In the worst case, we will have to build all versions of Gradle from 1.6 to 2.4 one by one due to its self-dependency.

In reality, building a project with Gradle is much easier and more pleasant than with any other build system, because it handles dependencies in a brilliant way by downloading everything it needs, including Gradle itself. Thus it does not matter whether you have installed Gradle, or even whether you are using Linux or Windows. However, when building the Debian package, we have to abandon that convenience and make the build totally offline, relying only on what is in Debian. This is for security and reproducibility, but the packaging becomes much more complicated since we have to modify lots of code in the upstream build scripts. Also, since the build is restricted to what already exists in a Debian system, quite a few plugins that use software that isn't in Debian yet will be excluded from the Debian version of Gradle, which makes it less usable than simply launching the Gradle wrapper. In that case, I suppose very few people will really use the Gradle in the Debian repository.

The second problem is how to determine which Git commit we should check out from the Android SDK repository to build a particular version of the tools. The Android SDK does not release its source code in tarball form, so we have to deal with the Git repository. What's worse, the tools in the Android SDK come from different repositories, and they have almost no information about the tools' version numbers at all. We can't confirm which commit, tag or branch in the repository corresponds to a particular version. And what's way worse, the Android SDK has 3 parts, namely SDK-tools, Build-tools and Platform-tools, each of which has different version numbers! And what's way, way worse, I have posted the question to various places and no one has answered me.

After our IRC discussion, we have been focusing on Gradle. I am still reading documentation about Debian packaging and about using Gradle. All I hope now is that we can finish the project quickly and well, leaving no regrets this summer. I also hope my GSoC T-shirt will be delivered to my home as soon as possible; it's really cool!

Do You Want to Join GSoC as Well?

Surprisingly, most students in my school haven't heard about Google Summer of Code at all, which is why there are only 2 accepted students from Taiwan. But if you know about it and you study computer science (or some other ridiculous department related to computer science, just like mine), do not hesitate to join next year's! Contributing to open source and getting well paid (5500 USD this year), isn't that cool? Here I am offering you several tips.

Before I wrote my proposal, I saw that a guy from KDE had written some tips under a shocking title. Reading those is probably enough, I guess, but I still want to list some points:

  • Contact your potential mentors even before you write your proposal; that really helps.
  • Remember to include a rough schedule in your proposal; it is very important.
  • Interact with your mentor, and ask good questions often.

Have fun in the summer!

Holger Levsen: 20190823-cccamp

Fri, 23/08/2019 - 2:38 am
Dialing 8874 on the local GSM and DECT networks

Dialing 8874 on the local GSM and DECT networks currently (it's 2:30 in the morning) lets you hear this automatic announcement: "The current temperature of the pool is 36.2 degrees", and said pool is about 15m away, temporarily built beneath a forest illuminated with disco balls...

I <3 cccamp.

Dirk Eddelbuettel: Rcpp now used by 1750 CRAN packages

Thu, 22/08/2019 - 2:58 am

Since this morning, Rcpp stands at just over 1750 reverse-dependencies on CRAN. The graph on the left depicts the growth of Rcpp usage (as measured by Depends, Imports and LinkingTo, but excluding Suggests) over time.

Rcpp was first released in November 2008. It probably cleared 50 packages around three years later in December 2011, 100 packages in January 2013, 200 packages in April 2014, and 300 packages in November 2014. It passed 400 packages in June 2015 (when I tweeted about it), 500 packages in late October 2015, 600 packages in March 2016, 700 packages in July 2016, 800 packages in October 2016, 900 packages in early January 2017, 1000 packages in April 2017, 1250 packages in November 2017, and 1500 packages in November 2018. The chart extends to the very beginning via manually compiled data from CRANberries and checked with crandb. The next part uses manually saved entries. The core (and by far largest) part of the data set was generated semi-automatically via a short script appending updates to a small file-based backend. A list of packages using Rcpp is available too.

Also displayed in the graph is the relative proportion of CRAN packages using Rcpp. The four per-cent hurdle was cleared just before useR! 2014 where I showed a similar graph (as two distinct graphs) in my invited talk. We passed five percent in December of 2014, six percent in July of 2015, seven percent just before Christmas 2015, eight percent in the summer of 2016, nine percent in mid-December 2016, cracked ten percent in the summer of 2017 and eleven percent in 2018. We are currently at 11.83 percent: a little over one in nine packages. There is more detail in the chart: how CRAN seems to be pushing back more and removing packages more aggressively (which my CRANberries tracks, but not in as much detail as it could), and how the growth of Rcpp seems to be slowing somewhat, both outright and even more so as a proportion of CRAN – just as one would expect of a growth curve.

1750+ user packages is pretty mind-boggling. We can compare with the progression of CRAN itself, compiled by Henrik in a series of posts and emails to the main development mailing list. Not that long ago CRAN itself did not have 1500 packages, and here we are at almost 14810 packages with Rcpp at 11.84% and still growing (though maybe more slowly). Amazeballs.

The Rcpp team continues to aim for keeping Rcpp as performant and reliable as it has been. A really big shoutout and Thank You! to all users and contributors of Rcpp for help, suggestions, bug reports, documentation or, of course, code.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Joey Hess: releasing two haskell libraries in one day: libmodbus and git-lfs

Wed, 21/08/2019 - 6:28 pm

The first library is a libmodbus binding in haskell.

There are a couple of other haskell modbus libraries, but none that support serial communication out of the box. I've been using a python library to talk to my solar charge controller, but it is not great at dealing with the slightly flakey interface. The libmodbus C library has features that make it more robust, and it also supports fast batched reads.

So a haskell interface to it seemed worth starting while I was doing laundry, and then for some reason it seemed worth writing a whole bunch more FFIs that I may never use, so it covers libmodbus fairly extensively. 660 lines of code all told.

Writing a good binding to a C library is an art. I've seen ones that stay so close to the C API that you feel you're writing C and not Haskell. On the other hand, some are so far removed from the underlying library that its documentation does not carry over at all.

I tried to strike a balance. Same function names so the extensive libmodbus documentation is easy to refer to while using it, but plenty of haskell data types so you won't mix up the parity with the stop bits.
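To illustrate the "typed parameters" idea outside Haskell, here is a small sketch in Python (my own names, not the binding's actual API): typed serial-port parameters that map down to the raw values the C function modbus_new_rtu expects, where parity is the character 'N', 'E' or 'O' and stop bits the integer 1 or 2.

```python
from enum import Enum

# Hypothetical sketch: distinct types for parity and stop bits, so the
# two settings cannot be accidentally swapped the way two raw ints/chars
# could be at the C level.
class Parity(Enum):
    NONE = "N"
    EVEN = "E"
    ODD = "O"

class StopBits(Enum):
    ONE = 1
    TWO = 2

def rtu_params(parity: Parity, stop_bits: StopBits) -> tuple:
    """Map the typed values to the raw encodings modbus_new_rtu takes."""
    return parity.value, stop_bits.value
```

Passing a StopBits where a Parity is expected now fails loudly instead of silently configuring the wrong serial mode.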

And while it uses a mutable vector under the hood as the buffer for the FFI interface, so it can be just as fast as the C library, I also made functions for reading stuff like registers and coils be polymorphic so easier data types can be used at the expense of a bit of extra allocation.

The big win in this haskell binding is that you can leverage all the nice haskell libraries for dealing with binary data to parse the modbus data, rather than the ad-hoc integer and float conversion stuff from the C library.

For example, the Epever solar charge controller has its own slightly nonstandard way to represent 16 bit and 32 bit floats. Using the binary library to parse its registers in applicative style came out quite nice:

data Epever = Epever
	{ pv_array_voltage :: Float
	, pv_array_current :: Float
	, pv_array_power :: Float
	, battery_voltage :: Float
	} deriving (Show)

getEpever :: Get Epever
getEpever = Epever
	<$> epeverfloat  -- register 0x3100
	<*> epeverfloat  -- register 0x3101
	<*> epeverfloat2 -- register 0x3102 (low) and 0x3103 (high)
	<*> epeverfloat  -- register 0x3104
  where
	epeverfloat = decimals 2 <$> getWord16host
	epeverfloat2 = do
		l <- getWord16host
		h <- getWord16host
		-- widen before combining, as Word16 arithmetic would overflow
		return (decimals 2 (fromIntegral l + fromIntegral h * 65536 :: Integer))
	decimals n v = fromIntegral v / (10^n)
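For comparison, the same decoding idea can be sketched in Python (my own sketch, assuming little-endian register bytes and the factor-100 scaling used above; register 0x3102/0x3103 form one 32-bit value, low word first):

```python
import struct

def decode_epever(raw: bytes) -> dict:
    """Decode five 16-bit Epever registers (0x3100..0x3104)."""
    v, c, p_lo, p_hi, bv = struct.unpack("<5H", raw)
    return {
        "pv_array_voltage": v / 100,
        "pv_array_current": c / 100,
        # combine low and high words into a 32-bit value before scaling
        "pv_array_power": (p_lo + p_hi * 2**16) / 100,
        "battery_voltage": bv / 100,
    }
```

The applicative Haskell version above expresses the same structure more declaratively, which is the point of the binary-library approach.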

The second library is a git-lfs implementation in pure Haskell.

Emphasis on the pure -- there is not a scrap of IO code in this library, just 400+ lines of data types, parsing, and serialization.

I wrote it a couple weeks ago so git-annex can store files in a git-lfs remote. I've also used it as a git-lfs server, mostly while exploring interesting edge cases of git-lfs.
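To give a flavor of what such a library has to model, here is a dependency-free Python sketch of a git-lfs batch API request body, following the git-lfs specification (the Haskell library's actual types may differ; a client POSTs this JSON to the server's objects/batch endpoint):

```python
import json

def batch_request(operation: str, objects: list) -> str:
    """Render a git-lfs batch API request body as JSON.

    operation is "download" or "upload"; each object is identified by
    its content oid (a SHA-256 hash) and size in bytes.
    """
    return json.dumps({
        "operation": operation,
        "transfers": ["basic"],
        "objects": [{"oid": o["oid"], "size": o["size"]} for o in objects],
    })
```

The server replies with, per object, the URLs and headers to use for the actual transfer; parsing and serializing those shapes is exactly the kind of pure code the library consists of.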

This work was sponsored by Jake Vosloo on Patreon.

Russ Allbery: Review: Trail of Lightning

Wed, 21/08/2019 - 5:28 am

Review: Trail of Lightning, by Rebecca Roanhorse

Series: The Sixth World #1
Publisher: Saga
Copyright: 2018
ISBN: 1-5344-1351-0
Format: Kindle
Pages: 286

Maggie Hoskie is a monster hunter. Trained and then inexplicably abandoned by Neizghání, an immortal monster-slayer of her people, the Diné (Navajo), she's convinced that she's half-monster herself. Given that she's the sort of monster hunter who also kills victims that she thinks may be turned into monsters themselves, she may have a point. Apart from contracts to kill things, she stays away from nearly everyone except Tah, a medicine man and nearly her only friend.

The monster that she kills at the start of the book is a sign of a larger problem. Tah says that it was created by someone else using witchcraft. Maggie isn't thrilled at the idea of going after the creator alone, given that witchcraft is what Neizghání rescued her from in an event that takes Maggie most of the book to be willing to describe. Tah's solution is a partner: Tah's grandson Kai, a handsome man with a gift for persuasion who has never hunted a monster before.

If you've read any urban fantasy, you have a pretty good idea of where the story goes from there, and that's a problem. The hair-trigger, haunted kick-ass woman with a dark past, the rising threat of monsters, the protagonist's fear that she's a monster herself, and the growing romance with someone who will accept her is old, old territory. I've read versions of this from Laurell K. Hamilton twenty-five years ago to S.L. Huang's ongoing Cas Russell series. To stand out in this very crowded field, a series needs some new twist. Roanhorse's is the deep grounding in Native American culture and mythology. It worked well enough for many people to make it a Hugo, Nebula, and World Fantasy nominee. It didn't work for me.

I partly blame a throw-away line in Mike Kozlowski's review of this book for getting my hopes up. He said in a parenthetical note that "the book is set in Dinétah, a Navajo nation post-apocalyptically resurgent." That sounded great to me; I'd love to read about what sort of society the Diné might build if given the opportunity following an environmental collapse. Unfortunately, there's nothing resurgent about Maggie's community or people in this book. They seem just as poor and nearly as screwed as they are in our world; everyone else has just been knocked down even farther (or killed) and is kept at bay by magical walls. There's no rebuilding of civilization here, just isolated settlements desperate for water, plagued by local warlords and gangs, and facing the added misery of supernatural threats. It's bleak, cruel, and unremittingly hot, which does not make for enjoyable reading.

What Roanhorse does do is make extensive use of Native American mythology to shape the magic system, creatures, and supernatural world view of the book. This is great. We need a wider variety of magic systems in fantasy, and drawing on mythological systems other than Celtic, Greek, Roman, and Norse is a good start. (Roanhorse herself is Ohkay Owingeh Pueblo, not Navajo, but I assume without any personal knowledge that her research here is reasonably good.) But, that said, the way the mythology plays out in this book didn't work for me. It felt scattered and disconnected, and therefore arbitrary.

Some of the difficulty here is inherent in the combination of my unfamiliarity and the challenge of adopting real-world mythological systems for stories. As an SFF reader, one of the things I like from the world-building is structure. I like seeing how the pieces of the magical system fit together to build a coherent set of rules, and how the protagonists manipulate those rules in the story. Real-world traditions are rarely that neat and tidy. If the reader is already familiar with the tradition, they can fill in a lot of the untold back story that makes the mythology feel more coherent. If the author cannot assume that knowledge, they can get stuck between simplifying and restructuring the mythology for easy understanding or showing only scattered and apparently incoherent pieces of a vast system. I think the complaints about the distorted and simplified version of Celtic mythology in a lot of fantasy novels from those familiar with the real thing is the flip-side to this problem; it's worse mythology, but it may be more approachable storytelling.

I'm sure it didn't help that one of the most important mythological figures of this book is Coyote, a trickster god. I have great intellectual appreciation for the role of trickster gods in mythological systems, but this is yet more evidence that I rarely get along with them in stories. Coyote in this story is less of an unreliable friend and more of a straight-up asshole who was not fun to read about.

That brings me to my largest complaint about this novel: I liked exactly one person in the entire story. Grace, the fortified bar owner, is great and I would have happily read a book about her. Everyone else, including Maggie, ranged from irritating to unbearably obnoxious. I was saying the eight deadly words ("I don't care what happens to these people") by page 100.

Here, tastes will differ. Maggie acts the way that she does because she's sitting on a powder keg of unprocessed emotional injury from abuse, made far worse by Neizghání's supposed "friendship." It's realistic that she shuts down, refuses to have meaningful conversations, and lashes out at everyone on a hair trigger. I felt sympathy, but I didn't like her, and liking her is important when the book is written in very immediate present-tense first person. Kai is better, but he's a bit too much of a stereotype, and I have an aversion to supposedly-charming men. I think some of the other characters could have been good if given enough space (Tah, for instance), but Maggie's endless loop of self-hatred doesn't give them any room to breathe.

Add on what I thought were structural and mechanical flaws (the first-person narration is weirdly specific and detail-oriented in a way that felt like first-novel mechanical problems, and the ending is one of the least satisfying and most frustrating endings I have ever read in a book of this sort) and I just didn't like this. Clearly there are a lot of people nominating and voting for awards who think I'm wrong, so your mileage may vary. But I thought it was unoriginal except for the mythology, unsatisfying in the mythology, and full of unlikable characters and unpleasant plot developments. I'm unlikely to read more in this series.

Followed by Storm of Locusts.

Rating: 4 out of 10

Philipp Kern: Alpha: Self-service buildd givebacks

Wed, 21/08/2019 - 12:54 am
Builds on Debian's build farm sometimes fail transiently. Sometimes those failures are legitimate flakes, for instance when an in-progress build happens to exhaust its resources because of other builds on the same machine. Until now, you always needed to mail the buildd, wanna-build admins or the Release Team directly in order to get the builds re-queued.

As an alpha trial I implemented self-service givebacks as a web script. As SSO for Debian developers is now a thing, it is trivial to add authentication in a way that a role account can use to act on your behalf. While at work this would all be an RPC service, I figured that a little CGI script would do the job just as well. So lo and behold, accessing
https://buildd.debian.org/auth/giveback.cgi?pkg=<package>&suite=<suite>&arch=<arch> with the right parameters set:

You are authenticated as pkern. ✓
Working on package fife, suite sid and architecture mipsel. ✓
Package version 0.4.2-1 in state Build-Attempted, can be given back. ✓
Successfully given back the package. ✓
Note that you need to be a Debian developer with a valid SSO client certificate to access this service.

So why do I say alpha? We still expect Debian developers to act responsibly when looking at build failures. A lot of the time there is a legitimate bug in the package, and the last thing we would like to see as a project is someone addressing flakiness by continuously retrying a build. Access to this service is logged. Most people coming to us today did their due diligence and tried reproducing the issue on a porterbox. We still expect that to happen, but this aims to cut the round-trip time until an admin gets around to processing your request, which has been longer than necessary recently. We will audit the logs and see if particular packages stand out.

There can also still be bugs. Please file them against buildd.debian.org when you see them. Please include a copy of the output, which includes validation and important debugging information when requests are rejected. Also this all only works for packages in Build-Attempted. If the build has been marked as Failed (which is a manual process), you still need to mail us. And lastly the API can still change. Luckily the state change can only happen once, so it's not much of a problem for the GET request to be retried. But it should likely move to POST anyhow. In that case I will update this post to reflect the new behavior.

Thanks to DSA for making sure that I run the service sensibly using a dedicated role account as well as WSGI and doing the work to set up the necessary bits.

Bits from Debian: salsa.debian.org: Postmortem of failed Docker registry move

Tue, 20/08/2019 - 1:20 pm

The Salsa admin team provides the following report about the failed migration of the Docker container registry. The Docker container registry stores Docker images, which are for example used in the Salsa CI toolset. This migration would have moved all data off to Google Cloud Storage (GCS) and would have lowered the used file system space on Debian systems significantly.

The Docker container registry is part of the Docker distribution toolset. This system supports multiple backends for file storage: local, Amazon Simple Storage Service (Amazon S3) and Google Cloud Storage (GCS). As Salsa already uses GCS for data storage, the Salsa admin team decided to move all the Docker registry data off to GCS too.

Migration and rollback

On 2019-08-06 the migration process was started. The migration itself went fine, although it took a bit longer than anticipated. However, as not all parts of the migration had been properly tested, a test of the garbage collection triggered a bug in the software.

On 2019-08-10 the Salsa admins started to see problems with garbage collection. The job running it timed out after one hour. Within this timeframe it did not even manage to collect the information about all used layers needed to see what it could clean up. A source code analysis showed that this design flaw can't be fixed.

On 2019-08-13 the change was rolled back to storing data on the file system.

Docker registry data storage

The Docker registry stores all of its data, without indexing or reverse references, in a file-system-like structure comprised of 4 separate types of information:

  • manifests of images and their contents,
  • tags for the manifests,
  • deduplicated layers (or blobs) which store the actual data, and
  • links which show which deduplicated blobs belong to their respective images.

None of this allows for easy searching within the data.

The file system structure is built as append-only, which allows adding blobs and manifests, and adding, modifying or deleting tags. However, cleanup of items other than tags is not achievable with the maintenance tools.

There is a garbage collection process which can be used to clean up unreferenced blobs, however according to the documentation the process can only be used while the registry is set to read-only and unfortunately it cannot be used to clean up unused links.

Docker registry garbage collection on external storage

For garbage collection the registry tool needs to read a lot of information, as there is no indexing of the data. The tool connects to the storage medium and proceeds to download … everything, every single manifest and the information about the referenced blobs, and it now takes over 1 second to process a single manifest. This process would take a significant amount of time, which in the current configuration of external storage makes the cleanup nearly impossible.
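As a toy model of my own (not the registry's actual code), the lack of a reverse index means every manifest must be read before any blob can be declared garbage:

```python
# Toy model: manifests maps a manifest digest to the list of blob
# digests it references. With no reverse index from blobs to manifests,
# no blob can be deleted until *all* manifests have been read.
def unreferenced_blobs(manifests: dict, blobs: set) -> set:
    referenced = set()
    for blob_list in manifests.values():  # one (slow) read per manifest
        referenced.update(blob_list)
    return blobs - referenced
```

At over a second per manifest read, the loop over all manifests is exactly the part that blows past the one-hour job timeout.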

Lessons learned

The Docker registry is a data storage tool that can only properly be used in append-only mode. If you never cleanup, it works well.

As soon as you want to actually remove data, it goes bad. For Salsa clean up of old data is actually a necessity, as the registry currently grows about 20GB per day.

Next steps

Sadly there is not much that can be done using the existing Docker container registry. Maybe GitLab or someone else would like to contribute a new implementation of a Docker registry, either integrated into GitLab itself or stand-alone?

Raphaël Hertzog: Promoting Debian LTS with stickers, flyers and a video

Tue, 20/08/2019 - 12:45 pm

With the agreement of the Debian LTS contributors funded by Freexian, earlier this year I decided to spend some Freexian money on marketing: we sponsored DebConf 19 as a bronze sponsor and we prepared some stickers and flyers to give out during the event.

The stickers only promote the Debian LTS project with the semi-official logo we have been using and a link to the wiki page. You can see them on the back of a laptop in the picture below. As you can see, we have made two variants with different background colors:

The flyers and the video are meant to introduce the Debian LTS project and to convince companies to sponsor it through the Freexian offer. Those are short documents and they can't explain the precise relationship between Debian LTS and Freexian. We try to show that Freexian is just an intermediary between contributors and companies, but some people will still have the feeling that a commercial entity is organizing Debian LTS.

Check out the video on YouTube:

The inside of the flyer looks like this:

Click on the picture to see it full size

Note that due to some delivery issues, we have left-over flyers and stickers. If you want some to give out during a free software event, feel free to reach out to me.


Raphaël Hertzog: Freexian’s report about Debian Long Term Support, July 2019

Tue, 20/08/2019 - 11:38 am

Like each month, here comes a report about the work of paid contributors to Debian LTS.

Individual reports

In July, 199 work hours have been dispatched among 13 paid contributors. Their reports are available:

  • Adrian Bunk got 8h assigned but did nothing (plus 10 extra hours from June), thus he is carrying over 18h to August.
  • Ben Hutchings did 18.5 hours (out of 18.5 hours allocated).
  • Brian May did 10 hours (out of 10 hours allocated).
  • Chris Lamb did 18 hours (out of 18 hours allocated).
  • Emilio Pozuelo Monfort did 21 hours (out of 18.5h allocated + 17h remaining, thus keeping 14.5 extra hours for August).
  • Hugo Lefeuvre did 9.75 hours (out of 18.5 hours, thus carrying over 8.75h to August).
  • Jonas Meurer did 19 hours (out of 17 hours allocated plus 2 extra hours from June).
  • Markus Koschany did 18.5 hours (out of 18.5 hours allocated).
  • Mike Gabriel did 15.75 hours (out of 18.5 hours allocated plus 7.25 extra hours from June, thus carrying over 10h to August).
  • Ola Lundqvist did 0.5 hours (out of 8 hours allocated plus 8 extra hours from June, then he gave 7.5h back to the pool, thus he is carrying over 8 extra hours to August).
  • Roberto C. Sanchez did 8 hours (out of 8 hours allocated).
  • Sylvain Beucler did 18.5 hours (out of 18.5 hours allocated).
  • Thorsten Alteholz did 18.5 hours (out of 18.5 hours allocated).
Evolution of the situation

July was different from other months. First, some people were on actual vacations, while 4 of the above 13 contributors met in Curitiba, Brazil, for DebConf19. There, a talk about LTS (slides, video) was given, followed by a Q&A session. A new promotional video about Debian LTS, aimed at potential sponsors, was also shown there for the first time.

DebConf19 was also a success with respect to on-boarding new contributors: we found three potential new contributors, one of whom is already in training.

The security tracker (now for oldoldstable as Buster has been released and thus Jessie became oldoldstable) currently lists 51 packages with a known CVE and the dla-needed.txt file has 35 packages needing an update.

Thanks to our sponsors

New sponsors are in bold.


Dirk Eddelbuettel: RcppQuantuccia 0.0.3

Tue, 20/08/2019 - 2:45am

A maintenance release of RcppQuantuccia arrived on CRAN earlier today.

RcppQuantuccia brings the Quantuccia header-only subset / variant of QuantLib to R. At the current stage, it mostly offers date and calendaring functions.

This release was triggered by some work CRAN is doing on updating C++ standards for code in the repository. Notably, under C++11 some constructs such as ptr_fun, bind1st, bind2nd, … are now deprecated, and CRAN prefers the code base to not issue such warnings (as e.g. now seen under clang++-9). So we updated the corresponding code in a good dozen or so places to the (more current and compliant) code from QuantLib itself.

We also took this opportunity to significantly reduce the footprint of the sources and the installed shared library of RcppQuantuccia. One (unexported) feature was pricing models via Brownian Bridges based on quasi-random Sobol sequences. But the main source file for these sequences comes in at several megabytes in size, and allocates a large number of constants. So in this version the file is excluded, making the current build of RcppQuantuccia lighter in size and more suitable for the (simpler, popular and trusted) calendar functions. We also added a new holiday to the US calendar.

The complete list of changes follows.

Changes in version 0.0.3 (2019-08-19)
  • Updated Travis CI test file (#8).

  • Updated US holiday calendar data with G H Bush funeral date (#9).

  • Updated C++ use to not trigger warnings [CRAN request] (#9).

  • Comment-out pragmas to suppress warnings [CRAN Policy] (#9).

  • Change build to exclude Sobol sequence reducing file size for source and shared library, at the cost of excluding market models (#10).

Courtesy of CRANberries, there is also a diffstat report relative to the previous release. More information is on the RcppQuantuccia page. Issues and bugreports should go to the GitHub issue tracker.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Jaskaran Singh: GSoC Final Report

Tue, 20/08/2019 - 2:00am
Introduction:

The Debian Patch Porting System aims to systematize and partially automate the security patch porting process.

In this Google Summer of Code (2019), I wrote a webcrawler to extract security patches for a given security vulnerability identifier. This webcrawler or patch-finder serves as the first step of the Debian Patch Porting System.

The Patch-finder should recognize numerous vulnerability identifiers. These identifiers can be security advisories (DSA, GLSA, RHSA), vulnerability identifiers (OVAL, CVE), etc. So far, it can identify CVE, DSA (Debian Security Advisory), GLSA (Gentoo Linux Security Advisory) and RHSA (Red Hat Security Advisory).

Each vulnerability identifier has a list of entrypoint URLs associated with it. These URLs are used to initiate the patch finding.

Vulnerabilities that are not CVEs are generic vulnerabilities. If a generic vulnerability is given, its “aliases” (i.e. the CVEs related to the generic vulnerability) are determined. This method was chosen because CVEs are quite possibly the most widely used vulnerability identifiers and thus have the most patches associated with them. Once the aliases are determined, the entrypoint URLs of the aliases are crawled for patch-finding.

The Patch-finder is based on the web crawling and scraping framework Scrapy.
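The alias-resolution step described above can be sketched in a few lines of plain Python. This is purely an illustrative reconstruction, not the patch-finder's actual code: the entrypoint URL template and function names are assumptions, and the real project delegates crawling to Scrapy.

```python
import re

# Illustrative sketch: given the text of a generic advisory page
# (DSA/GLSA/RHSA), collect the CVE identifiers it mentions and expand
# them into entrypoint URLs to crawl. The URL template below is an
# assumption, not the patch-finder's actual configuration.

CVE_RE = re.compile(r"CVE-\d{4}-\d{4,}")

ENTRYPOINTS = {
    "CVE": ["https://security-tracker.debian.org/tracker/{id}"],
}

def find_aliases(advisory_text):
    """Return the unique CVE aliases mentioned in an advisory, in order."""
    seen, aliases = set(), []
    for cve in CVE_RE.findall(advisory_text):
        if cve not in seen:
            seen.add(cve)
            aliases.append(cve)
    return aliases

def entrypoint_urls(aliases):
    """Expand each CVE alias into the entrypoint URLs to crawl."""
    return [tpl.format(id=cve)
            for cve in aliases
            for tpl in ENTRYPOINTS["CVE"]]
```

In the real system these URLs would then be handed to the Scrapy spider as start requests.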

What was done:

During these three months, I have:

  • Used Scrapy to implement a spider to collect patch links.
  • Implemented a recursive patch-finding process. Any links that the patch-finder finds on a page (in a certain area of interest, of course) that are not patch links are followed.
  • Implemented a crawler to extract patches from Debian Packages.
  • Implemented a crawler to extract patches from a given GitHub repository.

Here’s a link to the patch-finder’s Github Repository which I have used for GSoC.

TODO:

There is a lot more stuff to be done, from solving small bugs to implementing major features. Some of these issues are on the project’s GitHub issue tracker here. Following is a summary of these issues and a few more ideas:

  • A way to uniquely identify patches. This is so that the same patches are not scraped and collected.
  • A Database, and a corresponding database API.
  • Store patches in the database, along with any other information.
  • Collect not only patches but other information relevant to the vulnerability.
  • Integrate the Github crawler/parser in the crawling process.
  • A way to check the relevancy of the patch to the vulnerability. A naive solution is, of course, to simply check for mention of the vulnerability ID in the patch description.
  • Efficient page filters. Certain links should not be crawled because it is obvious they will not yield any patches, for example homepages.
  • A better way to scrape links, rather than using a URL’s corresponding xpath.
  • A more efficient testing framework.
  • More crawlers/parsers.
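The first TODO item, uniquely identifying patches, can naively be addressed by hashing a normalized form of the patch body, so that the same patch fetched from two different sites collapses to one key. A minimal sketch; the normalization rules here are chosen purely for illustration and are not part of the project:

```python
import hashlib

def patch_key(patch_text):
    """Content key for deduplication: keep only the diff hunks themselves
    (added/removed lines and hunk headers), dropping metadata lines such
    as 'Index:' or 'diff --git' that vary between hosting sites."""
    significant = [
        line for line in patch_text.splitlines()
        if line.startswith(("+", "-", "@@"))
    ]
    digest = hashlib.sha256("\n".join(significant).encode("utf-8"))
    return digest.hexdigest()
```

Two copies of the same patch with different headers then hash to the same key, which is enough to skip re-scraping duplicates.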
Personal Notes:

Google Summer of Code has been a super comfortable and fun experience for me. I’ve learnt tonnes about Python, Open Source and Software Development. My mentors Luciano Bello and László Böszörményi have been immensely helpful and have guided me through these three months.

I plan to continue working on this project and hopefully develop it to a state where Debian and everyone who needs it can use it conveniently.

Jonathan Dowland: Shared notes and TODO lists

Mon, 19/08/2019 - 9:55pm

When it comes to organising myself, I've long been anachronistic. I've relied upon paper notebooks for most of my life. In the last 15 years I've stuck to a particular type of diary/notebook hybrid, with a week-to-view on the left-hand side of pages and lined notebook pages on the right.

This worked well for me for my own personal stuff but obviously didn't work well for family things that need to be shared. Trying to find systems that work for both my wife and me has proven really challenging. The best we've come up with so far is a shared (IMAP) account and Apple's notes apps.

On iOS, Apple's low-frills note-taking app lets you synchronise your notes with a mail account (over IMAP). It stores them individually in HTML format, one email per note page, in a mailbox called "Notes". You can set up note syncing to the same account from multiple devices, and so we have a "family" mailbox set up on both my phone and my wife's. I can also get at the notes using any other mail client if I need to.
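Because each note is just one HTML message in the "Notes" mailbox, any scripting language's mail tooling can read them. Here is a small sketch of extracting the text of one such note; the message structure is assumed from the description above rather than from any Apple documentation, and fetching the raw message itself would be a standard `imaplib` SELECT/FETCH against the shared account:

```python
import email
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collect the text content of an HTML note body."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def note_text(raw_message_bytes):
    """Return the plain text of one note fetched from the Notes mailbox."""
    msg = email.message_from_bytes(raw_message_bytes)
    body = msg.get_payload(decode=True).decode(
        msg.get_content_charset() or "utf-8")
    extractor = _TextExtractor()
    extractor.feed(body)
    return "".join(extractor.chunks)
```

This is the nice property of the IMAP-backed approach: the data stays accessible with completely ordinary tools.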

This works surprisingly well, but not perfectly. In particular synchronising changes to notes can go wrong if we both have the same note page open at the same time. The failure mode is not the worst: it duplicates the note into two; but it's still a problem.

Can anyone recommend a simple, more robust system for sharing notes — and task lists — between people? For task lists, it would be lovely (but not essential) if we could tick things off. At the moment we manage that just as free-form text.

Holger Levsen: 20190818-cccamp

Mon, 19/08/2019 - 8:49pm
Home again

Two days ago I finally arrived home again and was greeted with this very nice view when entering the area:

(These images were taken yesterday from inside the venue.)

To give an idea of scale, the Pesthörnchen flag on top is 2m wide

Since today, there's also a rainbow flag next to the Pesthörnchen one. I'm very much looking forward to the next days, though buildup is big fun already.

Antoine Beaupré: KNOB attack: Is my Bluetooth device insecure?

Mon, 19/08/2019 - 7:58pm

A recent attack against Bluetooth, called KNOB, has been making waves last week. In essence, it allows an attacker to downgrade the security of a Bluetooth connection so much that it's possible for the attacker to break the encryption key and spy on all the traffic. The attack is so devastating that some have described it as the "stop using bluetooth" flaw.

This is my attempt at answering my own lingering questions about "can I still use Bluetooth now?" Disclaimer: I'm not an expert in Bluetooth at all, and just base this analysis on my own (limited) knowledge of the protocol, and some articles (including the paper) I read on the topic.

Is Bluetooth still safe?

It really depends what "safe" means, and what your threat model is. I liked how the Ars Technica article put it:

It's also important to note the hurdles—namely the cost of equipment and a surgical-precision MitM—that kept the researchers from actually carrying out their over-the-air attack in their own laboratory. Had the over-the-air technique been easy, they almost certainly would have done it.

In other words, the active attack is really hard to do, and the researchers didn't actually do one at all! It's a theoretical flaw, at this point, and while it's definitely possible, it's not what the researchers did:

The researchers didn't carry out the man-in-the-middle attack over the air. They did, however, root a Nexus 5 device to perform a firmware attack. Based on the response from the other device—a Motorola G3—the researchers said they believe that both attacks would work.

This led some researchers to (boldly) say they would still use a Bluetooth keyboard:

Dan Guido, a mobile security expert and the CEO of security firm Trail of Bits, said: "This is a bad bug, although it is hard to exploit in practice. It requires local proximity, perfect timing, and a clear signal. You need to fully MitM both peers to change the key size and exploit this bug. I'm going to apply the available patches and continue using my bluetooth keyboard."

So, what's safe and what's not, in my much humbler opinion?

Keyboards: bad

The attack is a real killer for Bluetooth keyboards. If an active attack is leveraged, it's game over: everything you type is visible to the attacker, and that includes, critically, passwords. In theory, one could even input keyboard events into the channel, which allows basically arbitrary code execution on the host.

Some, however, made the argument that it's probably easier to implant a keylogger in the device than actually do that attack, but I disagree: this requires physical access, while the KNOB attack can be done remotely.

How far this can be done, by the way, is still open to debate. The Telegraph claimed "a mile" in a click-bait title, but I think such an attacker would need to be much closer for this to work, more in the range of "meters" than "kilometers". But it still means "a black van sitting outside your house" instead of "a dude breaking into your house", which is a significant difference.

Other input devices: hum

I'm not sure mice and other input devices are such a big deal, however. Extracting useful information from those mice moving around the screen is difficult without seeing what's behind that screen.

So unless you use an on-screen keyboard or have special input devices, I don't think those are such a big deal when spied upon.

They could be leveraged with other attacks to make you "click through" some things an attacker would otherwise not be able to do.

Speakers: okay

I think I'll still keep using my Bluetooth speakers. But that's because I don't have much confidential audio I listen to. I listen to music, movies, and silly cat videos; not confidential interviews with victims of repression that should absolutely have their identities protected. And if I ever come across such material, I now know that I should not trust that speaker.

Otherwise, what's an attacker going to do here: listen to my (ever decreasing) voicemail (which is transmitted in cleartext by email anyways)? Listen to that latest hit? Meh.

Do keep in mind that some speakers have microphones in them as well, so that's not the entire story...

Headsets and microphones: hum

Headsets and microphones are another beast, as they can listen to other things in your environment. I do feel much less comfortable using those devices now. What makes the entire thing really iffy is that some speakers do have microphones in them, and all of a sudden everything around you can listen in on your entire life.

(It seems like a given, with "smart home assistants" these days, but I still like to think my private conversations at home are private, in general. And I generally don't want to be near any of those "smart" devices, to be honest.)

One mitigating circumstance here is that the attack needs to happen during the connection (or pairing? still unclear) negotiation, which doesn't happen that often if everything works correctly. Unfortunately, this happens more than often exactly with speakers and headsets. That's because many of those devices stupidly have low limits on the number of devices they can pair with. For example, the Bose Soundlink II can only pair with 8 other devices. If you count three devices per person (laptop, workstation, phone), you quickly hit the limit when you move the device around. So I end up re-pairing that device quite often.

And that would be if the attack takes place during the pairing phase. As it turns out, the attack window is much wider: the attack happens during the connection stage (see Figure 1, page 1049 in the paper), after devices have paired. This actually happens way more often than just during pairing. Any time your speaker or laptop goes to sleep, it disconnects. Then, to start using the device again, the BT layer will renegotiate that key size, and the attack can happen again.

(I have written the authors of the paper to clarify at which stage the attack happens and will update this post when/if they reply. Update: Daniele Antonioli has confirmed the attack takes place at connect phase.)

Fortunately, the Bose Soundlink II has no microphone, which I'm thankful for. But my Bluetooth headset does have a microphone, which makes me less comfortable.

File and contact transfers: bad

Bluetooth, finally, is also used to transfer stuff other than audio of course. It's clunky, weird and barely working, but it's possible to send files over Bluetooth, and some headsets and car controllers will ask you permission to list your contacts so that "smart" features like "OK Google, call dad please" will work.

This attack makes it possible for an attacker to steal your contacts, when connecting devices. It can also intercept file transfers and so on.

That's pretty bad, to say the least.

Unfortunately, the "connection phase" mitigation described above is less relevant here. It's less likely you'll be continuously connecting two phones (or your phone and laptop) together for the purpose of file transfers. What's more likely is you'll connect the devices for explicit purpose of the file transfer, and therefore an attacker has a window for attack at every transfer.

I don't really use the "contacts" feature anyways (because it creeps me the hell out in the first place), so that's not a problem for me. But the file transfer problem will certainly give me pause the next time I ever need to feel the pain of transferring files over Bluetooth again, which I hope is "never".

It's interesting to note the parallel between this flaw, which will mostly affect Android file transfers, and the recent disclosure of flaws in Apple's Airdrop protocol, which was similarly believed to be secure even though it was opaque and proprietary. Now, think a bit about how Airdrop uses Bluetooth to negotiate part of the protocol, and you can feel, like I do, that everything in security just keeps crashing down and we don't seem to be able to make any progress at all.

Overall: meh

I've always been uncomfortable with Bluetooth devices: the pairing process has no sort of authentication whatsoever. The best you get is to enter a pin, and it's often "all zeros" or some trivially easy thing to bruteforce. So Bluetooth security has always felt like a scam, and I especially never trusted keyboards with passwords, in particular.

Like many branded attacks, I think this one might be somewhat overstated. Yes, it's a powerful attack, but Bluetooth implementations are already mostly proprietary junk that is undecipherable from the opensource world. There are no or very few open hardware implementations, so it's somewhat expected that we find things like this.

I have also found the response from the Bluetooth SIG particularly alarming:

To remedy the vulnerability, the Bluetooth SIG has updated the Bluetooth Core Specification to recommend a minimum encryption key length of 7 octets for BR/EDR connections.

7 octets is 56 bits. That's the equivalent of DES, which was broken in 56 hours over 20 years ago. That's far from enough. But what's more disturbing is that this key size negotiation protocol might be there "because 'some' governments didn't want other governments to have stronger encryption", i.e. it would be a backdoor.
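To put 7 octets in perspective, the keyspace arithmetic is easy to check. The brute-force rate below is an assumption chosen only to give a rough order of magnitude, not a claim about any particular attacker's hardware:

```python
# 7 octets = 56 bits, the effective strength of DES.
keyspace_56 = 2 ** 56   # 72,057,594,037,927,936 possible keys
keyspace_128 = 2 ** 128  # what a full-strength 16-octet key would give

# At an assumed rate of one trillion keys per second, exhausting the
# 56-bit space takes on the order of a day:
seconds = keyspace_56 / 1e12
hours = seconds / 3600
print(round(hours))  # roughly 20 hours

# The 128-bit space is 2**72 times larger, which is why the minimum
# key length, not just the algorithm, determines the real security.
```

A dedicated cracking rig from the late 1990s (the EFF's DES cracker) already searched this size of keyspace in days, so the margin has only gotten worse since.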

The 7-byte lower bound might also be there because of Apple lobbying. Their AirPods were implemented as not-standards-compliant and already have that lower 7-byte bound, so by fixing the standard to match one Apple implementation, they would reduce the cost of their recall/replacements/upgrades.

Overall, this behavior of the standards body is what should make us suspicious of any Bluetooth device going forward, and question the motivations of the entire Bluetooth standardization process. We can't use 56-bit keys anymore, and I can't believe I need to explicitly say so, but it seems that's where we're at with Bluetooth these days.

Jonathan Dowland: NAS upgrade

Mon, 19/08/2019 - 5:13pm

After 5 years of continuous service, the mainboard in my NAS recently failed (at the worst possible moment). I opted to replace the mainboard with a more modern version of the same idea: ASRock J4105-ITX featuring the Intel J4105, an integrated J-series Celeron CPU, designed to be passively cooled, and I've left the rest of the machine as it was.

In the process of researching which CPU/mainboard to buy, I was pointed at the Odroid-H2: a single-board computer (SBC) designed/marketed at a similar sector to things like the Raspberry Pi (but featuring the exact same CPU as the mainboard I eventually settled on). I've always felt that the case I'm using for my NAS is too large, but didn't want to spend much money on a smaller one. The Odroid-H2 has a number of cheap, custom-made cases for different use-cases, including one for NAS-style work, which is in a very small footprint: the "Case 1". Unfortunately this case positions two disk drives flat, one vertically above the other, and both above the SBC. I was too concerned that one drive would be heating the other, and cumulatively both heating the SBC at that orientation. The case is designed with a fan but I want to avoid requiring one. I had too many bad memories of trying to control the heat in my first NAS, the Thecus n2100, which (by default) oriented the drives in the same way (and for some reason it never occurred to me to rotate that device into the "toaster" orientation).

I've mildly revised my NAS page to reflect the change. Interestingly most of the niggles I was experiencing were all about the old mainboard, so I've moved them on a separate page (J1900N-D3V) in case they are useful to someone.

At some point in the future I hope to spend a little bit of time on the software side of things, as some of the features of my set up are no longer working as they should: I can't remote-decrypt the main disk via SSH on boot, and the first run of any backup fails due to some kind of race condition in the systemd unit dependencies. (The first attempt does not correctly mount the backup partition; the second attempt always succeeds).

Russ Allbery: Review: Spinning Silver

Mon, 19/08/2019 - 5:07am

Review: Spinning Silver, by Naomi Novik

Publisher: Del Rey Copyright: 2018 ISBN: 0-399-18100-8 Format: Kindle Pages: 465

Miryem is the daughter of the village moneylender and the granddaughter (via her mother) of a well-respected moneylender in the city. Her grandfather is good at his job. Her father is not. He's always willing to loan the money out, but collecting it is another matter, and the village knows that and takes advantage of it. Each year is harder than the one before, in part because they have less and less money and in part because the winter is getting harsher and colder. When Miryem's mother falls ill, that's the last straw: she takes her father's ledger and goes to collect the money her family is rightfully owed.

Rather to her surprise, she's good at the job in all the ways her father is not. Daring born of desperation turns into persistent, cold anger at the way her family had been taken advantage of. She's good with numbers, has an eye for investments, and is willing to be firm and harden her heart where her father was not. Her success leads to good food, a warmer home, and her mother's recovery. It also leads to the attention of the Staryk.

The Staryk are the elves of Novik's world. They claim everything white in the forest, travel their own mysterious ice road, and raid villages when they choose. And, one night, one of the Staryk comes to Miryem's house and leaves a small bag of Staryk silver coins, challenging her to turn them into the gold the Staryk value so highly.

This is just the start of Spinning Silver, and Miryem is only one of a broadening cast. She demands the service of Wanda and her younger brother as payment for their father's debt, to the delight (hidden from Miryem) of them both since this provides a way to escape their abusive father. The Staryk silver becomes jewelry with surprising magical powers, which Miryem sells to the local duke for his daughter. The duke's daughter, in turn, draws the attention of the czar, who she met as a child when she found him torturing squirrels. And Miryem finds herself caught up in the world of the Staryk, which works according to rules that she can barely understand and may be a trap that she cannot escape.

Novik makes a risky technical choice in this book and pulls it off beautifully: the entirety of Spinning Silver is written in first person with frequently shifting narrators that are not signaled outside of the text. I think there were five different narrators in total, and I may be forgetting some. Despite that, I was never confused for more than a paragraph about who was speaking due to Novik's command of the differing voices. Novik uses this to great effect to show the inner emotions and motivations of the characters without resorting to the distancing effect of wandering third-person.

That's important for this novel because these characters are not emotionally forthcoming. They can't be. Each of them is operating under sharp constraints that make too much emotion unsafe: Wanda and her brother are abused, the Duke's daughter is valuable primarily as a political pawn and later is juggling the frightening attention of the czar, and Miryem is carefully preserving an icy core of anger against her parents' ineffectual empathy and is trying to navigate the perilous and trap-filled world of the Staryk. The caution and occasional coldness of the characters does require the reader do some work to extrapolate emotions, but I thought the overall effect worked.

Miryem's family is, of course, Jewish. The nature of village interactions with moneylenders make that obvious before the book explicitly states it. I thought Novik built some interesting contrasts between Miryem's navigation of the surrounding anti-Semitism and her navigation of the rules of the Staryk, which start off as far more alien than village life but become more systematic and comprehensible than the pervasive anti-Semitism as Miryem learns more. But I was particularly happy that Novik includes the good as well as the bad of Jewish culture among unforgiving neighbors: a powerful sense of family, household religious practices, Jewish weddings, and a cautious but very deep warmth that provides the emotional core for the last part of the book.

Novik also pulls off a rare feat in the plot structure by transforming most of the apparent villains into sympathetic characters and, unlike The Song of Ice and Fire, does this without making everyone awful. The Staryk, the duke, and even the czar are obvious villains on first appearances, but in each case the truth is more complicated and more interesting. The plot of Spinning Silver is satisfyingly complex and ever-changing, with just the right eventual payoffs for being a good (but cautious and smart!) person.

There were places when Spinning Silver got a bit bleak, such as when the story lingered a bit too long on Miryem trying and failing to navigate the Staryk world while getting herself in deeper and deeper, but her core of righteous anger and the protagonists' careful use of all the leverage that they have carried me through. The ending is entirely satisfying and well worth the journey. Recommended.

Rating: 8 out of 10

Markus Koschany: My Free Software Activities in July 2019

Mon, 19/08/2019 - 12:19am

Welcome to gambaru.de. Here is my monthly report that covers what I have been doing for Debian. If you’re interested in Java, Games and LTS topics, this might be interesting for you.

DebConf 19 in Curitiba

I have been attending DebConf 19 in Curitiba, Brazil from 16.7.2019 to 28.7.2019. I gave two talks about games in Debian and the Long Term Support project, together with Hugo Lefeuvre, Chris Lamb and Holger Levsen. Especially the Games talk had some immediate positive impact. In response to it Reiner Herrmann and Giovanni Mascellani provided patches for release critical bugs related to GCC-9 and the Python 2 removal and we could already fix some of the more important problems for our current release cycle.

I had a lot of fun in Brazil and again met a couple of new and interesting people.  Thanks to all who helped organizing DebConf 19 and made it the great event it was!

Debian Games
  • We are back in business, which means packaging new upstream versions of popular games. I packaged new versions of atomix, dreamchess and pygame-sdl2,
  • uploaded minetest 5.0.1 to unstable and backported it later to buster-backports,
  • uploaded new versions of freeorion and warzone2100 to Buster,
  • fixed bug #931415 in freeciv and #925866 in xteddy,
  • became the new uploader of enemylines7.
  • I reviewed and sponsored patches from Reiner Herrmann to port several games to python3-pygame including whichwayisup, funnyboat and monsterz,
  • from Giovanni Mascellani ember and enemylines7.
Debian Java
  • I packaged new upstream versions of robocode, jboss-modules, jboss-jdeparser2, wildfly-common, commons-dbcp2, jboss-logging-tools, jboss-logmanager, libpdfbox2-java, jboss-logging, jboss-xnio, libjide-oss-java, sweethome3d, sweethome3d-furniture, pdfsam, libsambox-java, libsejda-java, jackson-jr, jackson-dataformat-xml, libsmali-java and apktool.
Misc
  • I updated the popular Firefox/Chromium addons ublock-origin, https-everywhere and privacybadger and also packaged new upstream versions of wabt and binaryen which are both required for building webassembly files from source.
Debian LTS

This was my 41st month as a paid contributor and I have been paid to work 18.5 hours on Debian LTS, a project started by Raphaël Hertzog. In that time I did the following:

  • DLA-1854-1. Issued a security update for libonig fixing 1 CVE.
  • DLA-1860-1. Issued a security update for libxslt fixing 4 CVEs.
  • DLA-1846-2. Issued a regression update for unzip to address a Firefox build failure.
  • DLA-1873-1. Issued a security update for proftpd-dfsg fixing 1 CVE.
  • DLA-1886-1. Issued a security update for openjdk-7 fixing 4 CVEs.
  • DLA-1890-1. Issued a security update for kde4libs fixing 1 CVE.
  • DLA-1891-1. Reviewed and sponsored a security update for openldap fixing 2 CVEs prepared by Ryan Tandy.
ELTS

Extended Long Term Support (ELTS) is a project led by Freexian to further extend the lifetime of Debian releases. It is not an official Debian project but all Debian users benefit from it without cost. The current ELTS release is Debian 7 "Wheezy". This was my fourteenth month and I have been paid to work 15 hours on ELTS.

  • I was in charge of our ELTS frontdesk from 15.07.2019 until 21.07.2019 and I triaged CVEs in openjdk7, libxslt, libonig, php5, wireshark, python2.7, libsdl1.2, patch, suricata and libssh2.
  • ELA-143-1. Issued a security update for libonig fixing 1 CVE.
  • ELA-145-1. Issued a security update for libxslt fixing 2 CVEs.
  • ELA-151-1. Issued a security update for linux fixing 3 CVEs.
  • ELA-154-1. Issued a security update for openjdk-7 fixing 4 CVEs.

Thanks for reading and see you next time.

Michael Stapelberg: Linux distributions: Can we do without hooks and triggers?

Sat, 17/08/2019 - 6:47pm

Hooks are an extension feature provided by all package managers that are used in larger Linux distributions. For example, Debian uses apt, which has various maintainer scripts. Fedora uses rpm, which has scriptlets. Different package managers use different names for the concept, but all of them offer package maintainers the ability to run arbitrary code during package installation and upgrades. Example hook use cases include adding daemon user accounts to your system (e.g. postgres), or generating/updating cache files.

Triggers are a kind of hook which run when other packages are installed. For example, on Debian, the man(1) package comes with a trigger which regenerates the search database index whenever any package installs a manpage. When, for example, the nginx(8) package is installed, a trigger provided by the man(1) package runs.

Over the past few decades, Open Source software has become more and more uniform: instead of each piece of software defining its own rules, a small number of build systems are now widely adopted.

Hence, I think it makes sense to revisit whether offering extension via hooks and triggers is a net win or net loss.

Hooks preclude concurrent package installation

Package managers can generally make very few assumptions about what hooks do, what preconditions they require, and which conflicts might be caused by running multiple packages' hooks concurrently.

Hence, package managers cannot concurrently install packages. At least the hook/trigger part of the installation needs to happen in sequence.

While it seems technically feasible to retrofit package manager hooks with concurrency primitives such as locks for mutual exclusion between different hook processes, the required overhaul of all hooks¹ seems like such a daunting task that it might be better to just get rid of the hooks instead. Only deleting code frees you from the burden of maintenance, automated testing and debugging.

① In Debian, there are 8620 non-generated maintainer scripts, as reported by find shard*/src/*/debian -regex ".*\(pre\|post\)\(inst\|rm\)$" on a Debian Code Search instance.

Triggers slow down installing/updating other packages

Personally, I never use the apropos(1) command, so I don’t appreciate the man(1) package’s trigger which updates the database used by apropos(1). The process takes a long time and, because hooks and triggers must be executed serially (see previous section), blocks my installation or update.

When I tell people this, they are often surprised to learn about the existence of the apropos(1) command. I suggest adopting an opt-in model.

Unnecessary work if programs are not used between updates

Hooks run when packages are installed. If a package’s contents are not used between two updates, running the hook in the first update could have been skipped. Running the hook lazily when the package contents are used reduces unnecessary work.

As a welcome side-effect, lazy hook evaluation automatically makes the hook work in operating system images, such as live USB thumb drives or SD card images for the Raspberry Pi. Such images must not ship the same crypto keys (e.g. OpenSSH host keys) to all machines, but instead generate a different key on each machine.

Why do users keep packages installed that they don’t use? It’s extra work to remember and clean up those packages after use. Plus, users might not realize, or value, that having fewer packages installed has benefits such as faster updates.

I can also imagine that there are people for whom the cost of re-installing packages incentivizes them to just keep packages installed—you never know when you might need the program again…

Implemented in an interpreted language

While working on hermetic packages (more on that in another blog post), where the contained programs are started with modified environment variables (e.g. PATH) via a wrapper bash script, I noticed that the overhead of those wrapper bash scripts quickly becomes significant. For example, when using the excellent magit interface for Git in Emacs, I encountered second-long delays² when using hermetic packages compared to standard packages. Re-implementing wrappers in a compiled language provided a significant speed-up.
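The wrappers in question look roughly like the sketch below: adjust the environment, then hand off to the real binary. The directory layout here is a temporary stand-in; the point is that every invocation pays for an extra shell startup, which a compiled wrapper avoids.

```shell
# Build a throwaway "real" program and a wrapper around it, mirroring how a
# hermetic package wraps its binaries (paths are stand-ins, not a real layout).
DIR=$(mktemp -d)
printf '#!/bin/sh\necho real-git "$@"\n' > "$DIR/git.real"
cat > "$DIR/git" <<EOF
#!/bin/sh
export PATH="$DIR:\$PATH"   # point the program at its private dependencies
exec "$DIR/git.real" "\$@"  # exec avoids a lingering shell, not its startup cost
EOF
chmod +x "$DIR/git" "$DIR/git.real"

OUT=$("$DIR/git" status)
```

Spawning a shell per invocation is cheap in isolation, but a tool that is executed dozens of times per user action (as magit does with git) multiplies that cost.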

Similarly, getting rid of an extension point which mandates using shell scripts allows us to build an efficient and fast implementation of a predefined set of primitives, where you can reason about their effects and interactions.

② magit needs to run git a few times for displaying the full status, so small overhead quickly adds up.

Incentivizing more upstream standardization

Hooks are an escape hatch for distribution maintainers to express anything which their packaging system cannot express.

Distributions should only rely on well-established interfaces such as autoconf’s classic ./configure && make && make install (including commonly used flags) to build a distribution package. Integrating upstream software into a distribution should not require custom hooks. For example, instead of requiring a hook which updates a cache of schema files, the library used to interact with those files should transparently (re-)generate the cache or fall back to a slower code path.
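The schema-cache case can be sketched as follows: the library checks for its cache and regenerates it on the slow path, so no install-time hook is needed. The layout and file names are hypothetical; a real implementation would also compare mtimes to detect staleness rather than only checking existence.

```shell
# Hypothetical schema directory with one schema file in it.
SCHEMAS=$(mktemp -d)
CACHE="$SCHEMAS/compiled.cache"
echo 'schema-a' > "$SCHEMAS/a.schema"

load_schemas() {
  if [ ! -e "$CACHE" ]; then
    cat "$SCHEMAS"/*.schema > "$CACHE"   # slow path: (re)generate the cache
  fi
  cat "$CACHE"                           # fast path on subsequent calls
}

FIRST=$(load_schemas)   # first call regenerates, later calls just read
```

With this fallback in the library, installing a package that ships new schema files requires no hook at all — the cache catches up on first use.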

Distribution maintainers are hard to come by, so we should value their time. In particular, there is a 1:n relationship of packages to distribution package maintainers (software is typically available in multiple Linux distributions), so it makes sense to spend the work in the 1 and have the n benefit.

Can we do without them?

If we want to get rid of hooks, we need another mechanism to achieve what we currently achieve with hooks.

If the hook is not specific to the package, it can be moved to the package manager. The desired system state should either be derived from the package contents (e.g. required system users can be discovered from systemd service files) or declaratively specified in the package build instructions—more on that in another blog post. This turns hooks (arbitrary code) into configuration, which allows the package manager to collapse and sequence the required state changes. E.g., when 5 packages are installed which each need a new system user, the package manager could update /etc/passwd just once.
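The collapsing step can be sketched like this: each package declares the system users it needs, and the package manager applies all declarations in a single pass over the user database. The passwd-style file and entry format below are simplified stand-ins for the real thing.

```shell
# Simplified stand-in for /etc/passwd.
PASSWD=$(mktemp)
echo 'root:x:0:0::/root:/bin/sh' > "$PASSWD"

add_declared_users() {
  # $@ = the union of users declared by all packages in this transaction.
  for u in "$@"; do
    grep -q "^$u:" "$PASSWD" || echo "$u:x:::::" >> "$PASSWD"
  done
}

# Users declared by several packages, applied as one update instead of one
# hook invocation per package (duplicates are naturally deduplicated):
add_declared_users postgres sshd messagebus postgres ntp
```

Because the input is configuration rather than arbitrary code, the package manager can batch, order, and deduplicate these changes — none of which is possible with opaque hooks.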

If the hook is specific to the package, it should be moved into the package contents. This typically means moving the functionality into the program start (or the systemd service file if we are talking about a daemon). If (while?) upstream is not convinced, you can either wrap the program or patch it. Note that this case is relatively rare: I have worked with hundreds of packages and the only package-specific functionality I came across was automatically generating host keys before starting OpenSSH’s sshd(8)³.
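For the sshd host-key case, the hook can become a unit that runs before the daemon starts, roughly along the lines of the Fedora approach. The unit name below is hypothetical, but ssh-keygen -A is OpenSSH’s real flag for generating any missing default host keys:

```ini
[Unit]
Description=Generate missing SSH host keys
ConditionPathExists=!/etc/ssh/ssh_host_ed25519_key
Before=sshd.service

[Service]
Type=oneshot
ExecStart=/usr/bin/ssh-keygen -A

[Install]
WantedBy=multi-user.target
```

This also covers the OS-image case from the previous section for free: keys are generated on the machine’s first boot, not baked into the image at package-installation time.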

There is one exception where moving the hook doesn’t work: packages which modify state outside of the system, such as bootloaders or kernel images.

③ Even that can be moved out of a package-specific hook, as Fedora demonstrates.

Conclusion

Global state modifications performed as part of package installation today use hooks, an overly expressive extension mechanism.

Instead, all modifications should be driven by configuration. This is feasible because there are only a few different kinds of desired state modifications. This makes it possible for package managers to optimize package installation.
