
Planet Debian


Deploying a (simple) docker container system

Wed, 28/02/2018 - 9:54am

When a small platform for shipping containers is needed (nothing like Kubernetes or similar), there are a couple of common things you might want to deploy first.

Usual things that I have to roll out every time I deploy such a platform:

Bootstrapping docker and docker-compose

Most services are built from multiple containers. A useful tool for this is docker-compose, which lets you describe your whole 'application'. So we need to deploy it alongside docker itself.
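
On a Debian-based host, a minimal bootstrap might look like this (a sketch; docker.io and docker-compose are the Debian package names where packaged, and may be too old for your needs, in which case the upstream docker-ce repository is the alternative):

# Install docker and docker-compose from the distribution archive
apt-get update
apt-get install -y docker.io docker-compose

# Sanity check: both tools should report a version
docker --version
docker-compose --version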

Deploying Watchtower

An essential operational task is keeping your container images up to date.

Watchtower is an application that will monitor your running Docker containers and watch for changes to the images that those containers were originally started from. If watchtower detects that an image has changed, it will automatically restart the container using the new image.
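
A hedged sketch of running it (the v2tec/watchtower image name was the upstream one at the time of writing and may have changed since; Watchtower needs the Docker socket to do its work):

# Run watchtower as a container, handing it the Docker socket so it
# can inspect and restart the other containers
docker run -d --name watchtower \
  -v /var/run/docker.sock:/var/run/docker.sock \
  v2tec/watchtower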

Deploying http(s) reverse proxy Træfik

If you want to provide multiple (web) services on ports 80 and 443, you have to think about how to solve this. Usually you would use an http(s) reverse proxy; there are many software implementations available.
The challenging part in such an environment is that services may appear and disappear frequently. (Re)configuration of the proxy service is the gap that needs to be closed.

Træfik (pronounced like traffic) is a modern HTTP reverse proxy and load balancer made to deploy microservices with ease [...] to manage its configuration automatically and dynamically.

Træfik has many interesting features, for example 'Let's Encrypt support (Automatic HTTPS with renewal)'.
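
A minimal way to run it might look like this (a sketch, not a production setup; traefik.toml is assumed to hold your static configuration, such as entrypoints and Let's Encrypt settings):

# Run Træfik on ports 80/443, letting it discover services through the
# Docker socket and read its static configuration from traefik.toml
docker run -d --name traefik \
  -p 80:80 -p 443:443 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v $PWD/traefik.toml:/etc/traefik/traefik.toml \
  traefik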

Jan Wagner https://log.cyconet.org/ Planet - Cyconet Blog

Emacs #1: Ditching a bunch of stuff and moving to Emacs and org-mode

Tue, 27/02/2018 - 11:40pm

I’ll admit it. After over a decade of vim, I’m hooked on Emacs.

I’ve long had this frustration over how to organize things. I’ve followed approaches like GTD and ZTD, but things like email or large files are really hard to organize.

I had been using Asana for tasks, Evernote for notes, Thunderbird for email, a combination of ikiwiki and some other items for a personal knowledge base, and various files in an archive directory on my PC. When my new job added Slack to the mix, that was finally the last straw.

A lot of todo-management tools integrate with email — poorly. When you want to do something like “remind me to reply to this in a week”, a lot of times that’s impossible because the tool doesn’t store the email in a fashion you can easily reply to. And that problem is even worse with Slack.

It was right around then that I stumbled onto Carsten Dominik’s Google Talk on org-mode. Carsten was the author of org-mode, and although the talk is 10 years old, it is still highly relevant.

I’d stumbled across org-mode before, but each time I didn’t really dig in because I had the reaction of “an outliner? But I need a todo list.” Turns out I was missing out. org-mode is all that.

Just what IS Emacs? And org-mode?

Emacs grew up as a text editor. It still is, and that heritage is definitely present throughout. But to say Emacs is an editor would be rather unfair.

Emacs is something more like a platform or a toolkit. Not only do you have source code to it, but the very configuration is a program, and there are hooks all over the place. It’s as if it was super easy to write a Firefox plugin. A couple lines, and boom, behavior changed.

org-mode is very similar. Yes, it’s an outliner, but that’s not really what it is. It’s an information organization platform. Its website says “Your life in plain text: Org mode is for keeping notes, maintaining TODO lists, planning projects, and authoring documents with a fast and effective plain-text system.”

Capturing

If you’ve ever read productivity guides based on GTD, one of the things they stress is effortless capture of items. The idea is that when something pops into your head, get it down into a trusted system quickly so you can get on with what you were doing. org-mode has a capture system for just this. I can press C-c c from anywhere in Emacs, and up pops a spot to type my note. But, critically, automatically embedded in that note is a link back to what I was doing when I pressed C-c c. If I was editing a file, it’ll have a link back to that file and the line I was on. If I was viewing an email, it’ll link back to that email (by Message-Id, no less, so it finds it in any folder). Same for participating in a chat, or even viewing another org-mode entry.

So I can make a note that will remind me in a week to reply to a certain email, and when I click the link in that note, it’ll bring up the email in my mail reader — even if I subsequently archived it out of my inbox.

YES, this is what I was looking for!

The tool suite

Once you’re using org-mode, pretty soon you want to integrate everything with it. There are browser plugins for capturing things from the web. Multiple Emacs mail or news readers integrate with it. ERC (IRC client) does as well. So I found myself switching from Thunderbird and mairix+mutt (for the mail archives) to mu4e, and from xchat+slack to ERC.

And wouldn’t you know it, I liked each of those Emacs-based tools better than the standalone they replaced.

A small side tidbit: I’m using OfflineIMAP again! I even used it with GNUS way back when.

One Emacs process to rule them

I used to use Emacs extensively, way back. Back then, Emacs was a “large” program. (Now my battery status applet literally uses more RAM than Emacs). There was this problem of startup time back then, so there was a way to connect to a running Emacs process.

I like to spawn programs with Mod-p (an xmonad shortcut to a dzen menubar, but Alt-F2 in more traditional DEs would do the trick). It’s convenient to not run several emacsen with this setup, so you don’t run into issues with trying to capture to a file that’s open in another one. The solution is very simple: I created a script, named it em, and put it on my path. All it does is this:


#!/bin/bash
# -c opens a new frame; -a "" starts an Emacs daemon if none is running
# and connects to it; any remaining arguments pass straight through
exec emacsclient -c -a "" "$@"

It creates a new emacs process if one doesn’t already exist; otherwise, it uses what you’ve got. A bonus here: parameters such as -nw work just fine, so it really acts just as if you’d typed emacs at the shell prompt. It’s a suitable setting for EDITOR.
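
For example (a couple of hedged usage sketches; ~/notes.org is just a placeholder file):

# Opens in the terminal, exactly as emacs -nw would
em -nw ~/notes.org

# Safe as the default editor for git, reportbug and friends
export EDITOR=em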

Up next…

I’ll be talking about my use of, and showing off configurations for:

  • org-mode, including syncing between computers, capturing, agenda and todos, files, linking, keywords and tags, various exporting (slideshows), etc.
  • mu4e for email, including multiple accounts, bbdb integration
  • ERC for IRC and IM

You can also see a list of all articles in this series.

John Goerzen http://changelog.complete.org The Changelog

Woman. Not in tech.

Tue, 27/02/2018 - 10:00pm

Thank you, Livia Gabos, for helping me to improve this article by giving me feedback on it.

Before I became an intern with Outreachy, my Twitter bio read: "Woman. Not in tech." Well, if you didn't get the picture, let me explain what that meant.

It all began with a simple request I received almost a year ago:

Hey, do you want to join our [company] event and give a talk about being a woman in tech?

I don't have a job in the tech industry. So, yes, while society does put me in the 'woman' column, I have to admit it's a little hard to give a talk about being 'in tech' when I'm not 'in tech'.

What I can talk about, though, is all the women who are not in tech. The many, many friends I have who come to Women in Tech events and meetings, who reach out to me by e-mail, Twitter or even in person, who are struggling to get into tech.

I can talk about the only other girl in my class who, besides me, managed to get an internship. And how we both only got the position because we had passed a written exam about informatics, instead of going through usual channels such as referrals, CV analysis or interviews.

I can talk about the women who are seen as lazy, or as just not getting the lessons in tech courses, because they don't have the same background or the same amount of time available to study or to do homework at home as their male peers do, since they have to take care of relatives, take care of children, and take care of the housework for their family, most of the time while working one or two jobs just to be able to study.

I can talk about the women and about the mothers who after many years being denied the possibility for a tech career are daring to change paths, but are denied junior positions in favor of younger men who "can be trained on the job" and have "so much more willingness to learn".

I can talk about the women who are seen as uninterested in one or more FLOSS technologies because they don't contribute to said technology, since the men in FLOSS projects have continuously failed to engage and - most importantly - keep them included (but maybe that's just because women lack role models).

Even though there are so many Women in Tech communities in Curitiba, the all-male 'core team' of the local Debian community itself couldn't find a single woman to work with them for the DebConf proposal. Go figure.

I can talk about the many women I met not at tech conferences, but at teachers' conferences, who have way more experience with computers and programming than I do. Women who after years working in the field have given up IT to become teachers, not because it was their lifelong dream, but because they didn't feel comfortable and well-integrated in a male-dominated, full-of-misogynistic-ideals tech industry. Because it was - and is - almost impossible for them to break the glass ceiling.

I can even talk about all the women who are lesbians that a certain community of Women In Tech could not find when they wanted someone to write an article about 'being homosexual in tech' to be published right on Brazil's Lesbian Visibility Day, so they had to go and ask a gay man to talk about his own experience. Well, it seems those women aren't "in tech" either.

Tokenization can be especially apparent when the lone person in a minority group is not only asked to speak for the group, but is consistently asked to speak about being a member of that group. Geek Feminism - Tokenism

The thing is, a lot of people don't want to hear any of those stories. Companies in particular only want token women from outside the company (because, let's face it, most tech companies can't find the talent within) who will come up on stage and inspire other women by saying what a great experience it is to be in tech - and that "everyone should try it too!".

I do believe all women should try and get knowledge about tech and that is what I work towards. We shouldn't have to rely only on the men in our life to get things done with our computers or our cell phones or our digital life.

But to tell other women they should get into the tech industry? I guess not.

After all, who am I to tell other women they should come to tech - and to stay in tech - when I know we are bound to face all this?

Addendum:

For Brazilian women not in tech, I'm organizing a crowdfunding campaign to give at least five of them the opportunity to attend MiniDebConf in Curitiba, Paraná, in April. None of these girls can afford the trip and they don't have a company to sponsor them. If you are willing to help, please get in touch or check this link: Women in MiniDebConf.

More on the subject: Renata https://rsip22.github.io/blog/ Renata's blog

A Nice looking Blog

Thu, 22/02/2018 - 9:00pm

I stumbled across this rather nicely-formatted blog by Alex Beal and thought I'd share it. It's a particular kind of minimalist style that I like, because it puts the content first. It reminds me of Mark Pilgrim's old blog.

I can't remember which post in particular I came across first, but the one that I thought I would share was this remarkably detailed personal research project on tracking mood.

That would have been the end of it, but I then stumbled across this great review of "Type Driven Development with Idris", a book by Edwin Brady. I bought this book during the Christmas break but I haven't had much of a chance to deep dive into it yet.

jmtd http://jmtd.net/log/ Jonathan Dowland's Weblog

Dell PowerEdge T30

Thu, 22/02/2018 - 3:06pm

I just did a Debian install on a Dell PowerEdge T30 for a client. The Dell web site is a bit broken at the moment; it didn’t list the price of that server or give useful specs when I was ordering it. I was under the impression that the server was limited to 8G of RAM; that’s unusually small, but it wouldn’t be the first time a vendor crippled a low-end model to drive sales of more expensive systems. It turned out that the T30 model I got has 4*DDR4 sockets with only one used for an 8G DIMM. It apparently can handle up to 64G of RAM.

It has space for 4*3.5″ SATA disks but only has 4*SATA connectors on the motherboard. As I never use the DVD in a server this isn’t a problem for me, but if you want 4 disks and a DVD then you need to buy a PCI or PCIe SATA card.

Compared to the PowerEdge T130 I’m using at home the new T30 is slightly shorter and thinner while seeming to have more space inside. This is partly due to better design and partly due to having 2 hard drives in the top near the DVD drive which are a little inconvenient to get to. The T130 I have (which isn’t the latest model) has 4*3.5″ SATA drive bays at the bottom which are very convenient for swapping disks.

It has two PCIe*16 slots (one of which is apparently quad speed), one shorter PCIe slot, and a PCI slot. For a cheap server a PCI slot is a nice feature, it means I can use an old PCI Ethernet card instead of buying a PCIe Ethernet card. The T30 cost $1002 so using an old Ethernet card saved 1% of the overall cost.

The T30 seems designed to be more of a workstation or personal server than a straight server. The previous iterations of the low end tower servers from Dell didn’t have built in sound and had PCIe slots that were adequate for a RAID controller but vastly inadequate for video. This one has built in line in and out for audio and has two DisplayPort connectors on the motherboard (presumably for dual-head support). Apart from the CPU (an E3-1225 which is slower than some systems people are throwing out nowadays) the system would be a decent gaming system.

It has lots of USB ports which is handy for a file server, I can attach lots of backup devices. Also most of the ports support “super speed”, I haven’t yet tested out USB devices that support such speeds but I’m looking forward to it. It’s a pity that there are no USB-C ports.

One deficiency of the T30 is the lack of a VGA port. It has one HDMI and two DisplayPort sockets on the motherboard, this is really great for a system on or under your desk, any monitor you would want on your desk will support at least one of those interfaces. But in a server room you tend to have an old VGA monitor that’s there because no-one wants it on their desk. Not supporting VGA may force people to buy a $200 monitor for their server room. That increases the effective cost of the system by 20%. It has a PC serial port on the motherboard which is a nice server feature, but that doesn’t make up for the lack of VGA.

The BIOS configuration has an option displayed for enabling charging devices from USB sockets when a laptop is in sleep mode. It’s disappointing that they didn’t either make a BIOS build for a non-laptop or have the BIOS detect at run-time that it’s not on laptop hardware and hide that.

Conclusion

The PowerEdge T30 is a nice low-end workstation. If you want a system with ECC RAM because you need it to be reliable and you don’t need the greatest performance then it will do very well. It has Intel video on the motherboard with HDMI and DisplayPort connectors, this won’t be the fastest video but should do for most workstation tasks. It has a PCIe*16 quad speed slot in case you want to install a really fast video card. The CPU is slow by today’s standards, but Dell sells plenty of tower systems that support faster CPUs.

It’s nice that it has a serial port on the motherboard. That could be used for a serial console or could be used to talk to a UPS or other server-room equipment. But that doesn’t make up for the lack of VGA support IMHO.

One could say that a tower system is designed to be a desktop or desk-side system not run in any sort of server room. However it is cheaper than any rack mounted systems from Dell so it will be deployed in lots of small businesses that have one server for everything – I will probably install them in several other small businesses this year. Also tower servers do end up being deployed in server rooms, all it takes is a small business moving to a serviced office that has a proper server room and the old tower servers end up in a rack.

Rack vs Tower

One reason for small businesses to use tower servers when rack servers are more appropriate is the issue of noise. If your “server room” is the room that has your printer and fax then it typically won’t have a door and you just can’t have the noise of a rack mounted server in there. 1RU systems are inherently noisy because the small diameter of the fans means that they have to spin fast. 2RU systems can be made relatively quiet if you don’t have high-end CPUs but no-one seems to be trying to do that.

I think it would be nice if a company like Dell sold low-end servers in a rack mount form-factor (19 inches wide and 2RU high) that were designed to be relatively quiet. Then instead of starting with a tower server and ending up with tower systems in racks a small business could start with a 19 inch wide system on a shelf that gets bolted into a rack if they move into a better office. Any laptop CPU from the last 10 years is capable of running a file server with 8 disks in a ZFS array. Any modern laptop CPU is capable of running a file server with 8 SSDs in a ZFS array. This wouldn’t be difficult to design.

etbe https://etbe.coker.com.au etbe – Russell Coker

How to use the EventCalendar ical

Wed, 21/02/2018 - 11:49pm

Hello!

If you follow this blog, you should probably know by now that I have been working with my mentors to contribute to the MoinMoin EventCalendar macro, adding the possibility to export the events' data to an icalendar file.

The code (which can be found on this Github repository) isn't quite ready yet, because I'm still working to convert the recurrence rule to the icalendar format, but other than that, it should be working. Hopefully.

This guide assumes that you have the EventCalendar macro installed on the wiki and that the macro is called on a determined wikipage.

The icalendar file is now generated as an attachment the moment the macro is loaded. I created an "ical" link at the bottom of the calendar. When activated, this link prompts the download of the ical attachment of the page. Being an attachment, it's still possible to simply view the ical file using the "attachment" menu if the user wishes to do so.

There are two ways of importing this calendar on Thunderbird. The first one is to download the file by clicking on the link and then proceeding to import it manually to Thunderbird.
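
Since the ical file is a regular MoinMoin attachment, the download can also be scripted; the wiki, page and attachment names below are placeholders:

# Fetch the generated ical attachment from the wiki page
curl -o events.ics \
  'https://wiki.example.org/EventPage?action=AttachFile&do=get&target=eventcalendar.ics'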

The second option is to "Create a new calendar / On the network" and to use the URL address from the ical link as the "location", as it is shown below:

As usual, it's possible to customize the name for the calendar, the color for the events and such...

I noticed a few Wikis that use the EventCalendar, such as the Debian wiki itself and the FSFE wiki. The Python wiki also seems to be using MoinMoin and EventCalendar, but it seems that they use a Google service to export the event data to iCal.

If you read this and are willing to try the code in your wiki and give me feedback, I would really appreciate it. You can find the ways to contact me in my Debian Wiki profile.

Renata https://rsip22.github.io/blog/ Renata's blog

Getting Debian booting on a Lenovo Yoga 720

Wed, 21/02/2018 - 10:46pm

I recently got a new work laptop, a 13” Yoga 720. It proved difficult to install Debian on; pressing F12 would get a boot menu allowing me to select a USB stick I have EFI GRUB on, but after GRUB loaded the kernel and the initrd it would just sit there never outputting anything else that indicated the kernel was even starting. I found instructions about Ubuntu 17.10 which helped but weren’t the complete picture. What seems to be the situation is that the kernel won’t happily boot if “Legacy Support” is not enabled - enabling this (and still booting as EFI) results in a happier experience. However in order to be able to enable legacy boot you have to switch the SATA controller from RAID to AHCI, which can cause Windows to get unhappy about its boot device going away unless you warn it first.

  • Fire up an admin shell in Windows (right click on the start menu)
  • bcdedit /set safeboot minimal
  • Reboot into the BIOS
  • Change the SATA Controller mode from RAID to AHCI (dire warnings about “All data will be erased”. It’s not true, but you’ve backed up first, right?). Set “Boot Mode” to “Legacy Support”.
  • Save changes and let Windows boot to Safe Mode
  • Fire up an admin shell in Windows (right click on the start menu again)
  • bcdedit /deletevalue safeboot
  • Reboot again and Windows will load in normal mode with the AHCI drivers

Additionally I had problems getting the GRUB entry added to the BIOS; efibootmgr shows it fine but it never appears in the BIOS boot list. I ended up using Windows to add it as the primary boot option using the following (<guid> gets replaced with whatever the new “Debian” section guid is):

bcdedit /enum firmware
bcdedit /copy "{bootmgr}" /d "Debian"
bcdedit /set "{<guid>}" path \EFI\Debian\grubx64.efi
bcdedit /set "{fwbootmgr}" displayorder "{<guid>}" /addfirst

Even with that at one point the BIOS managed to “forget” about the GRUB entry and require me to re-do the final “displayorder” command.

Once you actually have the thing installed and booting it seems fine - I’m running Buster due to the fact it’s a Skylake machine with lots of bits that seem to want a newer kernel, but claimed battery life is impressive, the screen is very shiny (though sometimes a little too shiny and reflective) and the NVMe SSD seems pretty nippy as you’d expect.

Jonathan McDowell https://www.earth.li/~noodles/blog/ Noodles' Emptiness

How hard can typing æ, ø and å be?

Wed, 21/02/2018 - 5:14pm

Petter Reinholdtsen: How hard can æ, ø and å be? comments on the rubbish state of till printers and their mishandling of foreign characters.

Last week, I was trying to type an email, on a tablet, in Dutch. The tablet was running something close to Android and I was using a Bluetooth keyboard, which seemed to be configured correctly for my location in England.

Dutch doesn’t even have many accents. I wanted an e acute (é). If you use the on screen keyboard, this is actually pretty easy, just press and hold e and slide to choose the accented one… but holding e on a Bluetooth keyboard? eeeeeeeeeee!

Some guides suggest Alt and e, then e. Apparently that works, but not on keyboards set to Great British… because, I guess, we don’t want any of that foreign muck since the Brexit vote, or something(!)

Even once you figure out that madness and switch the keyboard back to international, which also enables alt i, u, n and so on to do other accents, I can’t find grave, check, breve or several other accents. I managed to send the emails in Dutch but I’d struggle with various other languages.

Have I missed a trick or what are the Android developers thinking? Why isn’t there a Compose key by default? Is there any way to get one?

mjr http://www.news.software.coop mjr – Software Cooperative News

Tools of Love

Wed, 21/02/2018 - 2:00am
From my spiritual blog

I have been quiet lately. My life has been filled with gentle happiness, work, and less gentle wedding planning. How do you write about quiet happiness without sounding like the least contemplative aspects of Facebook? How do I share this part of the journey in a way that others can learn from? I was offering thanks the other day and was reminded of one of my early experiences at Fires of Venus. Someone was talking about how they were there working to do the spiritual work they needed in order to achieve their dream of opening a restaurant. I'll admit that when I thought of going to a multi-day retreat focused on spiritual connection to love, opening a restaurant had not been at the forefront of my mind. And yet, this was their dream, and surely dreams are the stuff of love. As they continued, they talked about finding self love deep enough to have the confidence to believe in dreams.



As I recalled this experience, I offered thanks for all the tools I've found to use as a lover. Every time I approach something with joy and awe, I gain new insight into the beauty of the world around us. In my work within the IETF I saw the beauty of the digital world we're working to create. Standing on sacred land, I can find the joy and love of nature and the moment.



I can share the joy I find and offer it to others. I've been mentoring someone at work. They're at a point where they're appreciating some of the great mysteries of computing like “Reflections on Trusting Trust” or two's complement arithmetic. I’ve had the pleasure of watching their moments of discovery and also helping them understand the complex history in how we’ve built the digital world we have. Each moment of delight reinforces the idea that we live in a world where we expect to find this beauty and connect with it. Each experience reinforces the idea that we live in a world filled with things to love.



And so, I’ve turned even my experiences as a programmer into tools for teaching love and joy. I’ve been learning another new tool lately. I’ve been putting together the dance mix for my wedding. Between that and a project last year, I’ve learned a lot about music. I will never be a professional DJ or song producer. However, I have always found joy in music and dance, and I absolutely can be good enough to share that with my friends. I can be good enough to let music and rhythm be tools I use to tell stories and share joy. In learning skills and improving my ability with music, I better appreciate the music I hear.



The same is true with writing: both my work here and my fiction. I’m busy enough with other things that I am unlikely to even attempt writing as my livelihood. Even so, I have more tools for sharing the love I find and helping people find the love and joy in their world.



These are all just tools. Words and song won’t suddenly bring us all together any more than physical affection and our bodies. However, words, song, and the joy we find in each other and in the world we build can help us find connection and empathy. We can learn to see the love that is there between us. All these tools can help us be vulnerable and open together. And that—the changes we work within ourselves using these tools—can bring us to a path of love. And so how do I write about happiness? I give thanks for the things it allows me to explore. I find value in growing and trying new things. In my best moments, each seems a lens through which I can grow as a lover as I walk Venus’s path.


Sam Hartman https://hartmans.livejournal.com/ Ponderings

hartmans @ 2018-02-20T19:46:00

Wed, 21/02/2018 - 1:46am
Entry reposted from https://hartmans-venus.livejournal.com/22265.html. Sam Hartman https://hartmans.livejournal.com/ Ponderings

Reproducible Builds: Weekly report #147

Tue, 20/02/2018 - 9:56pm

Here's what happened in the Reproducible Builds effort between Sunday February 11 and Saturday February 17 2018:

Media coverage

Reproducible work in other projects

Packages reviewed and fixed, and bugs filed

Various previous patches were merged upstream:

Reviews of unreproducible packages

38 package reviews have been added, 27 have been updated and 13 have been removed this week, adding to our knowledge about identified issues.

4 issue types have been added:

One issue type has been updated:

Weekly QA work

During our reproducibility testing, FTBFS bugs have been detected and reported by:

  • Adrian Bunk (24)
  • Boyuan Yang (1)
  • Cédric Boutillier (1)
  • Jeremy Bicha (1)
  • Matthias Klose (1)
diffoscope development
  • Chris Lamb:
    • Add support for comparing Berkeley DB files (#890528). This is currently incomplete because the Berkeley DB libraries do not return the same uid/hash reliably (it returns "random" memory contents) so we must strip those from the human-readable output.
Website development

Misc.

This week's edition was written by Chris Lamb and Holger Levsen & reviewed by a bunch of Reproducible Builds folks on IRC & the mailing lists.

Reproducible builds folks https://reproducible.alioth.debian.org/blog/ Reproducible builds blog

Lookalikes

Tue, 20/02/2018 - 8:45pm

Did I forget a period of my life when I grew a horseshoe mustache and dreadlocks, walked around topless, and illustrated this 2009 article in the Economist on the economic boon that hippy festivals represent to rural American communities?

Previous lookalikes are here.

Benjamin Mako Hill https://mako.cc/copyrighteous copyrighteous

Time to Join Extended Long Term Support for Debian 7 Wheezy

Tue, 20/02/2018 - 5:57pm

Debian 7 Wheezy's LTS period ends on May 31st, and some companies have asked Freexian whether they could get security support past this date. Since about half of the current team of paid LTS contributors is willing to continue providing security updates for Wheezy, I have started to work on making this possible.

I just initiated a discussion on debian-devel with multiple Debian teams to see whether it is possible to continue to use debian.org infrastructure to host the wheezy security updates that would be prepared in this extended LTS period.

From the sponsor side, this extended LTS will not work like the regular LTS. It is unrealistic to continue to support all packages and all architectures, so only the packages/architectures requested by sponsors will be supported. The amount invoiced to each sponsor will be directly related to the package list that they ask us to support. We made an estimation (based on history) of how much it costs to support each package and we split that cost between all the sponsors requesting support for that package. That cost is re-evaluated quarterly and will likely increase over time as sponsors stop their support (when they have finished migrating all their machines, for example).

This extended LTS will also have some restrictions in terms of the packages we can support. For instance, we will no longer support the Linux kernel from wheezy; you will have to switch to the kernel used in jessie (or maybe we will maintain a backport ourselves in wheezy). It is also not yet clear whether we can support OpenJDK, since upstream support of version 7 stops at the end of June. And switching to OpenJDK 8 is likely non-trivial. There are likely other unsupportable packages too.

Anyway, if your company needs wheezy security support past the end of May, now is the time to worry about it. Please send us a mail with the list of source packages that you would like to see supported. The more companies get involved, the less it will cost to each of them. Our plan is to gather the required data from interested companies in the next few weeks and make a first estimation of the price they will have to pay for the first quarter by mid-March. They then confirm that they are OK with the offer, and we will send invoices in April so that they can be paid before the end of May.

Note however that we decided that it would not be possible to sponsor extended wheezy support (and thus influence which packages are supported) if you are not among the regular LTS sponsors (at bronze level at least). Extended LTS would not be possible without the regular LTS so if you need the former, you have to support the latter too.


Raphaël Hertzog https://raphaelhertzog.com apt-get install debian-wizard

Weblate 2.19.1

Tue, 20/02/2018 - 3:00pm

Weblate 2.19.1 has been released today. This is a bugfix-only release, mostly to fix a problematic migration from 2.18 which some users have observed.

Full list of changes:

  • Fixed migration issue on upgrade from 2.18.
  • Improved file upload API validation.

If you are upgrading from older version, please follow our upgrading instructions.

You can find more information about Weblate on https://weblate.org, and the code is hosted on GitHub. If you are curious how it looks, you can try it out on the demo server. Weblate is also being used on https://hosted.weblate.org/ as the official translating service for phpMyAdmin, OsmAnd, Turris, FreedomBox, Weblate itself and many other projects.

Should you be looking for hosting of translations for your project, I'm happy to host them for you or help with setting it up on your infrastructure.

Further development of Weblate would not be possible without people providing donations; thanks to everybody who has helped so far! The roadmap for the next release is just being prepared; you can influence this by expressing support for individual issues, either by comments or by providing a bounty for them.


Michal Čihař https://blog.cihar.com/archives/debian/ Michal Čihař's Weblog, posts tagged by Debian

Listing and loading of Debian repositories: now live on Software Heritage

Tue, 20/02/2018 - 2:52pm

Software Heritage is the project on which I’ve been working for the past two and a half years now. The grand vision of the project is to build the universal software archive, which will collect, preserve and share the Software Commons.

Today, we’ve announced that Software Heritage is archiving the contents of Debian daily. I’m reposting this article on my blog as it will probably be of interest to readers of Planet Debian.

TL;DR: Software Heritage now archives all source packages of Debian as well as its security archive daily. Everything is ready for archival of other Debian derivatives as well. Keep on reading to get details of the work that made this possible.

History

When we first announced Software Heritage, back in 2016, we had archived the historical contents of Debian as present on the snapshot.debian.org service, as a one-shot proof of concept import.

This code was then left in a drawer and never touched again, until last summer when Sushant came to do an internship with us. We had the opportunity to rework the code that was originally written, and to make it more generic: instead of being specific to snapshot.debian.org, the code can now work with any Debian repository. This means that we can now archive any of the numerous Debian derivatives that are available out there.

This has been live for a few months, and you can find Debian package origins in the Software Heritage archive now.

Mapping a Debian repository to Software Heritage

The main challenge in listing and saving Debian source packages in Software Heritage is mapping the content of the repository to the generic source history data model we use for our archive.

Organization of a Debian repository

Before we start looking at a bunch of unpacked Debian source packages, we need to know how a Debian repository is actually organized.

At the top level of a Debian repository lies a set of suites, representing versions of the distribution, that is to say sets of packages that have been tested and are known to work together. For instance, Debian currently has 6 active suites, from wheezy (“old old stable” version) all the way up to experimental; Ubuntu has 8, from precise (12.04 LTS) up to bionic (the future 18.04 release), as well as a devel suite. Each of those suites also has a bunch of “overlay” suites, such as backports, which are made available in the archive alongside full suites.

Under the suites, there’s another level of subdivision, which Debian calls components, and Ubuntu calls areas. Debian uses its components to segregate packages along licensing terms (main, contrib and non-free), while Ubuntu uses its areas to denote the level of support of the packages (main, universe, multiverse, …).

Finally, components contain source packages, which merge upstream sources with distribution-specific patches, as well as machine-readable instructions on how to build the package.
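
This layout is visible on any standard Debian mirror; for instance, you can peek at the source package index of one suite/component (suite and component here chosen purely as examples):

# List the first few entries of the stretch/main Sources index
curl -s http://deb.debian.org/debian/dists/stretch/main/source/Sources.xz | xzcat | head -n 20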

Organization of the Software Heritage archive

The Software Heritage archive is project-centric rather than version-centric. What this means is that we are interested in keeping the history of what was available in software origins. An origin can be thought of as the URL of a repository containing software artifacts, tagged with a type representing the means of access to the repository.

For instance, the origin for the GitHub mirror of the Linux kernel repository has the following data:

For each visit of an origin, we take a snapshot of all the branches (and tagged versions) of the project that were visible during that visit, complete with their full history. See for instance one of the latest visits of the Linux kernel. For the specific case of GitHub, pull requests are also visible as virtual branches, so we fetch those as well (as branches named refs/pull/<pull request number>/head).

Bringing them together

As we’ve seen, Debian archives (just as well as archives for other “traditional” Linux distributions) are release-centric rather than package-centric. Mapping distributions to the Software Heritage archive therefore takes a little bit of gymnastics to transpose the list of source packages available in each suite into a list of available versions per source package. We do this step by step:

  1. Download the Sources indices for all the suites and components known in the Debian repository
  2. Parse the Sources indices, listing all source packages inside
  3. For each source package, tell the Debian loader to load all the available versions (grouped by name), generating a complete snapshot of the state of the source package across the Debian repository

The source packages are mapped to origins using the following format:

  • type: deb
  • url: deb://<repository name>/packages/<source package name> (e.g. deb://Debian/packages/linux)

We use a repository name rather than the actual URL to a repository so that links can persist even if a given mirror disappears.

Loading Debian source packages

To load Debian source packages into the Software Heritage archive, we have to convert them: Debian-based distributions distribute source packages as a set of files, a dsc (Debian Source Control) and a set of tarballs (usually, an upstream tarball and a Debian-specific overlay). On the other hand, Software Heritage only stores version-control information such as revisions, directories, files.

Unpacking the source packages

Our philosophy at Software Heritage is to store the source code of software in the precise form that allows a developer to start working on it. For Debian source packages, this is the unpacked source code tree, with all patches applied. After checking that the files we have downloaded match the checksums published in the index files, we simply use dpkg-source -x to extract the source package, with patches applied, ready to build. This also means that we currently fail to import packages that don’t extract with the version of dpkg-source available in Debian Stretch.
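
As a concrete example (the package and version number here are just illustrative):

# Verify and unpack a source package with all patches applied, given
# the .dsc and its accompanying tarballs in the current directory;
# the result is a tree that is ready to build
dpkg-source -x glibc_2.24-10.dsc
cd glibc-2.24/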

Generating a synthetic revision

After walking the extracted source package tree and computing identifiers for all its contents, we get the identifier of the top-level tree, which we will reference in the synthetic revision.

The synthetic revision contains the “reproducible” metadata that is completely intrinsic to the Debian source package. With the current implementation, this means:

  • the author of the package, and the date of modification, as referenced in the last entry of the source package changelog (referenced as author and committer)
  • the original artifact (i.e. the information about the original source package)
  • basic information about the history of the package (using the parsed changelog)

However, we never set parent revisions in the synthetic commits, for two reasons:

  • there is no guarantee that packages referenced in the changelog have been uploaded to the distribution, or imported by Software Heritage (our update frequency is lower than that of the Debian archive)
  • even if this guarantee existed, and all versions of all packages were available in Software Heritage, there would be no guarantee that the version referenced in the changelog is indeed the version we imported in the first place

This makes the information stored in the synthetic revision fully intrinsic to the source package, and reproducible. In turn, this allows us to keep a cache, mapping the original artifacts to synthetic revision ids, to avoid loading packages again once we have loaded them once.

Storing the snapshot

Finally, we can generate the top-level object in the Software Heritage archive, the snapshot. For instance, you can see the snapshot for the latest visit of the glibc package.

To do so, we generate a list of branches by concatenating the suite, the component, and the version number of each detected source package (e.g. stretch/main/2.24-10 for version 2.24-10 of the glibc package available in stretch/main). We then point each branch to the synthetic revision that was generated when loading the package version.

In case a version of a package fails to load (for instance, if the package version disappeared from the mirror between the moment we listed the distribution, and the moment we could load the package), we still register the branch name, but we make it a “null” pointer.

There are still some improvements to make to the lister specific to Debian repositories: it currently hardcodes the list of components/areas in the distribution, as the repository format provides no programmatic way of eliciting them. Currently, only Debian and its security repository are listed.

Looking forward

We believe that the model we developed for the Debian use case is generic enough to capture not only Debian-based distributions, but also RPM-based ones such as Fedora, Mageia, etc. With some extra work, it should also be possible to adapt it for language-centric package repositories such as CPAN, PyPI or Crates.

Software Heritage is now well on the way to providing the foundations for a generic and unified source browser for the history of traditional package-based distributions.

We’ll be delighted to welcome contributors that want to lend a hand to get there.

olasd https://blog.olasd.eu english – olasd's corner of the 'tubes

Hacking at EPFL Toastmasters, Lausanne, tonight

Tue, 20/02/2018 - 12:39pm

As mentioned in my earlier blog post, I'm giving a talk about Hacking at the Toastmasters club at EPFL tonight. Please feel free to join us, and remember to turn off your mobile device or leave it at home; you never know when it might ring or become part of a demonstration.

Daniel.Pocock https://danielpocock.com/tags/debian DanielPocock.com - debian

Freexian’s report about Debian Long Term Support, January 2018

Mon, 19/02/2018 - 6:18pm

Like each month, here comes a report about the work of paid contributors to Debian LTS.

Individual reports

In January, about 160 work hours have been dispatched among 11 paid contributors. Their reports are available:

Evolution of the situation

The number of sponsored hours increased slightly to 187 hours per month. It would be nice if the slow growth could continue, as the amount of work seems to be slowly growing too.

The security tracker currently lists 23 packages with a known CVE and the dla-needed.txt file lists 23 as well. The number of open issues seems to be stable compared to last month, which is a good sign.

Thanks to our sponsors

New sponsors are in bold.


Raphaël Hertzog https://raphaelhertzog.com apt-get install debian-wizard

How we care for our child

Mon, 19/02/2018 - 11:15am

This post is a departure from the regular content, which is supposed to be "Debian and Free Software", but has accidentally turned into a hardware blog recently!

Anyway, we have a child who is now about 14 months old. The way that my wife and I care for him seems logical to us, but often amuses local people. So in the spirit of sharing this is what we do:

  • We divide the day into chunks of time.
  • At any given time one of us is solely responsible for him.
    • The other parent might be nearby, and might help a little.
    • But there is always a designated person who will be changing nappies, feeding, and playing at any given point in the day.
  • The end.

So our weekend routine, covering Saturday and Sunday, looks like this:

  • 07:00-08:00: Husband
  • 08:01-13:00: Wife
  • 13:01-17:00: Husband
  • 17:01-18:00: Wife
  • 18:01-19:30: Husband

Our child, Oiva, seems happy enough with this and he sometimes starts walking from one parent to the other at the appropriate time. But the real benefit is that each of us gets some time off - in my case I get "the morning" off, and my wife gets the afternoon off. We can hide in our bedroom, go shopping, eat cake, or do anything we like.

Week-days are similar, but with the caveat that we both have jobs. I take the morning, and the evenings, and in exchange if he wakes up overnight my wife helps him sleep and settle between 8PM-5AM, and if he wakes up later than 5AM I deal with him.

Most of the time our child sleeps through the night, but if he does wake up it tends to be in the 4:30AM/5AM timeframe. I'm "happy" to wake up at 5AM and stay up until I go to work because I'm a morning person and I tend to go to bed early these days.

Day-care is currently a complex process. There are three families with small children, and ourselves. Each day of the week one family hosts all the children, and the baby-sitter arrives there too (all the families live within a few blocks of each other).

All of the parents go to work, leaving one carer in charge of 4 babies for the day, from 08:15-16:15. On the days when we're hosting the children I greet the carer then go to work; on the days the children are at a different family's house I take him there in the morning, on my way to work, and then my wife collects him in the evening.

At the moment things are a bit terrible because most of the children have been a bit sick, and the carer too. When a single child is sick it's mostly OK, unless it's the child whose family is supposed to be the host-venue. If that child is sick we have to panic and pick another house for that day.

Unfortunately if the child-carer is sick then everybody is screwed, and one parent has to stay home from each family. I guess this is the downside compared to sending the children to public-daycare.

This is private day-care, Finnish-style. The social-services (kela) will reimburse each family €700/month if you're in such a scheme, and carers are limited to a maximum of 4 children. The net result is that prices are stable, averaging €900-€1000 per-child, per month.

(The €700 is refunded after a month or two, so in real terms people like us pay €200-€300/month for Monday-Friday day-care, plus a bit of bureaucracy over deciding which family is hosting and which parents are providing food. With the size being capped and the fees being pretty standard, the carers earn €3600-€4000/month, which is a good amount. To be a school-teacher you need to be very qualified, but to do this caring is much simpler. It turns out that being an English-speaker can be a bonus too, for some families ;)

Currently our carer has a sick-note for three days, so I'm staying home today, and will likely stay tomorrow too. Then my wife will skip work on Wednesday. (We usually take it in turns but sometimes that can't happen easily.)

But all of this is due to change in the near future, because we've had too many sick days, and both of us have missed too much work.

More news on that in the future, unless I forget.

Steve Kemp https://blog.steve.fi/ Steve Kemp's Blog

SwissPost putting another nail in the coffin of Swiss sovereignty

Sun, 18/02/2018 - 11:17pm

A few people have recently asked me about the SwissID, as SwissPost has just been sending spam emails out to people telling them "Link your Swiss Post user account to SwissID".

This coercive new application of technology demands users' email addresses and mobile phone numbers "for security". A web site coercing people to use text messages "for security" has quickly become a red flag for most people, and many blogs have already covered why it is only an illusion of security, putting your phone account at risk so companies can profit from another vector for snooping on you.

SwissID is not the only digital identity solution in Switzerland but as it is run by SwissPost and has a name similar to another service it is becoming very well known.

In 2010 they began offering a solution which they call SuisseID (notice the difference? They are pronounced the same way.) based on digital certificates and compliant with Swiss legislation. Public discussion focussed on the obscene cost with little comment about the privacy consequences and what this means for Switzerland as a nation.

Digital certificates often embed an email address in the certificate.
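
You can check this yourself on any certificate file you hold (cert.pem is a placeholder name):

# Dump the certificate as text and look for an email address in the
# Subject or in the Subject Alternative Name extension
openssl x509 -in cert.pem -noout -text | grep -E -A1 'Subject( Alternative Name)?:'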

With SwissID, however, they have a web site that looks like little more than vaporware, giving no details at all about whether certificates are used. It appears they are basically promoting an app designed to harvest the email addresses and phone numbers of any Swiss people who install it, lulling them into that folly by using a name that looks like their original SuisseID. If it looks like phishing, if it feels like phishing, and if it smells like phishing when any expert takes a brief sniff of their FAQ, then what else is it?

The thing is, the original SuisseID runs on a standalone smartcard so it doesn't need to have your mobile phone number, have permissions to all the data in your phone and be limited to working in areas with mobile phone signal.

The emails currently being sent by SwissPost tell people they must "Please use a private e-mail address for this purpose" but they don't give any information about the privacy consequences of creating such an account or what their app will do when it has access to read all the messages and contacts in your phone.

The actions you can take that they didn't tell you about
  • You can post a registered letter to SwissPost and tell them that for privacy reasons, you are immediately retracting the email addresses and mobile phone numbers they currently hold on file and that you are exercising your right not to give an email address or mobile phone number to them in future.
  • If you do decide you want a SwissID, create a unique email address for it and only use that email address with SwissPost so that it can't be cross-referenced with other companies. This email address is also like a canary in a coal mine: if you start receiving spam on that email address then you know SwissPost/SwissID may have been hacked or the data has been leaked or sold.
  • Don't install their app and if you did, remove it and you may want to change your mobile phone number.

Oddly enough, none of these privacy-protecting ideas were suggested in the email from SwissPost. Whose side are they on?

Why should people be concerned?

SwissPost, like every postal agency, has seen traditional revenues drop and so they seek to generate more revenue from direct marketing and they are constantly looking for ways to extract and profit from data about the public. They are also a huge company with many employees: when dealing with vast amounts of data in any computer system, it only takes one employee to compromise everything: just think of how Edward Snowden was able to act alone to extract many of the NSA's most valuable secrets.

SwissPost is going to great lengths to get accurate data on every citizen and resident in Switzerland, including deploying an app to get your mobile phone number and demanding an email address when you use their web site. That also allows them to cross-reference with your IP addresses.

  • Any person or organization who has your email address or mobile number may find it easier to get your home address.
  • Any person or organization who has your home address may be able to get your email address or mobile phone number.
  • When you call a company from your mobile phone and their system recognizes your phone number, it becomes easier for them to match it to your home address.
  • If SwissPost and the SBB successfully convince a lot of people to use a SwissID, some other large web sites may refuse to allow access without getting you to link them to your SwissID and all the data behind it too. Think of how many websites already try to coerce you to give them your mobile phone number and birthday to "secure" your account, but worse.

The Google factor

The creepiest thing is that over seventy percent of people in Switzerland are apparently using Gmail addresses, and these will be a dependency of their registration for SwissID.

Given that SwissID is being promoted as a solution compliant with ZertES legislation that can act as an interface between citizens and the state, the intersection with such a powerful foreign actor as Gmail is extraordinary. For example, if people are registering to vote in Switzerland's renowned referendums and their communication is under the surveillance of a foreign power like the US, that is a mockery of democracy and it makes the allegations of Russian election hacking look like child's play.

Switzerland's referendums, decentralized system of Government, part-time army and privacy regime are all features that maintain a balance between citizen and state: by centralizing power in the hands of SwissID and foreign IT companies, doesn't it appear that the very name SwissID is a mockery of the Swiss identity?

No canaries were harmed in the production of this blog.

Daniel.Pocock https://danielpocock.com/tags/debian DanielPocock.com - debian

Vnlog!

Sun, 18/02/2018 - 12:58pm

In the last few jobs I've worked at I ended up writing a tool to store data in a nice format, and to be able to manipulate it easily. I'd rewrite this from scratch each time partly because I was never satisfied with the previous version. Each iteration was an improvement on the previous one, and this version is the good one. I wrote it at NASA/JPL, went through the release process (this thing was called asciilog then), added a few more features, and I'm now releasing it. The toolkit lives here and here's the initial README:

Summary

Vnlog (pronounced "vanillog") is a trivially-simple log format:

  • A whitespace-separated table of ASCII human-readable text
  • Lines beginning with # are comments
  • The first line that begins with a single # (not ## or #!) is a legend, naming each column

Example:

#!/usr/bin/whatever
# a b c
1 2 3
## another comment
4 5 6

Such data works very nicely with normal UNIX tools (awk, sort, join), can be easily read by fancier tools (numpy, matlab (yuck), excel (yuck), etc), and can be plotted with feedgnuplot. This toolkit provides some tools to manipulate vnlog data and a few libraries to read/write it. The core philosophy is to keep everything as simple and light as possible, and to provide methods to enable existing (and familiar!) tools and workflows to be utilized in nicer ways.
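Since a vnlog is nothing but whitespace-separated text, familiar UNIX tools can consume it directly. Here's a tiny sketch with an invented data file, using plain awk (the vnl-* wrappers described below do the same job while handling the legend and column names for you):

$ cat height.vnl
# name height
alice 170
bob 182
carol 158

$ < height.vnl awk '!/^#/ && $2 > 160 {print $1}'
alice
bob

The !/^#/ clause skips the comment and legend lines; everything else is just columns.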

Synopsis

In one terminal, sample the CPU temperature over time, and write the data to a file as it comes in, at 1Hz:

$ ( echo '# time temp1 temp2 temp3'; while true; do echo -n "`date +%s` "; < /proc/acpi/ibm/thermal awk '{print $2,$3,$4; fflush()}'; sleep 1; done ) > /tmp/temperature.vnl

In another terminal, I sample the consumption of CPU resources, and log that to a file:

$ (echo "# user system nice idle waiting hardware_interrupt software_interrupt stolen"; top -b -d1 | awk '/%Cpu/ {print $2,$4,$6,$8,$10,$12,$14,$16; fflush()}') > /tmp/cpu.vnl

These logs are now accumulating, and I can do stuff with them. The legend and the last few measurements:

$ vnl-tail /tmp/temperature.vnl
# time temp1 temp2 temp3
1517986631 44 38 34
1517986632 44 38 34
1517986633 44 38 34
1517986634 44 38 35
1517986635 44 38 35
1517986636 44 38 35
1517986637 44 38 35
1517986638 44 38 35
1517986639 44 38 35
1517986640 44 38 34

I grab just the first temperature sensor, and align the columns:

$ < /tmp/temperature.vnl vnl-tail | vnl-filter -p time,temp=temp1 | vnl-align
# time     temp
1517986746 45
1517986747 45
1517986748 46
1517986749 46
1517986750 46
1517986751 46
1517986752 46
1517986753 45
1517986754 45
1517986755 45

I do the same, but read the log data in realtime, and feed it to a plotting tool to get a live report of the CPU temperature. This plot updates as data comes in. I then spin a CPU core (while true; do true; done), and see the temperature climb. Here I'm making an ASCII plot that's pasteable into the docs.

$ < /tmp/temperature.vnl vnl-tail -f | vnl-filter --unbuffered -p time,temp=temp1 | feedgnuplot --stream --domain --lines --timefmt '%s' --set 'format x "%M:%S"' --ymin 40 --unset grid --terminal 'dumb 80,40'

[ASCII plot: CPU temperature (40-70 on the y axis) against time (MM:SS on the x axis); the trace climbs past 65 as the CPU core spins, then settles back toward 45]

Cool. I can then join the logs, pull out simultaneous CPU consumption and temperature numbers, and plot the path in the temperature-cpu space:

$ vnl-join -j time /tmp/temperature.vnl /tmp/cpu.vnl | vnl-filter -p temp1,user | feedgnuplot --domain --lines --unset grid --terminal 'dumb 80,40'

[ASCII plot: user CPU consumption (0-45 on the y axis) against temperature (40-70 on the x axis), tracing the path through the temperature-cpu space]

Description

As stated before, vnlog tools are designed to be very simple and light. Other, similar tools exist; those all provide facilities to run various analyses, and are neither simple nor light. Vnlog by contrast doesn't analyze anything, but makes it easy to write simple bits of awk or perl to process stuff to your heart's content. The main envisioned use case is one-liners, and the tools are geared for that purpose. The similar tools are much more powerful than vnlog, so they could be a better fit for some use cases.

In the spirit of doing as little as possible, these are wrappers around tools you already have and are familiar with. The provided tools are:

  • vnl-filter is a tool to select a subset of the rows/columns in a vnlog and/or to manipulate the contents. This is effectively an awk wrapper where the fields can be referenced by name instead of index. 20-second tutorial:
vnl-filter -p col1,col2,colx=col3+col4 'col5 > 10' --has col6

will read the input, and produce a vnlog with 3 columns: col1 and col2 from the input, and a column colx that's the sum of col3 and col4 in the input. Only those rows for which col5 > 10 is true will be output. Finally, only those rows that have a non-null value for col6 will be selected. A null entry is signified by a single - character.

vnl-filter --eval '{s += x} END {print s}'

will evaluate the given awk program on the input, but the column names work as you would hope they do: if the input has a column named x, this would produce the sum of all values in this column. (Worked examples of both invocations appear after this list.)

  • vnl-sort, vnl-join, vnl-tail are wrappers around the corresponding GNU Coreutils tools. These also work exactly as you would expect: the columns can be referenced by name, and the legend comment is handled properly. These are wrappers, so all the commandline options those tools have "just work" (except options that don't make sense in the context of vnlog). As an example, vnl-tail -f will follow a log: data will be read by vnl-tail as it is written into the log (just like tail -f, but handling the legend properly). And you already know how to use these tools without even reading the manpages! Note: these were written for and have been tested with the Linux kernel and GNU Coreutils sort, join and tail. Other kernels and tools probably don't (yet) work. Send me patches.
  • vnl-align aligns vnlog columns for easy interpretation by humans. The meaning is unaffected.
  • Vnlog::Parser is a simple perl library to read a vnlog.
  • libvnlog is a C library to simplify writing a vnlog. Clearly all you really need is printf(), but this is useful if we have lots of columns, many containing null values in any given row, and/or if we have parallel threads writing to a log.
  • vnl-make-matrix converts a one-point-per-line vnlog to a matrix of data. I.e.
$ cat dat.vnl
# i j x
0 0 1
0 1 2
0 2 3
1 0 4
1 1 5
1 2 6
2 0 7
2 1 8
2 2 9
3 0 10
3 1 11
3 2 12

$ < dat.vnl vnl-filter -p i,x | vnl-make-matrix --outdir /tmp
Writing to '/tmp/x.matrix'

$ cat /tmp/x.matrix
1 2 3
4 5 6
7 8 9
10 11 12
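To make the vnl-filter semantics above concrete, here's a hypothetical session; example.vnl and all its values are invented for illustration:

$ cat example.vnl
# col1 col2 col3 col4 col5 col6
1 2 3 4 11 x
5 6 7 8 9 y
9 10 11 12 20 -

$ < example.vnl vnl-filter -p col1,col2,colx=col3+col4 'col5 > 10' --has col6
# col1 col2 colx
1 2 7

$ < example.vnl vnl-filter --eval '{s += col5} END {print s}'
40

Only the first data row survives the filter: the second fails col5 > 10, and the third has a null col6. The --eval invocation sums col5 over all rows: 11 + 9 + 20 = 40.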

All the tools have manpages that contain more detail, and more tools will probably be added with time.

C interface

For most uses, these logfiles are simple enough to be generated with plain prints. But then each print statement has to know which numeric column we're populating, which becomes effortful with many columns. In my usage it's common to have a large parallelized C program that's writing logs with hundreds of columns where any one record would contain only a subset of the columns. In such a case, it's helpful to have a library that can output the log files. This is available. Basic usage looks like this:

In a shell:

$ vnl-gen-header 'int w' 'uint8_t x' 'char* y' 'double z' 'void* binary' > vnlog_fields_generated.h

In a C program test.c:

#include "vnlog_fields_generated.h" int main() { vnlog_emit_legend(); vnlog_set_field_value__w(-10); vnlog_set_field_value__x(40); vnlog_set_field_value__y("asdf"); vnlog_emit_record(); vnlog_set_field_value__z(0.3); vnlog_set_field_value__x(50); vnlog_set_field_value__w(-20); vnlog_set_field_value__binary("\x01\x02\x03", 3); vnlog_emit_record(); vnlog_set_field_value__w(-30); vnlog_set_field_value__x(10); vnlog_set_field_value__y("whoa"); vnlog_set_field_value__z(0.5); vnlog_emit_record(); return 0; }

Then we build and run, and we get

$ cc -o test test.c -lvnlog
$ ./test
# w x y z binary
-10 40 asdf - -
-20 50 - 0.2999999999999999889 AQID
-30 10 whoa 0.5 -

The binary field is base64-encoded. This is a rarely-used feature, but sometimes you really need to log binary data for later processing, and this makes it possible.
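As a sanity check, the AQID above is simply the base64 encoding of the three bytes we logged, so any standard base64 tool can recover them:

$ echo AQID | base64 -d | od -An -tx1
 01 02 03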

So you

  1. Generate the header to define your columns
  2. Call vnlog_emit_legend()
  3. Call vnlog_set_field_value__...() for each field you want to set in that row.
  4. Call vnlog_emit_record() to write the row and to reset all fields for the next row. Any fields that were not set with a vnlog_set_field_value__...() call are written as the null value: -

This is enough for 99% of the use cases. Things get a bit more complex if we have threading or if we have multiple vnlog output streams in the same program. For both of these we use vnlog contexts.

To support reentrant writing into the same vnlog by multiple threads, each log-writer should create a context, and use it when talking to vnlog. The context functions will make sure that the fields in each context are independent and that the output records won't clobber each other:

void child_writer( // the parent context also writes to this vnlog. Pass NULL to
                   // use the global one
                   struct vnlog_context_t* ctx_parent )
{
    struct vnlog_context_t ctx;
    vnlog_init_child_ctx(&ctx, ctx_parent);

    while(records)
    {
        vnlog_set_field_value_ctx__xxx(&ctx, ...);
        vnlog_set_field_value_ctx__yyy(&ctx, ...);
        vnlog_set_field_value_ctx__zzz(&ctx, ...);
        vnlog_emit_record_ctx(&ctx);
    }
}

If we want to have multiple independent vnlog writers to different streams (with different columns and legends), we do this instead:

file1.c:

#include "vnlog_fields_generated1.h" void f(void) { // Write some data out to the default context and default output (STDOUT) vnlog_emit_legend(); ... vnlog_set_field_value__xxx(...); vnlog_set_field_value__yyy(...); ... vnlog_emit_record(); }

file2.c:

#include "vnlog_fields_generated2.h" void g(void) { // Make a new session context, send output to a different file, write // out legend, and send out the data struct vnlog_context_t ctx; vnlog_init_session_ctx(&ctx); FILE* fp = fopen(...); vnlog_set_output_FILE(&ctx, fp); vnlog_emit_legend_ctx(&ctx); ... vnlog_set_field_value__a(...); vnlog_set_field_value__b(...); ... vnlog_emit_record(); }

Note that it's the user's responsibility to make sure the new sessions go to a different FILE by invoking vnlog_set_output_FILE(). Furthermore, note that the included vnlog_fields_....h file defines the fields we're writing to; and if we have multiple different vnlog field definitions in the same program (as in this example), then the different writers must live in different source files. The compiler will barf if you try to #include two different vnlog_fields_....h files in the same source.

More APIs:

vnlog_printf(...) and vnlog_printf_ctx(ctx, ...) write to the output, just as printf() does. This exists for writing comments.

vnlog_clear_fields_ctx(ctx, do_free_binary): Clears out the data in a context and makes it ready to be used for the next record. It is rare for the user to have to call this manually. The most common case is handled automatically (clearing out a context after emitting a record). One area where this is useful is when making a copy of a context:

struct vnlog_context_t ctx1;
// .... do stuff with ctx1 ... add data to it ...

struct vnlog_context_t ctx2 = ctx1;
// ctx1 and ctx2 now both have the same data, and the same pointers to
// binary data. I need to get rid of the pointer references in ctx1
vnlog_clear_fields_ctx(&ctx1, false);

vnlog_free_ctx(ctx):

Frees memory for a vnlog context. Do this before throwing the context away. Currently this is only needed for contexts that have binary fields, but it should be called for all contexts, just in case.

numpy interface

The built-in numpy.loadtxt and numpy.savetxt functions work well to read and write these files. For example, to write to standard output a vnlog with fields a, b and c:

numpy.savetxt(sys.stdout, array, fmt="%g", header="a b c")

Note that numpy automatically adds the # to the header. To read a vnlog from a file on disk, do something like

array = numpy.loadtxt('data.vnl')

These functions know that # lines are comments, but don't interpret anything as field headers. That's easy to do, so I'm not providing any helper libraries. I might do that at some point, but in the meantime, patches are welcome.
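For instance, a minimal sketch of that legend parsing with standard tools, assuming the legend is the first line beginning with a single # as defined above; the resulting names line up with the columns numpy.loadtxt returns:

$ < /tmp/temperature.vnl awk '/^#[^#!]/ {sub(/^#[ \t]*/,""); print; exit}'
time temp1 temp2 temp3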

Caveats and bugs

The tools that wrap GNU coreutils (vnl-sort, vnl-join, vnl-tail) are written specifically to work with the Linux kernel and the GNU coreutils. None of these have been tested with BSD tools or with non-Linux kernels, and I'm sure things don't just work. It's probably not too effortful to get that running, but somebody needs to at least bug me for that. Or better yet, send me nice patches :)

These tools are meant to be simple, so some things are hard requirements. A big one is that columns are whitespace-separated. There is no mechanism for escaping or quoting whitespace into a single field. I think supporting something like that is more trouble than it's worth.
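A contrived demonstration of why: any whitespace inside a field simply shifts the columns, so downstream tools see a different field count. Here awk reports 3 fields for what was meant to be a 2-column row:

$ printf '%s\n' '# name value' '"John Smith" 5' | awk '!/^#/ {print NF}'
3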

Repository

https://github.com/dkogan/vnlog/

Author

Dima Kogan (dima@secretsauce.net)

License and copyright

This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.

Copyright 2016-2017 California Institute of Technology

Copyright 2017-2018 Dima Kogan (dima@secretsauce.net)

b64_cencode.c comes from cencode.c in the libb64 project. It is written by Chris Venter (chris.venter@gmail.com) who placed it in the public domain. The full text of the license is in that file.

Dima Kogan http://notes.secretsauce.net Dima Kogan
