Sebastian Dröge: How to write GStreamer Elements in Rust Part 1: A Video Filter for converting RGB to grayscale

Planet GNOME – Saturday, 13 January 2018, 23:24

This is part one of a series of blog posts that I’ll write over the next weeks, as previously announced in the GStreamer Rust bindings 0.10.0 release blog post. Since the last series of blog posts about writing GStreamer plugins in Rust ([1] [2] [3] [4]) a lot has changed, and the content of those posts now has only historical value, as a record of the experimentation that led to what exists today.

In this first part we’re going to write a plugin that contains a video filter element. The video filter can convert from RGB to grayscale, output either as 8-bit-per-pixel grayscale or as 32-bit-per-pixel RGB. In addition there’s a property to invert all grayscale values, or to shift them by up to 255. In the end this will allow you to watch Big Buck Bunny, or anything else that can somehow go into a GStreamer pipeline, in grayscale. Or encode the output to a new video file, send it over the network via WebRTC or something else, or basically do anything you want with it.

Big Buck Bunny – Grayscale

This will show the basics of how to write a GStreamer plugin and element in Rust: the basic setup for registering a type and implementing it in Rust, and how to use the various GStreamer APIs and APIs from the Rust standard library to do the processing.

The final code for this plugin can be found here, and it is based on the 0.1 version of the gst-plugin crate and the 0.10 version of the gstreamer crate. At least Rust 1.20 is required for all this. I’m also assuming that you have GStreamer (at least version 1.8) installed for your platform, see e.g. the GStreamer bindings installation instructions.

Table of Contents
  1. Project Structure
  2. Plugin Initialization
  3. Type Registration
  4. Type Class & Instance Initialization
  5. Caps & Pad Templates
  6. Caps Handling Part 1
  7. Caps Handling Part 2
  8. Conversion of BGRx Video Frames to Grayscale
  9. Testing the new element
  10. Properties
  11. What next?
Project Structure

We’ll create a new cargo project with cargo init --lib --name gst-plugin-tutorial. This will create a basically empty Cargo.toml and a corresponding src/lib.rs. We will use this structure: lib.rs will contain all the plugin-related code, and separate modules will contain the GStreamer elements that are added.
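For orientation, the resulting layout will look roughly like this (src/rgb2gray.rs is only added later in this post):

gst-plugin-tutorial/
├── Cargo.toml
└── src/
    ├── lib.rs
    └── rgb2gray.rs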

The empty Cargo.toml has to be updated to list all the dependencies that we need, and to define that the crate should result in a cdylib, i.e. a C library that does not contain any Rust-specific metadata. The final Cargo.toml looks as follows

[package]
name = "gst-plugin-tutorial"
version = "0.1.0"
authors = ["Sebastian Dröge <sebastian@centricular.com>"]
repository = "https://github.com/sdroege/gst-plugin-rs"
license = "MIT/Apache-2.0"

[dependencies]
glib = "0.4"
gstreamer = "0.10"
gstreamer-base = "0.10"
gstreamer-video = "0.10"
gst-plugin = "0.1"

[lib]
name = "gstrstutorial"
crate-type = ["cdylib"]
path = "src/lib.rs"

We’re depending on the gst-plugin crate, which provides all the basic infrastructure for implementing GStreamer plugins and elements. In addition we depend on the gstreamer, gstreamer-base and gstreamer-video crates for the various GStreamer APIs that we’re going to use later, and on the glib crate to be able to use some GLib API that we’ll need. GStreamer builds upon GLib, and this leaks through in various places.

With the basic project structure set up, we should be able to compile the project with cargo build now, which will download and build all dependencies and then create a file called target/debug/libgstrstutorial.so (or .dll on Windows, .dylib on macOS). This is going to be our GStreamer plugin.

To allow GStreamer to find our new plugin and make it available in every GStreamer-based application, we could install it into the system- or user-wide GStreamer plugin path or simply point the GST_PLUGIN_PATH environment variable to the directory containing it:

export GST_PLUGIN_PATH=`pwd`/target/debug

If you now run the gst-inspect-1.0 tool on libgstrstutorial.so, it will not yet print all the information it could extract from the plugin but for now just complain that this is not a valid GStreamer plugin. Which is true: we didn’t write any code for it yet.

Plugin Initialization

Let’s start editing src/lib.rs to make this an actual GStreamer plugin. First of all, we need to add various extern crate directives to be able to use our dependencies and also mark some of them #[macro_use] because we’re going to use macros defined in some of them. This looks like the following

extern crate glib;
#[macro_use]
extern crate gstreamer as gst;
extern crate gstreamer_base as gst_base;
extern crate gstreamer_video as gst_video;
#[macro_use]
extern crate gst_plugin;

Next we make use of the plugin_define! macro from the gst-plugin crate to set up the static metadata of the plugin (and make the shared library recognizable by GStreamer as a valid plugin), and to define the name of our entry point function (plugin_init) where we will register all the elements that this plugin provides.

plugin_define!(
    b"rstutorial\0",
    b"Rust Tutorial Plugin\0",
    plugin_init,
    b"1.0\0",
    b"MIT/X11\0",
    b"rstutorial\0",
    b"rstutorial\0",
    b"https://github.com/sdroege/gst-plugin-rs\0",
    b"2017-12-30\0"
);

This is unfortunately not very beautiful yet due to a) GStreamer requiring this information to be statically available in the shared library, not returned by a function (starting with GStreamer 1.14 it can be a function), and b) Rust not allowing byte string literals (b"blabla\0") to be concatenated with a macro like the std::concat macro (so that the b and \0 parts could be hidden away). Expect this to become better in the future.

The static plugin metadata that we provide here is

  1. name of the plugin
  2. short description for the plugin
  3. name of the plugin entry point function
  4. version number of the plugin
  5. license of the plugin (only a fixed set of licenses is allowed here)
  6. source package name
  7. binary package name (only really makes sense for e.g. Linux distributions)
  8. origin of the plugin
  9. release date of this version

In addition we’re defining an empty plugin entry point function that just returns true

fn plugin_init(plugin: &gst::Plugin) -> bool {
    true
}

With all that given, gst-inspect-1.0 should print exactly this information when running on the libgstrstutorial.so file (or .dll on Windows, or .dylib on macOS)

gst-inspect-1.0 target/debug/libgstrstutorial.so

Type Registration

As a next step, we’re going to add another module rgb2gray to our project, and call a function called register from our plugin_init function.

mod rgb2gray;

fn plugin_init(plugin: &gst::Plugin) -> bool {
    rgb2gray::register(plugin);
    true
}

With that our src/lib.rs is complete, and all following code is only in src/rgb2gray.rs. At the top of the new file we first need to add various use-directives to import various types and functions we’re going to use into the current module’s scope

use glib;
use gst;
use gst::prelude::*;
use gst_video;

use gst_plugin::properties::*;
use gst_plugin::object::*;
use gst_plugin::element::*;
use gst_plugin::base_transform::*;

use std::i32;
use std::sync::Mutex;

GStreamer is based on the GLib object system (GObject). C (just like Rust) does not have built-in support for object-oriented programming, inheritance, virtual methods and related concepts, and GObject makes these features available in C as a library. Without language support this is a quite verbose endeavour in C, and the gst-plugin crate tries to expose all this in a (as much as possible) Rust-style API while hiding all the details that do not really matter.

So, as a next step we need to register a new type for our RGB to Grayscale converter GStreamer element with the GObject type system, and then register that type with GStreamer to be able to create new instances of it. We do this with the following code

struct Rgb2GrayStatic;

impl ImplTypeStatic<BaseTransform> for Rgb2GrayStatic {
    fn get_name(&self) -> &str {
        "Rgb2Gray"
    }

    fn new(&self, element: &BaseTransform) -> Box<BaseTransformImpl<BaseTransform>> {
        Rgb2Gray::new(element)
    }

    fn class_init(&self, klass: &mut BaseTransformClass) {
        Rgb2Gray::class_init(klass);
    }
}

pub fn register(plugin: &gst::Plugin) {
    let type_ = register_type(Rgb2GrayStatic);
    gst::Element::register(plugin, "rsrgb2gray", 0, type_);
}

This defines a zero-sized struct Rgb2GrayStatic and implements the ImplTypeStatic<BaseTransform> trait on it to provide static information about the type to the type system. In our case this is a zero-sized struct, but in other cases this struct might contain actual data (for example if the same element code is used for multiple elements, e.g. when wrapping a generic codec API that provides support for multiple decoders and we then want to register one element per decoder). By implementing ImplTypeStatic<BaseTransform> we also declare that our element is going to be based on the GStreamer BaseTransform base class, which provides a relatively simple API for 1:1 transformation elements like the one we’re writing.

ImplTypeStatic provides functions that return a name for the type, and functions for initializing/returning a new instance of our element (new) and for initializing the class metadata (class_init, more on that later). We simply let those functions proxy to associated functions on the Rgb2Gray struct that we’re going to define at a later time.

In addition, we also define a register function (the one that is already called from our plugin_init function) and in there first register the Rgb2GrayStatic type metadata with the GObject type system to retrieve a type ID, and then register this type ID to GStreamer to be able to create new instances of it with the name “rsrgb2gray” (e.g. when using gst::ElementFactory::make).
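To illustrate what this registration buys us, an application could later create an instance of the element by that name. The following is a hypothetical application-side sketch (not part of the plugin itself), assuming the 0.10 gstreamer crate used throughout this post and that the plugin is visible via GST_PLUGIN_PATH:

// Hypothetical application-side code, for illustration only.
extern crate gstreamer as gst;
use gst::prelude::*;

fn main() {
    gst::init().unwrap();

    // Only works if the plugin is installed or GST_PLUGIN_PATH points to it.
    let filter = gst::ElementFactory::make("rsrgb2gray", None)
        .expect("rsrgb2gray element not found");

    println!("Created element: {}", filter.get_name());
}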

Type Class & Instance Initialization

As a next step we declare the Rgb2Gray struct and implement the new and class_init functions on it. In the first version, this struct is almost empty but we will later use it to store all state of our element.

struct Rgb2Gray {
    cat: gst::DebugCategory,
}

impl Rgb2Gray {
    fn new(_transform: &BaseTransform) -> Box<BaseTransformImpl<BaseTransform>> {
        Box::new(Self {
            cat: gst::DebugCategory::new(
                "rsrgb2gray",
                gst::DebugColorFlags::empty(),
                "Rust RGB-GRAY converter",
            ),
        })
    }

    fn class_init(klass: &mut BaseTransformClass) {
        klass.set_metadata(
            "RGB-GRAY Converter",
            "Filter/Effect/Converter/Video",
            "Converts RGB to GRAY or grayscale RGB",
            "Sebastian Dröge <sebastian@centricular.com>",
        );

        klass.configure(BaseTransformMode::NeverInPlace, false, false);
    }
}

In the new function we return a boxed (i.e. heap-allocated) version of our struct, containing a newly created GStreamer debug category of name “rsrgb2gray”. We’re going to use this debug category later for making use of GStreamer’s debug logging system for logging the state and changes of our element.

In the class_init function we, again, set up some metadata for our new element. In this case these are a description, a classification of our element, a longer description and the author. The metadata can later be retrieved and made use of via the Registry and PluginFeature/ElementFactory API. We also configure the BaseTransform class and define that we will never operate in-place (producing our output in the input buffer), and that we don’t want to work in passthrough mode if the input/output formats are the same.

Additionally we need to implement various traits on the Rgb2Gray struct, which will later be used to override virtual methods of the various parent classes of our element. For now we can keep the trait implementations empty. There is one trait implementation required per parent class.

impl ObjectImpl<BaseTransform> for Rgb2Gray {}
impl ElementImpl<BaseTransform> for Rgb2Gray {}
impl BaseTransformImpl<BaseTransform> for Rgb2Gray {}

With all this defined, gst-inspect-1.0 should be able to show some more information about our element already but will still complain that it’s not complete yet.

Caps & Pad Templates

Data flow of GStreamer elements is happening via pads, which are the input(s) and output(s) (or sinks and sources) of an element. Via the pads, buffers containing actual media data, events or queries are transferred. An element can have any number of sink and source pads, but our new element will only have one of each.

To be able to declare what kinds of pads an element can create (they are not necessarily all static but could be created at runtime by the element or the application), it is necessary to install so-called pad templates during the class initialization. These pad templates contain the name (or rather “name template”, it could be something like src_%u for e.g. pad templates that declare multiple possible pads), the direction of the pad (sink or source), the availability of the pad (is it always there, sometimes added/removed by the element or to be requested by the application) and all the possible media types (called caps) that the pad can consume (sink pads) or produce (src pads).

In our case we only have always pads, one sink pad called “sink”, on which we can only accept RGB (BGRx to be exact) data with any width/height/framerate and one source pad called “src”, on which we will produce either RGB (BGRx) data or GRAY8 (8-bit grayscale) data. We do this by adding the following code to the class_init function.

let caps = gst::Caps::new_simple(
    "video/x-raw",
    &[
        (
            "format",
            &gst::List::new(&[
                &gst_video::VideoFormat::Bgrx.to_string(),
                &gst_video::VideoFormat::Gray8.to_string(),
            ]),
        ),
        ("width", &gst::IntRange::<i32>::new(0, i32::MAX)),
        ("height", &gst::IntRange::<i32>::new(0, i32::MAX)),
        (
            "framerate",
            &gst::FractionRange::new(
                gst::Fraction::new(0, 1),
                gst::Fraction::new(i32::MAX, 1),
            ),
        ),
    ],
);
let src_pad_template = gst::PadTemplate::new(
    "src",
    gst::PadDirection::Src,
    gst::PadPresence::Always,
    &caps,
);
klass.add_pad_template(src_pad_template);

let caps = gst::Caps::new_simple(
    "video/x-raw",
    &[
        ("format", &gst_video::VideoFormat::Bgrx.to_string()),
        ("width", &gst::IntRange::<i32>::new(0, i32::MAX)),
        ("height", &gst::IntRange::<i32>::new(0, i32::MAX)),
        (
            "framerate",
            &gst::FractionRange::new(
                gst::Fraction::new(0, 1),
                gst::Fraction::new(i32::MAX, 1),
            ),
        ),
    ],
);
let sink_pad_template = gst::PadTemplate::new(
    "sink",
    gst::PadDirection::Sink,
    gst::PadPresence::Always,
    &caps,
);
klass.add_pad_template(sink_pad_template);

The names “src” and “sink” are pre-defined by the BaseTransform class and this base-class will also create the actual pads with those names from the templates for us whenever a new element instance is created. Otherwise we would have to do that in our new function but here this is not needed.

If you now run gst-inspect-1.0 on the rsrgb2gray element, these pad templates with their caps should also show up.

Caps Handling Part 1

As a next step we will add caps handling to our new element. This involves overriding 4 virtual methods from the BaseTransformImpl trait, and actually storing the configured input and output caps inside our element struct. Let’s start with the latter

struct State {
    in_info: gst_video::VideoInfo,
    out_info: gst_video::VideoInfo,
}

struct Rgb2Gray {
    cat: gst::DebugCategory,
    state: Mutex<Option<State>>,
}

impl Rgb2Gray {
    fn new(_transform: &BaseTransform) -> Box<BaseTransformImpl<BaseTransform>> {
        Box::new(Self {
            cat: gst::DebugCategory::new(
                "rsrgb2gray",
                gst::DebugColorFlags::empty(),
                "Rust RGB-GRAY converter",
            ),
            state: Mutex::new(None),
        })
    }
}

We define a new struct State that contains the input and output caps, each stored as a VideoInfo. VideoInfo is a struct that contains various fields like width/height, framerate and the video format, and allows us to conveniently work with the properties of (raw) video formats. We have to store it inside a Mutex in our Rgb2Gray struct as it can (in theory) be accessed from multiple threads at the same time.
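To get a feel for this API, parsing some fixed caps into a VideoInfo could look like the following minimal sketch (assuming the 0.10 gstreamer-video crate used throughout this post; the width/height values are made up for illustration):

// Minimal sketch: parse fixed caps into a VideoInfo and read a few fields.
let caps = gst::Caps::new_simple(
    "video/x-raw",
    &[
        ("format", &gst_video::VideoFormat::Bgrx.to_string()),
        ("width", &320i32),
        ("height", &240i32),
        ("framerate", &gst::Fraction::new(30, 1)),
    ],
);

// from_caps() returns None if the caps are not complete, fixed raw video caps.
if let Some(info) = gst_video::VideoInfo::from_caps(&caps) {
    println!("{}x{}, {} bytes per frame", info.width(), info.height(), info.size());
}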

Whenever input/output caps are configured on our element, the set_caps virtual method of BaseTransform is called with both caps (i.e. in the very beginning before the data flow and whenever it changes), and all following video frames that pass through our element should be according to those caps. Once the element is shut down, the stop virtual method is called and it would make sense to release the State as it only contains stream-specific information. We’re doing this by adding the following to the BaseTransformImpl trait implementation

impl BaseTransformImpl<BaseTransform> for Rgb2Gray {
    fn set_caps(&self, element: &BaseTransform, incaps: &gst::Caps, outcaps: &gst::Caps) -> bool {
        let in_info = match gst_video::VideoInfo::from_caps(incaps) {
            None => return false,
            Some(info) => info,
        };
        let out_info = match gst_video::VideoInfo::from_caps(outcaps) {
            None => return false,
            Some(info) => info,
        };

        gst_debug!(
            self.cat,
            obj: element,
            "Configured for caps {} to {}",
            incaps,
            outcaps
        );

        *self.state.lock().unwrap() = Some(State {
            in_info: in_info,
            out_info: out_info,
        });

        true
    }

    fn stop(&self, element: &BaseTransform) -> bool {
        // Drop state
        let _ = self.state.lock().unwrap().take();

        gst_info!(self.cat, obj: element, "Stopped");

        true
    }
}

This code should be relatively self-explanatory. In set_caps we’re parsing the two caps into a VideoInfo and then store this in our State, in stop we drop the State and replace it with None. In addition we make use of our debug category here and use the gst_info! and gst_debug! macros to output the current caps configuration to the GStreamer debug logging system. This information can later be useful for debugging any problems once the element is running.

Next we have to provide information to the BaseTransform base class about the size in bytes of a video frame with specific caps. This is needed so that the base class can allocate an appropriately sized output buffer for us, that we can then fill later. This is done with the get_unit_size virtual method, which is required to return the size of one processing unit in specific caps. In our case, one processing unit is one video frame. In the case of raw audio it would be the size of one sample multiplied by the number of channels.

impl BaseTransformImpl<BaseTransform> for Rgb2Gray {
    fn get_unit_size(&self, _element: &BaseTransform, caps: &gst::Caps) -> Option<usize> {
        gst_video::VideoInfo::from_caps(caps).map(|info| info.size())
    }
}

We simply make use of the VideoInfo API here again, which conveniently gives us the size of one video frame already.

Instead of get_unit_size it would also be possible to implement the transform_size virtual method, which is getting passed one size and the corresponding caps, another caps and is supposed to return the size converted to the second caps. Depending on how your element works, one or the other can be easier to implement.
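For illustration, a transform_size implementation for this element could reuse the same VideoInfo lookup. Note that the exact method signature below is an assumption about the 0.1 gst-plugin crate’s BaseTransformImpl trait and may not match it precisely; it is only meant to show the idea:

// Assumed signature, for illustration only; the real trait method in the
// gst-plugin crate may differ. One unit is one full video frame here, so the
// converted size is simply the frame size for the other caps.
fn transform_size(
    &self,
    _element: &BaseTransform,
    _direction: gst::PadDirection,
    _caps: &gst::Caps,
    _size: usize,
    othercaps: &gst::Caps,
) -> Option<usize> {
    gst_video::VideoInfo::from_caps(othercaps).map(|info| info.size())
}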

Caps Handling Part 2

We’re not done yet with caps handling though. As a very last step it is required that we implement a function that is converting caps into the corresponding caps in the other direction. For example, if we receive BGRx caps with some width/height on the sinkpad, we are supposed to convert this into new caps with the same width/height but BGRx or GRAY8. That is, we can convert BGRx to BGRx or GRAY8. Similarly, if the element downstream of ours can accept GRAY8 with a specific width/height from our source pad, we have to convert this to BGRx with that very same width/height.

This has to be implemented in the transform_caps virtual method, and looks as follows

impl BaseTransformImpl<BaseTransform> for Rgb2Gray {
    fn transform_caps(
        &self,
        element: &BaseTransform,
        direction: gst::PadDirection,
        caps: gst::Caps,
        filter: Option<&gst::Caps>,
    ) -> gst::Caps {
        let other_caps = if direction == gst::PadDirection::Src {
            let mut caps = caps.clone();

            for s in caps.make_mut().iter_mut() {
                s.set("format", &gst_video::VideoFormat::Bgrx.to_string());
            }

            caps
        } else {
            let mut gray_caps = gst::Caps::new_empty();

            {
                let gray_caps = gray_caps.get_mut().unwrap();

                for s in caps.iter() {
                    let mut s_gray = s.to_owned();
                    s_gray.set("format", &gst_video::VideoFormat::Gray8.to_string());
                    gray_caps.append_structure(s_gray);
                }
                gray_caps.append(caps.clone());
            }

            gray_caps
        };

        gst_debug!(
            self.cat,
            obj: element,
            "Transformed caps from {} to {} in direction {:?}",
            caps,
            other_caps,
            direction
        );

        if let Some(filter) = filter {
            filter.intersect_with_mode(&other_caps, gst::CapsIntersectMode::First)
        } else {
            other_caps
        }
    }
}

This caps conversion happens in 3 steps. First we check if we got caps for the source pad. In that case, the caps on the other pad (the sink pad) are going to be exactly the same caps, but no matter whether the caps contained BGRx or GRAY8 they must become BGRx, as that’s the only format our sink pad can accept. We do this by creating a clone of the input caps, making sure that those caps are actually writable (i.e. we have the only reference to them, or a copy is going to be created), and then iterating over all the structures inside the caps and setting the “format” field to BGRx. After this, all structures in the new caps will have the format field set to BGRx.

Similarly, if we get caps for the sink pad and are supposed to convert it to caps for the source pad, we create new caps and in there append a copy of each structure of the input caps (which are BGRx) with the format field set to GRAY8. In the end we append the original caps, giving us first all caps as GRAY8 and then the same caps as BGRx. With this ordering we signal to GStreamer that we would prefer to output GRAY8 over BGRx.
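As a concrete illustration (with made-up width/height values), sink-pad caps of

video/x-raw,format=BGRx,width=320,height=240,framerate=30/1

would be transformed into the ordered list

video/x-raw,format=GRAY8,width=320,height=240,framerate=30/1; video/x-raw,format=BGRx,width=320,height=240,framerate=30/1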

In the end the caps we created for the other pad are filtered against optional filter caps to reduce the potential size of the caps. This is done by intersecting the caps with that filter, while keeping the order (and thus preferences) of the filter caps (gst::CapsIntersectMode::First).

Conversion of BGRx Video Frames to Grayscale

Now that all the caps handling is implemented, we can finally get to the implementation of the actual video frame conversion. For this we start with defining a helper function bgrx_to_gray that converts one BGRx pixel to a grayscale value. The BGRx pixel is passed as a &[u8] slice with 4 elements and the function returns another u8 for the grayscale value.

impl Rgb2Gray {
    #[inline]
    fn bgrx_to_gray(in_p: &[u8]) -> u8 {
        // See https://en.wikipedia.org/wiki/YUV#SDTV_with_BT.601
        const R_Y: u32 = 19595; // 0.299 * 65536
        const G_Y: u32 = 38470; // 0.587 * 65536
        const B_Y: u32 = 7471; // 0.114 * 65536

        assert_eq!(in_p.len(), 4);

        let b = u32::from(in_p[0]);
        let g = u32::from(in_p[1]);
        let r = u32::from(in_p[2]);

        let gray = ((r * R_Y) + (g * G_Y) + (b * B_Y)) / 65536;

        (gray as u8)
    }
}

This function works by extracting the blue, green and red components from each pixel (remember: we work on BGRx, so the first value will be blue, the second green, the third red and the fourth unused), extending them from 8 to 32 bits for a wider value range and then converting them to the Y component of the YUV colorspace (basically what your grandparents’ black & white TV would’ve displayed). The coefficients come from the Wikipedia page about YUV and are normalized to unsigned 16-bit integers so we can keep some accuracy, don’t have to work with floating point arithmetic and stay inside the range of 32-bit integers for all our calculations. As you can see, the green component is weighted more than the others, which comes from our eyes being more sensitive to green than to other colors.
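As a quick sanity check of the fixed-point arithmetic, here is a standalone snippet (independent of GStreamer): the three coefficients sum to exactly 65536, so a pure white pixel maps back to 255, and a pure green pixel maps to roughly 0.587 * 255:

const R_Y: u32 = 19595; // 0.299 * 65536
const G_Y: u32 = 38470; // 0.587 * 65536
const B_Y: u32 = 7471;  // 0.114 * 65536

fn main() {
    // The coefficients sum to exactly one in 16.16 fixed point.
    assert_eq!(R_Y + G_Y + B_Y, 65536);
    // Pure white: (255 * 65536) / 65536 = 255
    assert_eq!((255 * R_Y + 255 * G_Y + 255 * B_Y) / 65536, 255);
    // Pure green: (255 * 38470) / 65536 = 149, i.e. about 0.587 * 255
    assert_eq!((255 * G_Y) / 65536, 149);
}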

Note: This is only doing the actual conversion from linear RGB to grayscale (and in BT.601 colorspace). To do this conversion correctly you need to know your colorspaces and use the correct coefficients for conversion, and also do gamma correction. See this about why it is important.

Afterwards we have to actually call this function on every pixel. For this the transform virtual method is implemented, which gets an input and an output buffer passed and we’re supposed to read the input buffer and fill the output buffer. The implementation looks as follows, and is going to be our biggest function for this element

impl BaseTransformImpl<BaseTransform> for Rgb2Gray {
    fn transform(
        &self,
        element: &BaseTransform,
        inbuf: &gst::Buffer,
        outbuf: &mut gst::BufferRef,
    ) -> gst::FlowReturn {
        let mut state_guard = self.state.lock().unwrap();
        let state = match *state_guard {
            None => {
                gst_element_error!(element, gst::CoreError::Negotiation, ["Have no state yet"]);
                return gst::FlowReturn::NotNegotiated;
            }
            Some(ref mut state) => state,
        };

        let in_frame = match gst_video::VideoFrameRef::from_buffer_ref_readable(
            inbuf.as_ref(),
            &state.in_info,
        ) {
            None => {
                gst_element_error!(
                    element,
                    gst::CoreError::Failed,
                    ["Failed to map input buffer readable"]
                );
                return gst::FlowReturn::Error;
            }
            Some(in_frame) => in_frame,
        };

        let mut out_frame =
            match gst_video::VideoFrameRef::from_buffer_ref_writable(outbuf, &state.out_info) {
                None => {
                    gst_element_error!(
                        element,
                        gst::CoreError::Failed,
                        ["Failed to map output buffer writable"]
                    );
                    return gst::FlowReturn::Error;
                }
                Some(out_frame) => out_frame,
            };

        let width = in_frame.width() as usize;
        let in_stride = in_frame.plane_stride()[0] as usize;
        let in_data = in_frame.plane_data(0).unwrap();
        let out_stride = out_frame.plane_stride()[0] as usize;
        let out_format = out_frame.format();
        let out_data = out_frame.plane_data_mut(0).unwrap();

        if out_format == gst_video::VideoFormat::Bgrx {
            assert_eq!(in_data.len() % 4, 0);
            assert_eq!(out_data.len() % 4, 0);
            assert_eq!(out_data.len() / out_stride, in_data.len() / in_stride);

            let in_line_bytes = width * 4;
            let out_line_bytes = width * 4;

            assert!(in_line_bytes <= in_stride);
            assert!(out_line_bytes <= out_stride);

            for (in_line, out_line) in in_data
                .chunks(in_stride)
                .zip(out_data.chunks_mut(out_stride))
            {
                for (in_p, out_p) in in_line[..in_line_bytes]
                    .chunks(4)
                    .zip(out_line[..out_line_bytes].chunks_mut(4))
                {
                    assert_eq!(out_p.len(), 4);

                    let gray = Rgb2Gray::bgrx_to_gray(in_p);
                    out_p[0] = gray;
                    out_p[1] = gray;
                    out_p[2] = gray;
                }
            }
        } else if out_format == gst_video::VideoFormat::Gray8 {
            assert_eq!(in_data.len() % 4, 0);
            assert_eq!(out_data.len() / out_stride, in_data.len() / in_stride);

            let in_line_bytes = width * 4;
            let out_line_bytes = width;

            assert!(in_line_bytes <= in_stride);
            assert!(out_line_bytes <= out_stride);

            for (in_line, out_line) in in_data
                .chunks(in_stride)
                .zip(out_data.chunks_mut(out_stride))
            {
                for (in_p, out_p) in in_line[..in_line_bytes]
                    .chunks(4)
                    .zip(out_line[..out_line_bytes].iter_mut())
                {
                    let gray = Rgb2Gray::bgrx_to_gray(in_p);
                    *out_p = gray;
                }
            }
        } else {
            unimplemented!();
        }

        gst::FlowReturn::Ok
    }
}

What happens here is that we first of all lock our state (the input/output VideoInfo) and error out if we don’t have any yet (which can’t really happen unless other elements have a bug, but better safe than sorry). After that we map the input buffer readable and the output buffer writable with the VideoFrameRef API. By mapping the buffers we get access to the underlying bytes of them, and the mapping operation could for example make GPU memory available or just do nothing and give us access to a normally allocated memory area. We have access to the bytes of the buffer until the VideoFrameRef goes out of scope.

Instead of VideoFrameRef we could’ve also used the gst::Buffer::map_readable() and gst::Buffer::map_writable() API, but unlike those, the VideoFrameRef API also extracts various metadata from the raw video buffers and makes them available. For example we can directly access the different planes as slices without having to calculate the offsets ourselves, and we get direct access to the width and height of the video frame.

After mapping the buffers, we store various information we’re going to need later in local variables to save some typing. This is the width (same for input and output as we never change the width in transform_caps), the input and output (row) strides (the number of bytes per row/line, which possibly includes some padding at the end of each line for alignment reasons), the output format (which can be BGRx or GRAY8 because of how we implemented transform_caps) and the pointers to the first plane of the input and output (which in this case is also the only plane, as BGRx and GRAY8 both have only a single plane containing all the RGB/gray components).

Then based on whether the output is BGRx or GRAY8, we iterate over all pixels. The code is basically the same in both cases, so I’m only going to explain the case where BGRx is output.

We start by iterating over each line of the input and output, and do so by using the chunks iterator to give us chunks of as many bytes as the (row-) stride of the video frame is, do the same for the other frame and then zip both iterators together. This means that on each iteration we get exactly one line as a slice from each of the frames and can then start accessing the actual pixels in each line.

To access the individual pixels in each line, we again use the chunks iterator the same way, but this time to always give us chunks of 4 bytes from each line. As BGRx uses 4 bytes for each pixel, this gives us exactly one pixel. Instead of iterating over the whole line, we only take the actual sub-slice that contains the pixels, not the whole line with stride number of bytes containing potential padding at the end. Now for each of these pixels we call our previously defined bgrx_to_gray function and then fill the B, G and R components of the output buffer with that value to get grayscale output. And that’s all.
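The same iteration pattern in isolation, with made-up dimensions, looks like this. This is a self-contained sketch (not taken from the element itself) of a 2×2 BGRx frame whose input stride includes some padding:

fn main() {
    let width: usize = 2;
    let in_stride: usize = 12; // 2 pixels * 4 bytes, padded to 12 bytes per line
    let out_stride: usize = 8; // no padding on the output side
    let in_data = vec![0u8; 2 * in_stride];
    let mut out_data = vec![0u8; 2 * out_stride];

    // One line per iteration, stride-sized chunks from input and output zipped together.
    for (in_line, out_line) in in_data.chunks(in_stride).zip(out_data.chunks_mut(out_stride)) {
        // One BGRx pixel (4 bytes) per iteration; the slice skips the padding bytes.
        for (in_p, out_p) in in_line[..width * 4]
            .chunks(4)
            .zip(out_line[..width * 4].chunks_mut(4))
        {
            // in_p and out_p are exactly one pixel each.
            out_p.copy_from_slice(in_p);
        }
    }
}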

Using Rust high-level abstractions like the chunks iterators and bounds-checking slice accesses might seem like it’s going to cause quite some performance penalty, but if you look at the generated assembly most of the bounds checks are completely optimized away and the resulting assembly code is close to what one would’ve written manually (especially when using the newly-added exact_chunks iterators). Here you’re getting safe and high-level looking code with low-level performance!

You might’ve also noticed the various assertions in the processing function. These are there to give further hints to the compiler about properties of the code, and thus potentially being able to optimize the code better and moving e.g. bounds checks out of the inner loop and just having the assertion outside the loop check for the same. In Rust adding assertions can often improve performance by allowing further optimizations to be applied, but in the end always check the resulting assembly to see if what you did made any difference.

Testing the new element

Now we have implemented almost all functionality of our new element and can run it on actual video data. This can be done with the gst-launch-1.0 tool, or any application using GStreamer that allows us to insert our new element somewhere in the video part of the pipeline. With gst-launch-1.0 you could run for example the following pipelines

# Run on a test pattern
gst-launch-1.0 videotestsrc ! rsrgb2gray ! videoconvert ! autovideosink

# Run on some video file, also playing the audio
gst-launch-1.0 playbin uri=file:///path/to/some/file video-filter=rsrgb2gray

Note that you will likely want to compile with cargo build --release and add the target/release directory to GST_PLUGIN_PATH instead. The debug build might be too slow, and generally the release builds are multiple orders of magnitude (!) faster.
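For example:

cargo build --release
export GST_PLUGIN_PATH=`pwd`/target/release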

Properties

The only features missing now are the properties I mentioned in the opening paragraph: a boolean property to invert the grayscale values and an integer property to shift them by up to 255. Implementing this on top of the previous code is not a lot of work. Let’s start by defining a struct for holding the property values and defining the property metadata.

const DEFAULT_INVERT: bool = false;
const DEFAULT_SHIFT: u32 = 0;

#[derive(Debug, Clone, Copy)]
struct Settings {
    invert: bool,
    shift: u32,
}

impl Default for Settings {
    fn default() -> Self {
        Settings {
            invert: DEFAULT_INVERT,
            shift: DEFAULT_SHIFT,
        }
    }
}

static PROPERTIES: [Property; 2] = [
    Property::Boolean(
        "invert",
        "Invert",
        "Invert grayscale output",
        DEFAULT_INVERT,
        PropertyMutability::ReadWrite,
    ),
    Property::UInt(
        "shift",
        "Shift",
        "Shift grayscale output (wrapping around)",
        (0, 255),
        DEFAULT_SHIFT,
        PropertyMutability::ReadWrite,
    ),
];

struct Rgb2Gray {
    cat: gst::DebugCategory,
    settings: Mutex<Settings>,
    state: Mutex<Option<State>>,
}

impl Rgb2Gray {
    fn new(_transform: &BaseTransform) -> Box<BaseTransformImpl<BaseTransform>> {
        Box::new(Self {
            cat: gst::DebugCategory::new(
                "rsrgb2gray",
                gst::DebugColorFlags::empty(),
                "Rust RGB-GRAY converter",
            ),
            settings: Mutex::new(Default::default()),
            state: Mutex::new(None),
        })
    }
}

This should all be rather straightforward: we define a Settings struct that stores the two values, implement the Default trait for it, then define a two-element array with property metadata (names, description, ranges, default value, writability), and then store the default value of our Settings struct inside another Mutex inside the element struct.

In the next step we have to make use of these: we need to tell the GObject type system about the properties, and we need to implement functions that are called whenever a property value is set or read.

impl Rgb2Gray {
    fn class_init(klass: &mut BaseTransformClass) {
        [...]
        klass.install_properties(&PROPERTIES);
        [...]
    }
}

impl ObjectImpl<BaseTransform> for Rgb2Gray {
    fn set_property(&self, obj: &glib::Object, id: u32, value: &glib::Value) {
        let prop = &PROPERTIES[id as usize];
        let element = obj.clone().downcast::<BaseTransform>().unwrap();

        match *prop {
            Property::Boolean("invert", ..) => {
                let mut settings = self.settings.lock().unwrap();
                let invert = value.get().unwrap();
                gst_info!(
                    self.cat,
                    obj: &element,
                    "Changing invert from {} to {}",
                    settings.invert,
                    invert
                );
                settings.invert = invert;
            }
            Property::UInt("shift", ..) => {
                let mut settings = self.settings.lock().unwrap();
                let shift = value.get().unwrap();
                gst_info!(
                    self.cat,
                    obj: &element,
                    "Changing shift from {} to {}",
                    settings.shift,
                    shift
                );
                settings.shift = shift;
            }
            _ => unimplemented!(),
        }
    }

    fn get_property(&self, _obj: &glib::Object, id: u32) -> Result<glib::Value, ()> {
        let prop = &PROPERTIES[id as usize];

        match *prop {
            Property::Boolean("invert", ..) => {
                let settings = self.settings.lock().unwrap();
                Ok(settings.invert.to_value())
            }
            Property::UInt("shift", ..) => {
                let settings = self.settings.lock().unwrap();
                Ok(settings.shift.to_value())
            }
            _ => unimplemented!(),
        }
    }
}

Property values can be changed from any thread at any time, which is why the Mutex is needed here to protect our struct. And we’re using a second Mutex so that it can be locked only for the shortest possible amount of time: we don’t want to keep it locked for the whole duration of the transform function, otherwise applications trying to set/get values could block for up to one frame.

In the property setter/getter functions we are working with a glib::Value. This is a dynamically typed value type that can contain values of any type, together with the type information of the contained value. Here we’re using it to handle an unsigned integer (u32) and a boolean for our two properties. To know which property is currently being set or read, we get passed an identifier which is the index into our PROPERTIES array. We then simply match on that to decide which property was meant.

With this implemented, we can already compile everything, see the properties and their metadata in gst-inspect-1.0 and can also set them on gst-launch-1.0 like this

# Set invert to true and shift to 128
gst-launch-1.0 videotestsrc ! rsrgb2gray invert=true shift=128 ! videoconvert ! autovideosink

If we set GST_DEBUG=rsrgb2gray:6 in the environment before running that, we can also see the corresponding debug output when the values are changing. The only thing missing now is to actually make use of the property values for the processing. For this we add the following changes to bgrx_to_gray and the transform function

impl Rgb2Gray {
    #[inline]
    fn bgrx_to_gray(in_p: &[u8], shift: u8, invert: bool) -> u8 {
        [...]

        let gray = ((r * R_Y) + (g * G_Y) + (b * B_Y)) / 65536;
        let gray = (gray as u8).wrapping_add(shift);

        if invert {
            255 - gray
        } else {
            gray
        }
    }
}

impl BaseTransformImpl<BaseTransform> for Rgb2Gray {
    fn transform(
        &self,
        element: &BaseTransform,
        inbuf: &gst::Buffer,
        outbuf: &mut gst::BufferRef,
    ) -> gst::FlowReturn {
        let settings = *self.settings.lock().unwrap();
        [...]
        let gray = Rgb2Gray::bgrx_to_gray(in_p, settings.shift as u8, settings.invert);
        [...]
    }
}

And that’s all. If you run the element in gst-launch-1.0 and change the values of the properties you should also see the corresponding changes in the video output.

Note that we always take a copy of the Settings struct at the beginning of the transform function. This ensures that we hold the mutex only for the shortest possible amount of time and then have a local snapshot of the settings for each frame.

Also keep in mind that the usage of the property values in the bgrx_to_gray function is far from optimal. It means adding another condition to the calculation of each pixel, potentially slowing it down a lot. Ideally this condition would be moved outside the inner loops and the bgrx_to_gray function would be made generic over it. See for example this blog post about “branchless Rust” for ideas on how to do that; the actual implementation is left as an exercise for the reader.

What next?

I hope the code walkthrough above was useful to understand how to implement GStreamer plugins and elements in Rust. If you have any questions, feel free to ask them here in the comments.

The same approach also works for audio filters or anything that can be handled in some way with the API of the BaseTransform base class. You can find another filter, an audio echo filter, using the same approach here.

In the next blog post in this series I’ll show how to use another base class to implement another kind of element, but for the time being you can also check the GIT repository for various other element implementations.

Sebastian Dröge: How to write GStreamer Elements in Rust Part 1: A Video Filter for converting RGB to grayscale

Planet Ubuntu - Sht, 13/01/2018 - 11:23md

This is part one of a series of blog posts that I’ll write in the next weeks, as previously announced in the GStreamer Rust bindings 0.10.0 release blog post. Since the last series of blog posts about writing GStreamer plugins in Rust ([1] [2] [3] [4]) a lot has changed, and the content of those blog posts has only historical value now, as the journey of experimentation to what exists now.

In this first part we’re going to write a plugin that contains a video filter element. The video filter can convert from RGB to grayscale, either output as 8-bit per pixel grayscale or 32-bit per pixel RGB. In addition there’s a property to invert all grayscale values, or to shift them by up to 255 values. In the end this will allow you to watch Big Bucky Bunny, or anything else really that can somehow go into a GStreamer pipeline, in grayscale. Or encode the output to a new video file, send it over the network via WebRTC or something else, or basically do anything you want with it.

Big Bucky Bunny – Grayscale

This will show the basics of how to write a GStreamer plugin and element in Rust: the basic setup for registering a type and implementing it in Rust, and how to use the various GStreamer API and APIs from the Rust standard library to do the processing.

The final code for this plugin can be found here, and it is based on the 0.1 version of the gst-plugin crate and the 0.10 version of the gstreamer crate. At least Rust 1.20 is required for all this. I’m also assuming that you have GStreamer (at least version 1.8) installed for your platform, see e.g. the GStreamer bindings installation instructions.

Table of Contents
  1. Project Structure
  2. Plugin Initialization
  3. Type Registration
  4. Type Class & Instance Initialization
  5. Caps & Pad Templates
  6. Caps Handling Part 1
  7. Caps Handling Part 2
  8. Conversion of BGRx Video Frames to Grayscale
  9. Testing the new element
  10. Properties
  11. What next?
Project Structure

We’ll create a new cargo project with cargo init –lib –name gst-plugin-tutorial. This will create a basically empty Cargo.toml and a corresponding src/lib.rs. We will use this structure: lib.rs will contain all the plugin related code, separate modules will contain any GStreamer plugins that are added.

The empty Cargo.toml has to be updated to list all the dependencies that we need, and to define that the crate should result in a cdylib, i.e. a C library that does not contain any Rust-specific metadata. The final Cargo.toml looks as follows

[package] name = "gst-plugin-tutorial" version = "0.1.0" authors = ["Sebastian Dröge <sebastian@centricular.com>"] repository = "https://github.com/sdroege/gst-plugin-rs" license = "MIT/Apache-2.0" [dependencies] glib = "0.4" gstreamer = "0.10" gstreamer-base = "0.10" gstreamer-video = "0.10" gst-plugin = "0.1" [lib] name = "gstrstutorial" crate-type = ["cdylib"] path = "src/lib.rs"

We’re depending on the gst-plugin crate, which provides all the basic infrastructure for implementing GStreamer plugins and elements. In addition we depend on the gstreamer, gstreamer-base and gstreamer-video crates for various GStreamer API that we’re going to use later, and the glib crate to be able to use some GLib API that we’ll need. GStreamer is building upon GLib, and this leaks through in various places.

With the basic project structure being set-up, we should be able to compile the project with cargo build now, which will download and build all dependencies and then creates a file called target/debug/libgstrstutorial.so (or .dll on Windows, .dylib on macOS). This is going to be our GStreamer plugin.

To allow GStreamer to find our new plugin and make it available in every GStreamer-based application, we could install it into the system- or user-wide GStreamer plugin path or simply point the GST_PLUGIN_PATH environment variable to the directory containing it:

export GST_PLUGIN_PATH=`pwd`/target/debug

If you now run the gst-inspect-1.0 tool on the libgstrstutorial.so, it will not yet print all information it can extract from the plugin but for now just complains that this is not a valid GStreamer plugin. Which is true, we didn’t write any code for it yet.

Plugin Initialization

Let’s start editing src/lib.rs to make this an actual GStreamer plugin. First of all, we need to add various extern crate directives to be able to use our dependencies and also mark some of them #[macro_use] because we’re going to use macros defined in some of them. This looks like the following

extern crate glib; #[macro_use] extern crate gstreamer as gst; extern crate gstreamer_base as gst_base; extern crate gstreamer_video as gst_video; #[macro_use] extern crate gst_plugin;

Next we make use of the plugin_define! macro from the gst-plugin crate to set-up the static metadata of the plugin (and make the shared library recognizeable by GStreamer to be a valid plugin), and to define the name of our entry point function (plugin_init) where we will register all the elements that this plugin provides.

plugin_define!( b"rstutorial\0", b"Rust Tutorial Plugin\0", plugin_init, b"1.0\0", b"MIT/X11\0", b"rstutorial\0", b"rstutorial\0", b"https://github.com/sdroege/gst-plugin-rs\0", b"2017-12-30\0" );

This is unfortunately not very beautiful yet due to a) GStreamer requiring this information to be statically available in the shared library, not returned by a function (starting with GStreamer 1.14 it can be a function), and b) Rust not allowing raw strings (b”blabla) to be concatenated with a macro like the std::concat macro (so that the b and \0 parts could be hidden away). Expect this to become better in the future.

The static plugin metadata that we provide here is

  1. name of the plugin
  2. short description for the plugin
  3. name of the plugin entry point function
  4. version number of the plugin
  5. license of the plugin (only a fixed set of licenses is allowed here, see)
  6. source package name
  7. binary package name (only really makes sense for e.g. Linux distributions)
  8. origin of the plugin
  9. release date of this version

In addition we’re defining an empty plugin entry point function that just returns true

fn plugin_init(plugin: &gst::Plugin) -> bool { true }

With all that given, gst-inspect-1.0 should print exactly this information when running on the libgstrstutorial.so file (or .dll on Windows, or .dylib on macOS)

gst-inspect-1.0 target/debug/libgstrstutorial.so

Type Registration

As a next step, we’re going to add another module rgb2gray to our project, and call a function called register from our plugin_init function.

mod rgb2gray; fn plugin_init(plugin: &gst::Plugin) -> bool { rgb2gray::register(plugin); true }

With that our src/lib.rs is complete, and all following code is only in src/rgb2gray.rs. At the top of the new file we first need to add various use-directives to import various types and functions we’re going to use into the current module’s scope

use glib; use gst; use gst::prelude::*; use gst_video; use gst_plugin::properties::*; use gst_plugin::object::*; use gst_plugin::element::*; use gst_plugin::base_transform::*; use std::i32; use std::sync::Mutex;

GStreamer is based on the GLib object system (GObject). C (just like Rust) does not have built-in support for object orientated programming, inheritance, virtual methods and related concepts, and GObject makes these features available in C as a library. Without language support this is a quite verbose endeavour in C, and the gst-plugin crate tries to expose all this in a (as much as possible) Rust-style API while hiding all the details that do not really matter.

So, as a next step we need to register a new type for our RGB to Grayscale converter GStreamer element with the GObject type system, and then register that type with GStreamer to be able to create new instances of it. We do this with the following code

struct Rgb2GrayStatic; impl ImplTypeStatic<BaseTransform> for Rgb2GrayStatic { fn get_name(&self) -> &str { "Rgb2Gray" } fn new(&self, element: &BaseTransform) -> Box<BaseTransformImpl<BaseTransform>> { Rgb2Gray::new(element) } fn class_init(&self, klass: &mut BaseTransformClass) { Rgb2Gray::class_init(klass); } } pub fn register(plugin: &gst::Plugin) { let type_ = register_type(Rgb2GrayStatic); gst::Element::register(plugin, "rsrgb2gray", 0, type_); }

This defines a zero-sized struct Rgb2GrayStatic that is used to implement the ImplTypeStatic<BaseTransform> trait on it for providing static information about the type to the type system. In our case this is a zero-sized struct, but in other cases this struct might contain actual data (for example if the same element code is used for multiple elements, e.g. when wrapping a generic codec API that provides support for multiple decoders and then wanting to register one element per decoder). By implementing ImplTypeStatic<BaseTransform> we also declare that our element is going to be based on the GStreamer BaseTransform base class, which provides a relatively simple API for 1:1 transformation elements like ours is going to be.

ImplTypeStatic provides functions that return a name for the type, and functions for initializing/returning a new instance of our element (new) and for initializing the class metadata (class_init, more on that later). We simply let those functions proxy to associated functions on the Rgb2Gray struct that we’re going to define at a later time.

In addition, we also define a register function (the one that is already called from our plugin_init function) and in there first register the Rgb2GrayStatic type metadata with the GObject type system to retrieve a type ID, and then register this type ID to GStreamer to be able to create new instances of it with the name “rsrgb2gray” (e.g. when using gst::ElementFactory::make).

Type Class & Instance Initialization

As a next step we declare the Rgb2Gray struct and implement the new and class_init functions on it. In the first version, this struct is almost empty but we will later use it to store all state of our element.

struct Rgb2Gray { cat: gst::DebugCategory, } impl Rgb2Gray { fn new(_transform: &BaseTransform) -> Box<BaseTransformImpl<BaseTransform>> { Box::new(Self { cat: gst::DebugCategory::new( "rsrgb2gray", gst::DebugColorFlags::empty(), "Rust RGB-GRAY converter", ), }) } fn class_init(klass: &mut BaseTransformClass) { klass.set_metadata( "RGB-GRAY Converter", "Filter/Effect/Converter/Video", "Converts RGB to GRAY or grayscale RGB", "Sebastian Dröge <sebastian@centricular.com>", ); klass.configure(BaseTransformMode::NeverInPlace, false, false); } }

In the new function we return a boxed (i.e. heap-allocated) version of our struct, containing a newly created GStreamer debug category of name “rsrgb2gray”. We’re going to use this debug category later for making use of GStreamer’s debug logging system for logging the state and changes of our element.

In the class_init function we, again, set up some metadata for our new element. In this case these are a description, a classification of our element, a longer description and the author. The metadata can later be retrieved and made use of via the Registry and PluginFeature/ElementFactory API. We also configure the BaseTransform class and define that we will never operate in-place (producing our output in the input buffer), and that we don’t want to work in passthrough mode if the input/output formats are the same.

Additionally we need to implement various traits on the Rgb2Gray struct, which will later be used to override virtual methods of the various parent classes of our element. For now we can keep the trait implementations empty. There is one trait implementation required per parent class.

impl ObjectImpl<BaseTransform> for Rgb2Gray {} impl ElementImpl<BaseTransform> for Rgb2Gray {} impl BaseTransformImpl<BaseTransform> for Rgb2Gray {}

With all this defined, gst-inspect-1.0 should be able to show some more information about our element already but will still complain that it’s not complete yet.

Caps & Pad Templates

Data flow of GStreamer elements is happening via pads, which are the input(s) and output(s) (or sinks and sources) of an element. Via the pads, buffers containing actual media data, events or queries are transferred. An element can have any number of sink and source pads, but our new element will only have one of each.

To be able to declare what kinds of pads an element can create (they are not necessarily all static but could be created at runtime by the element or the application), it is necessary to install so-called pad templates during the class initialization. These pad templates contain the name (or rather “name template”, it could be something like src_%u for e.g. pad templates that declare multiple possible pads), the direction of the pad (sink or source), the availability of the pad (is it always there, sometimes added/removed by the element or to be requested by the application) and all the possible media types (called caps) that the pad can consume (sink pads) or produce (src pads).

In our case we only have always pads, one sink pad called “sink”, on which we can only accept RGB (BGRx to be exact) data with any width/height/framerate and one source pad called “src”, on which we will produce either RGB (BGRx) data or GRAY8 (8-bit grayscale) data. We do this by adding the following code to the class_init function.

let caps = gst::Caps::new_simple( "video/x-raw", &[ ( "format", &gst::List::new(&[ &gst_video::VideoFormat::Bgrx.to_string(), &gst_video::VideoFormat::Gray8.to_string(), ]), ), ("width", &gst::IntRange::<i32>::new(0, i32::MAX)), ("height", &gst::IntRange::<i32>::new(0, i32::MAX)), ( "framerate", &gst::FractionRange::new( gst::Fraction::new(0, 1), gst::Fraction::new(i32::MAX, 1), ), ), ], ); let src_pad_template = gst::PadTemplate::new( "src", gst::PadDirection::Src, gst::PadPresence::Always, &caps, ); klass.add_pad_template(src_pad_template); let caps = gst::Caps::new_simple( "video/x-raw", &[ ("format", &gst_video::VideoFormat::Bgrx.to_string()), ("width", &gst::IntRange::<i32>::new(0, i32::MAX)), ("height", &gst::IntRange::<i32>::new(0, i32::MAX)), ( "framerate", &gst::FractionRange::new( gst::Fraction::new(0, 1), gst::Fraction::new(i32::MAX, 1), ), ), ], ); let sink_pad_template = gst::PadTemplate::new( "sink", gst::PadDirection::Sink, gst::PadPresence::Always, &caps, ); klass.add_pad_template(sink_pad_template);

The names “src” and “sink” are pre-defined by the BaseTransform class and this base-class will also create the actual pads with those names from the templates for us whenever a new element instance is created. Otherwise we would have to do that in our new function but here this is not needed.

If you now run gst-inspect-1.0 on the rsrgb2gray element, these pad templates with their caps should also show up.

Caps Handling Part 1

As a next step we will add caps handling to our new element. This involves overriding 4 virtual methods from the BaseTransformImpl trait, and actually storing the configured input and output caps inside our element struct. Let’s start with the latter

struct State { in_info: gst_video::VideoInfo, out_info: gst_video::VideoInfo, } struct Rgb2Gray { cat: gst::DebugCategory, state: Mutex<Option<State>>, } impl Rgb2Gray { fn new(_transform: &BaseTransform) -> Box<BaseTransformImpl<BaseTransform>> { Box::new(Self { cat: gst::DebugCategory::new( "rsrgb2gray", gst::DebugColorFlags::empty(), "Rust RGB-GRAY converter", ), state: Mutex::new(None), }) } }

We define a new struct State that contains the input and output caps, stored in a VideoInfo. VideoInfo is a struct that contains various fields like width/height, framerate and the video format and allows to conveniently with the properties of (raw) video formats. We have to store it inside a Mutex in our Rgb2Gray struct as this can (in theory) be accessed from multiple threads at the same time.

Whenever input/output caps are configured on our element, the set_caps virtual method of BaseTransform is called with both caps (i.e. in the very beginning before the data flow and whenever it changes), and all following video frames that pass through our element should be according to those caps. Once the element is shut down, the stop virtual method is called and it would make sense to release the State as it only contains stream-specific information. We’re doing this by adding the following to the BaseTransformImpl trait implementation

impl BaseTransformImpl<BaseTransform> for Rgb2Gray { fn set_caps(&self, element: &BaseTransform, incaps: &gst::Caps, outcaps: &gst::Caps) -> bool { let in_info = match gst_video::VideoInfo::from_caps(incaps) { None => return false, Some(info) => info, }; let out_info = match gst_video::VideoInfo::from_caps(outcaps) { None => return false, Some(info) => info, }; gst_debug!( self.cat, obj: element, "Configured for caps {} to {}", incaps, outcaps ); *self.state.lock().unwrap() = Some(State { in_info: in_info, out_info: out_info, }); true } fn stop(&self, element: &BaseTransform) -> bool { // Drop state let _ = self.state.lock().unwrap().take(); gst_info!(self.cat, obj: element, "Stopped"); true } }

This code should be relatively self-explanatory. In set_caps we’re parsing the two caps into a VideoInfo and then store this in our State, in stop we drop the State and replace it with None. In addition we make use of our debug category here and use the gst_info! and gst_debug! macros to output the current caps configuration to the GStreamer debug logging system. This information can later be useful for debugging any problems once the element is running.

Next we have to provide information to the BaseTransform base class about the size in bytes of a video frame with specific caps. This is needed so that the base class can allocate an appropriately sized output buffer for us, that we can then fill later. This is done with the get_unit_size virtual method, which is required to return the size of one processing unit in specific caps. In our case, one processing unit is one video frame. In the case of raw audio it would be the size of one sample multiplied by the number of channels.

impl BaseTransformImpl<BaseTransform> for Rgb2Gray { fn get_unit_size(&self, _element: &BaseTransform, caps: &gst::Caps) -> Option<usize> { gst_video::VideoInfo::from_caps(caps).map(|info| info.size()) } }

We simply make use of the VideoInfo API here again, which conveniently gives us the size of one video frame already.

Instead of get_unit_size it would also be possible to implement the transform_size virtual method, which gets passed one size together with its corresponding caps plus a second caps, and is supposed to return that size converted to the second caps. Depending on how your element works, one or the other can be easier to implement.
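For illustration, a transform_size() implementation for our case could look roughly like the following. This is only a sketch and not part of the original element; the exact method signature in the crate versions used here may differ slightly, but the idea is simply that one frame in means one frame out, so the converted size is the size of a frame described by the other caps.

// Sketch only; the exact trait signature may differ in the gst-plugin crate version used here.
fn transform_size(
    &self,
    _element: &BaseTransform,
    _direction: gst::PadDirection,
    _caps: &gst::Caps,
    _size: usize,
    othercaps: &gst::Caps,
) -> Option<usize> {
    // One video frame in, one video frame out: the converted size is just the
    // size of one frame described by the other caps.
    gst_video::VideoInfo::from_caps(othercaps).map(|info| info.size())
}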

Caps Handling Part 2

We’re not done yet with caps handling though. As a very last step it is required that we implement a function that is converting caps into the corresponding caps in the other direction. For example, if we receive BGRx caps with some width/height on the sinkpad, we are supposed to convert this into new caps with the same width/height but BGRx or GRAY8. That is, we can convert BGRx to BGRx or GRAY8. Similarly, if the element downstream of ours can accept GRAY8 with a specific width/height from our source pad, we have to convert this to BGRx with that very same width/height.

This has to be implemented in the transform_caps virtual method, and looks as follows

impl BaseTransformImpl<BaseTransform> for Rgb2Gray {
    fn transform_caps(
        &self,
        element: &BaseTransform,
        direction: gst::PadDirection,
        caps: gst::Caps,
        filter: Option<&gst::Caps>,
    ) -> gst::Caps {
        let other_caps = if direction == gst::PadDirection::Src {
            let mut caps = caps.clone();

            for s in caps.make_mut().iter_mut() {
                s.set("format", &gst_video::VideoFormat::Bgrx.to_string());
            }

            caps
        } else {
            let mut gray_caps = gst::Caps::new_empty();

            {
                let gray_caps = gray_caps.get_mut().unwrap();

                for s in caps.iter() {
                    let mut s_gray = s.to_owned();
                    s_gray.set("format", &gst_video::VideoFormat::Gray8.to_string());
                    gray_caps.append_structure(s_gray);
                }
                gray_caps.append(caps.clone());
            }

            gray_caps
        };

        gst_debug!(
            self.cat,
            obj: element,
            "Transformed caps from {} to {} in direction {:?}",
            caps,
            other_caps,
            direction
        );

        if let Some(filter) = filter {
            filter.intersect_with_mode(&other_caps, gst::CapsIntersectMode::First)
        } else {
            other_caps
        }
    }
}

This caps conversion happens in 3 steps. First we check if we got caps for the source pad. In that case, the caps on the other pad (the sink pad) are going to be exactly the same caps, but no matter whether the caps contained BGRx or GRAY8 they must become BGRx, as that’s the only format our sink pad can accept. We do this by creating a clone of the input caps, then making sure that those caps are actually writable (i.e. we’re holding the only reference to them, or a copy is going to be created), and then iterating over all the structures inside the caps and setting their “format” field to BGRx. After this, all structures in the new caps will have the format field set to BGRx.

Similarly, if we get caps for the sink pad and are supposed to convert it to caps for the source pad, we create new caps and in there append a copy of each structure of the input caps (which are BGRx) with the format field set to GRAY8. In the end we append the original caps, giving us first all caps as GRAY8 and then the same caps as BGRx. With this ordering we signal to GStreamer that we would prefer to output GRAY8 over BGRx.

In the end the caps we created for the other pad are filtered against optional filter caps to reduce the potential size of the caps. This is done by intersecting the caps with that filter, while keeping the order (and thus preferences) of the filter caps (gst::CapsIntersectMode::First).

Conversion of BGRx Video Frames to Grayscale

Now that all the caps handling is implemented, we can finally get to the implementation of the actual video frame conversion. For this we start with defining a helper function bgrx_to_gray that converts one BGRx pixel to a grayscale value. The BGRx pixel is passed as a &[u8] slice with 4 elements and the function returns another u8 for the grayscale value.

impl Rgb2Gray {
    #[inline]
    fn bgrx_to_gray(in_p: &[u8]) -> u8 {
        // See https://en.wikipedia.org/wiki/YUV#SDTV_with_BT.601
        const R_Y: u32 = 19595; // 0.299 * 65536
        const G_Y: u32 = 38470; // 0.587 * 65536
        const B_Y: u32 = 7471; // 0.114 * 65536

        assert_eq!(in_p.len(), 4);

        let b = u32::from(in_p[0]);
        let g = u32::from(in_p[1]);
        let r = u32::from(in_p[2]);

        let gray = ((r * R_Y) + (g * G_Y) + (b * B_Y)) / 65536;

        (gray as u8)
    }
}

This function works by extracting the blue, green and red components from each pixel (remember: we work on BGRx, so the first value will be blue, the second green, the third red and the fourth unused), extending them from 8 to 32 bits for a wider value-range and then converting them to the Y component of the YUV colorspace (basically what your grandparents’ black & white TV would’ve displayed). The coefficients come from the Wikipedia page about YUV and are normalized to unsigned 16 bit integers so we can keep some accuracy, don’t have to work with floating point arithmetic and stay inside the range of 32 bit integers for all our calculations. As you can see, the green component is weighted more than the others, which comes from our eyes being more sensitive to green than to other colors.
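To get a feel for the numbers, here is a quick sanity check of the arithmetic (these asserts are mine and not part of the original code): a fully saturated green pixel and a white pixel should map to 149 and 255 respectively.

// (255 * 38470) / 65536 = 149 for pure green, (255 * 65536) / 65536 = 255 for white
assert_eq!(Rgb2Gray::bgrx_to_gray(&[0, 255, 0, 0]), 149);
assert_eq!(Rgb2Gray::bgrx_to_gray(&[255, 255, 255, 0]), 255);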

Note: This is only doing the actual conversion from linear RGB to grayscale (and in BT.601 colorspace). To do this conversion correctly you need to know your colorspaces and use the correct coefficients for conversion, and also do gamma correction. See this about why it is important.

Afterwards we have to actually call this function on every pixel. For this the transform virtual method is implemented, which gets an input and an output buffer passed, and we’re supposed to read the input buffer and fill the output buffer. The implementation looks as follows and is going to be our biggest function for this element

impl BaseTransformImpl<BaseTransform> for Rgb2Gray {
    fn transform(
        &self,
        element: &BaseTransform,
        inbuf: &gst::Buffer,
        outbuf: &mut gst::BufferRef,
    ) -> gst::FlowReturn {
        let mut state_guard = self.state.lock().unwrap();
        let state = match *state_guard {
            None => {
                gst_element_error!(element, gst::CoreError::Negotiation, ["Have no state yet"]);
                return gst::FlowReturn::NotNegotiated;
            }
            Some(ref mut state) => state,
        };

        let in_frame = match gst_video::VideoFrameRef::from_buffer_ref_readable(
            inbuf.as_ref(),
            &state.in_info,
        ) {
            None => {
                gst_element_error!(
                    element,
                    gst::CoreError::Failed,
                    ["Failed to map input buffer readable"]
                );
                return gst::FlowReturn::Error;
            }
            Some(in_frame) => in_frame,
        };

        let mut out_frame =
            match gst_video::VideoFrameRef::from_buffer_ref_writable(outbuf, &state.out_info) {
                None => {
                    gst_element_error!(
                        element,
                        gst::CoreError::Failed,
                        ["Failed to map output buffer writable"]
                    );
                    return gst::FlowReturn::Error;
                }
                Some(out_frame) => out_frame,
            };

        let width = in_frame.width() as usize;
        let in_stride = in_frame.plane_stride()[0] as usize;
        let in_data = in_frame.plane_data(0).unwrap();
        let out_stride = out_frame.plane_stride()[0] as usize;
        let out_format = out_frame.format();
        let out_data = out_frame.plane_data_mut(0).unwrap();

        if out_format == gst_video::VideoFormat::Bgrx {
            assert_eq!(in_data.len() % 4, 0);
            assert_eq!(out_data.len() % 4, 0);
            assert_eq!(out_data.len() / out_stride, in_data.len() / in_stride);

            let in_line_bytes = width * 4;
            let out_line_bytes = width * 4;

            assert!(in_line_bytes <= in_stride);
            assert!(out_line_bytes <= out_stride);

            for (in_line, out_line) in in_data
                .chunks(in_stride)
                .zip(out_data.chunks_mut(out_stride))
            {
                for (in_p, out_p) in in_line[..in_line_bytes]
                    .chunks(4)
                    .zip(out_line[..out_line_bytes].chunks_mut(4))
                {
                    assert_eq!(out_p.len(), 4);

                    let gray = Rgb2Gray::bgrx_to_gray(in_p);
                    out_p[0] = gray;
                    out_p[1] = gray;
                    out_p[2] = gray;
                }
            }
        } else if out_format == gst_video::VideoFormat::Gray8 {
            assert_eq!(in_data.len() % 4, 0);
            assert_eq!(out_data.len() / out_stride, in_data.len() / in_stride);

            let in_line_bytes = width * 4;
            let out_line_bytes = width;

            assert!(in_line_bytes <= in_stride);
            assert!(out_line_bytes <= out_stride);

            for (in_line, out_line) in in_data
                .chunks(in_stride)
                .zip(out_data.chunks_mut(out_stride))
            {
                for (in_p, out_p) in in_line[..in_line_bytes]
                    .chunks(4)
                    .zip(out_line[..out_line_bytes].iter_mut())
                {
                    let gray = Rgb2Gray::bgrx_to_gray(in_p);
                    *out_p = gray;
                }
            }
        } else {
            unimplemented!();
        }

        gst::FlowReturn::Ok
    }
}

What happens here is that we first of all lock our state (the input/output VideoInfo) and error out if we don’t have any yet (which can’t really happen unless other elements have a bug, but better safe than sorry). After that we map the input buffer readable and the output buffer writable with the VideoFrameRef API. By mapping the buffers we get access to the underlying bytes of them, and the mapping operation could for example make GPU memory available or just do nothing and give us access to a normally allocated memory area. We have access to the bytes of the buffer until the VideoFrameRef goes out of scope.

Instead of VideoFrameRef we could’ve also used the gst::Buffer::map_readable() and gst::Buffer::map_writable() API, but unlike those, the VideoFrameRef API also extracts various metadata from the raw video buffers and makes it available. For example we can directly access the different planes as slices without having to calculate the offsets ourselves, and we get direct access to the width and height of the video frame.

After mapping the buffers, we store various pieces of information we’re going to need later in local variables to save some typing. These are the width (same for input and output as we never change the width in transform_caps), the input and output (row-) stride (the number of bytes per row/line, which possibly includes some padding at the end of each line for alignment reasons), the output format (which can be BGRx or GRAY8 because of how we implemented transform_caps) and the pointers to the first plane of the input and output (which in this case is also the only plane; BGRx and GRAY8 both have only a single plane containing all the RGB/gray components).

Then based on whether the output is BGRx or GRAY8, we iterate over all pixels. The code is basically the same in both cases, so I’m only going to explain the case where BGRx is output.

We start by iterating over each line of the input and output, and do so by using the chunks iterator to give us chunks of as many bytes as the (row-) stride of the video frame, doing the same for the other frame and then zipping both iterators together. This means that on each iteration we get exactly one line as a slice from each of the frames and can then start accessing the actual pixels in each line.

To access the individual pixels in each line, we again use the chunks iterator the same way, but this time to always give us chunks of 4 bytes from each line. As BGRx uses 4 bytes for each pixel, this gives us exactly one pixel. Instead of iterating over the whole line, we only take the actual sub-slice that contains the pixels, not the whole line with stride number of bytes containing potential padding at the end. Now for each of these pixels we call our previously defined bgrx_to_gray function and then fill the B, G and R components of the output buffer with that value to get grayscale output. And that’s all.

Using Rust high-level abstractions like the chunks iterators and bounds-checking slice accesses might seem like it’s going to cause quite some performance penalty, but if you look at the generated assembly most of the bounds checks are completely optimized away and the resulting assembly code is close to what one would’ve written manually (especially when using the newly-added exact_chunks iterators). Here you’re getting safe and high-level looking code with low-level performance!
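For illustration, the inner BGRx loop could be written with the exact-size chunk iterators like this (a sketch using the names under which they were later stabilized, chunks_exact/chunks_exact_mut; the then-new API was called exact_chunks):

// The iterator now guarantees chunks of exactly 4 bytes, so the per-pixel
// length checks can be dropped by the compiler.
for (in_p, out_p) in in_line[..in_line_bytes]
    .chunks_exact(4)
    .zip(out_line[..out_line_bytes].chunks_exact_mut(4))
{
    let gray = Rgb2Gray::bgrx_to_gray(in_p);
    out_p[0] = gray;
    out_p[1] = gray;
    out_p[2] = gray;
}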

You might’ve also noticed the various assertions in the processing function. These are there to give further hints to the compiler about properties of the code, so that it can potentially optimize the code better and move e.g. bounds checks out of the inner loop, with the assertion outside the loop checking the same condition once. In Rust adding assertions can often improve performance by allowing further optimizations to be applied, but in the end always check the resulting assembly to see if what you did made any difference.

Testing the new element

Now we have implemented almost all the functionality of our new element and can run it on actual video data. This can be done with the gst-launch-1.0 tool, or any application using GStreamer that allows us to insert our new element somewhere in the video part of the pipeline. With gst-launch-1.0 you could run for example the following pipelines

# Run on a test pattern
gst-launch-1.0 videotestsrc ! rsrgb2gray ! videoconvert ! autovideosink

# Run on some video file, also playing the audio
gst-launch-1.0 playbin uri=file:///path/to/some/file video-filter=rsrgb2gray

Note that you will likely want to compile with cargo build --release and add the target/release directory to GST_PLUGIN_PATH instead. The debug build might be too slow, and generally the release builds are multiple orders of magnitude (!) faster.
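Concretely, assuming you are in the top-level directory of the crate, something along these lines should do (the exact GST_PLUGIN_PATH handling depends on your setup):

cargo build --release
export GST_PLUGIN_PATH=$(pwd)/target/release
gst-launch-1.0 videotestsrc ! rsrgb2gray ! videoconvert ! autovideosink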

Properties

The only features missing now are the properties I mentioned in the opening paragraph: one boolean property to invert the grayscale value and one integer property to shift the value by up to 255. Implementing this on top of the previous code is not a lot of work. Let’s start with defining a struct for holding the property values and defining the property metadata.

const DEFAULT_INVERT: bool = false;
const DEFAULT_SHIFT: u32 = 0;

#[derive(Debug, Clone, Copy)]
struct Settings {
    invert: bool,
    shift: u32,
}

impl Default for Settings {
    fn default() -> Self {
        Settings {
            invert: DEFAULT_INVERT,
            shift: DEFAULT_SHIFT,
        }
    }
}

static PROPERTIES: [Property; 2] = [
    Property::Boolean(
        "invert",
        "Invert",
        "Invert grayscale output",
        DEFAULT_INVERT,
        PropertyMutability::ReadWrite,
    ),
    Property::UInt(
        "shift",
        "Shift",
        "Shift grayscale output (wrapping around)",
        (0, 255),
        DEFAULT_SHIFT,
        PropertyMutability::ReadWrite,
    ),
];

struct Rgb2Gray {
    cat: gst::DebugCategory,
    settings: Mutex<Settings>,
    state: Mutex<Option<State>>,
}

impl Rgb2Gray {
    fn new(_transform: &BaseTransform) -> Box<BaseTransformImpl<BaseTransform>> {
        Box::new(Self {
            cat: gst::DebugCategory::new(
                "rsrgb2gray",
                gst::DebugColorFlags::empty(),
                "Rust RGB-GRAY converter",
            ),
            settings: Mutex::new(Default::default()),
            state: Mutex::new(None),
        })
    }
}

This should all be rather straightforward: we define a Settings struct that stores the two values, implement the Default trait for it, then define a two-element array with property metadata (names, description, ranges, default value, writability), and then store the default value of our Settings struct inside another Mutex inside the element struct.

In the next step we have to make use of these: we need to tell the GObject type system about the properties, and we need to implement functions that are called whenever a property value is set or get.

impl Rgb2Gray {
    fn class_init(klass: &mut BaseTransformClass) {
        [...]
        klass.install_properties(&PROPERTIES);
        [...]
    }
}

impl ObjectImpl<BaseTransform> for Rgb2Gray {
    fn set_property(&self, obj: &glib::Object, id: u32, value: &glib::Value) {
        let prop = &PROPERTIES[id as usize];
        let element = obj.clone().downcast::<BaseTransform>().unwrap();

        match *prop {
            Property::Boolean("invert", ..) => {
                let mut settings = self.settings.lock().unwrap();
                let invert = value.get().unwrap();
                gst_info!(
                    self.cat,
                    obj: &element,
                    "Changing invert from {} to {}",
                    settings.invert,
                    invert
                );
                settings.invert = invert;
            }
            Property::UInt("shift", ..) => {
                let mut settings = self.settings.lock().unwrap();
                let shift = value.get().unwrap();
                gst_info!(
                    self.cat,
                    obj: &element,
                    "Changing shift from {} to {}",
                    settings.shift,
                    shift
                );
                settings.shift = shift;
            }
            _ => unimplemented!(),
        }
    }

    fn get_property(&self, _obj: &glib::Object, id: u32) -> Result<glib::Value, ()> {
        let prop = &PROPERTIES[id as usize];

        match *prop {
            Property::Boolean("invert", ..) => {
                let settings = self.settings.lock().unwrap();
                Ok(settings.invert.to_value())
            }
            Property::UInt("shift", ..) => {
                let settings = self.settings.lock().unwrap();
                Ok(settings.shift.to_value())
            }
            _ => unimplemented!(),
        }
    }
}

Property values can be changed from any thread at any time; that’s why the Mutex is needed here to protect our struct. And we’re using a separate mutex to be able to have it locked only for the shortest possible amount of time: we don’t want to keep it locked for the whole duration of the transform function, otherwise applications trying to set/get values would block for up to one frame.

In the property setter/getter functions we are working with a glib::Value. This is a dynamically typed value type that can contain values of any type, together with the type information of the contained value. Here we’re using it to handle an unsigned integer (u32) and a boolean for our two properties. To know which property is currently being set or read, we get passed an identifier which is the index into our PROPERTIES array. We then simply match on the name of that entry to decide which property was meant.

With this implemented, we can already compile everything, see the properties and their metadata in gst-inspect-1.0 and can also set them on gst-launch-1.0 like this

# Set invert to true and shift to 128
gst-launch-1.0 videotestsrc ! rsrgb2gray invert=true shift=128 ! videoconvert ! autovideosink

If we set GST_DEBUG=rsrgb2gray:6 in the environment before running that, we can also see the corresponding debug output when the values are changing. The only thing missing now is to actually make use of the property values for the processing. For this we add the following changes to bgrx_to_gray and the transform function

impl Rgb2Gray {
    #[inline]
    fn bgrx_to_gray(in_p: &[u8], shift: u8, invert: bool) -> u8 {
        [...]

        let gray = ((r * R_Y) + (g * G_Y) + (b * B_Y)) / 65536;
        let gray = (gray as u8).wrapping_add(shift);

        if invert {
            255 - gray
        } else {
            gray
        }
    }
}

impl BaseTransformImpl<BaseTransform> for Rgb2Gray {
    fn transform(
        &self,
        element: &BaseTransform,
        inbuf: &gst::Buffer,
        outbuf: &mut gst::BufferRef,
    ) -> gst::FlowReturn {
        let settings = *self.settings.lock().unwrap();
        [...]
        let gray = Rgb2Gray::bgrx_to_gray(in_p, settings.shift as u8, settings.invert);
        [...]
    }
}

And that’s all. If you run the element in gst-launch-1.0 and change the values of the properties you should also see the corresponding changes in the video output.

Note that we always take a copy of the Settings struct at the beginning of the transform function. This ensures that we hold the mutex only for the shortest possible amount of time and then have a local snapshot of the settings for each frame.

Also keep in mind that the usage of the property values in the bgrx_to_gray function is far from optimal. It means the addition of another condition to the calculation of each pixel, thus potentially slowing it down a lot. Ideally this condition would be moved outside the inner loops and the bgrx_to_gray function would be made generic over that. See for example this blog post about “branchless Rust” for ideas on how to do that; the actual implementation is left as an exercise for the reader.

What next?

I hope the code walkthrough above was useful to understand how to implement GStreamer plugins and elements in Rust. If you have any questions, feel free to ask them here in the comments.

The same approach also works for audio filters or anything that can be handled in some way with the API of the BaseTransform base class. You can find another filter, an audio echo filter, using the same approach here.

In the next blog post in this series I’ll show how to use another base class to implement another kind of element, but for the time being you can also check the GIT repository for various other element implementations.

Federico Mena-Quintero: Librsvg gets Continuous Integration

Planet GNOME - Pre, 12/01/2018 - 9:04md

One nice thing about gitlab.gnome.org is that we can now have Continuous Integration (CI) enabled for projects there. After every commit, the CI machinery can build the project, run the tests, and tell you if something goes wrong.

Carlos Soriano posted a "tips of the week" mail to desktop-devel-list, and a link to how Nautilus implements CI in Gitlab. It turns out that it's reasonably easy to set up: you just create a .gitlab-ci.yml file in the toplevel of your project, and that has the configuration for what to run on every commit.

Of course instead of reading the manual, I copied-and-pasted the file from Nautilus and just changed some things in it. There is a .yml linter so you can at least check the syntax before pushing a full job.
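For orientation, a minimal .gitlab-ci.yml has roughly this shape (a generic sketch with a placeholder image and commands, not the actual file used by Nautilus or librsvg):

image: fedora:latest

build:
  stage: build
  script:
    - dnf install -y gcc make autoconf automake    # project build dependencies
    - ./autogen.sh
    - make
    - make check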

Then I read Robert Ancell's reply about how simple-scan builds its CI jobs on both Fedora and Ubuntu... and then the realization hit me:

This lets me CI librsvg on multiple distros at once. I've had trouble with slight differences in fontconfig/freetype in the past, and this would let me catch them early.

However, people on IRC advised against this, as we need more hardware to run CI on a large scale.

Linux distros have a vested interest in getting code out of gnome.org that works well. Surely they can give us some hardware?

Xubuntu: Xubuntu 17.10.1 Release

Planet Ubuntu - Pre, 12/01/2018 - 6:34md

Following the recent testing of a respin to deal with the BIOS bug on some Lenovo machines, Xubuntu 17.10.1 has been released. Official download sources have been updated to point to this point release, but if you’re using a mirror, be sure you are downloading the 17.10.1 version.

No changes to applications are included, however, this release does include any updates made between the original release date and now.

Note: Even with this fix, you will want to update your system to make sure you get all security fixes since the ISO respin, including the one for Meltdown, addressed in USN-3523, which you can read more about here.

Xubuntu: Xubuntu 17.04 End Of Life

Planet Ubuntu - Pre, 12/01/2018 - 3:40md

On Saturday 13th January 2018, Xubuntu 17.04 goes End of Life (EOL). For more information please see the Ubuntu 17.04 EOL Notice.

We strongly recommend upgrading to the current regular release, Xubuntu 17.10.1, as soon as practical. Alternatively you can download the current Xubuntu release and install fresh.

The 17.10.1 release recently saw testing across all flavors to address the BIOS bug found after its release in October 2017. Updated and bugfree ISO files are now available.

Raphaël Hertzog: Freexian’s report about Debian Long Term Support, December 2017

Planet Ubuntu - Pre, 12/01/2018 - 3:15md

Like each month, here comes a report about the work of paid contributors to Debian LTS.

Individual reports

In October, about 142 work hours have been dispatched among 12 paid contributors. Their reports are available:

Evolution of the situation

The number of sponsored hours did not change at 183 hours per month. It would be nice if we could continue to find new sponsors as the amount of work seems to be slowly growing too.

The security tracker currently lists 21 packages with a known CVE and the dla-needed.txt file 16 (we’re a bit behind in CVE triaging apparently). Both numbers show a significant drop compared to last month. Yet the number of DLA released was not larger than usual (30), instead it looks like December brought us fewer new security vulnerabilities to handle and at the same time we used this opportunity to handle lower priorities packages that were kept on the side for multiple months.

Thanks to our sponsors

New sponsors are in bold (none this month).


Valorie Zimmerman: Seeding new ISOs the easy zsync way

Planet Ubuntu - Pre, 12/01/2018 - 9:39pd
Kubuntu recently had to pull our 17.10 ISOs because of the so-called lenovo bug. Now that this bug is fixed, the ISOs have been respun, and so now it's time to begin to reseed the torrents.

To speed up the process, I wanted to zsync to the original ISOs before getting the new torrent files. Simon kindly told me the easy way to do this - cd to the directory where the ISOs live, which in my case is 

cd /media/valorie/Data/ISOs/

Next: 

cp kubuntu-17.10{,.1}-desktop-amd64.iso && zsync http://cdimage.ubuntu.com/kubuntu/releases/17.10.1/release/kubuntu-17.10.1-desktop-amd64.iso.zsync

Where did I get the link to zsync? At http://cdimage.ubuntu.com/kubuntu/releases/17.10.1/release/. All ISOs are found at cdimage, just as all torrents are found at http://torrent.ubuntu.com:6969/.

The final step is to download those torrent files (pro-tip: use control F) and tell Ktorrent to seed them all! I seed all the supported Ubuntu releases. The more people do this, the faster torrents are for everyone. If you have the bandwidth, go for it!

PS: you don't have to copy all the cdimage URLs. Just up-arrow and then back-arrow through your previous command once the sync has finished, edit it, hit return and you are back in business.

Lubuntu Blog: Lubuntu 17.10.1 (Artful Aardvark) released!

Planet Ubuntu - Pre, 12/01/2018 - 7:29pd
Lubuntu 17.10.1 has been released to fix a major problem affecting many Lenovo laptops that causes the computer to have BIOS problems after installing. You can find more details about this problem here. Please note that the Meltdown and Spectre vulnerabilities have not been fixed in this ISO, so we advise that if you install […]

Valorie Zimmerman: Beginning 2018

Planet Ubuntu - Dje, 07/01/2018 - 11:55md
2017 began with the once-in-a-lifetime trip to India to speak at KDE.Conf.in. That was amazing enough, but the trip to a local village, and visiting the Kaziranga National Park were too amazing for words.

Literal highlight of last year were the eclipse and trip to see it with my son Thomas, and Christian and Hailey's wedding, and the trip to participate with my daughter Anne, while also spending some time with son Paul, his wife Tara and my grandson Oscar. This summer I was able to spend a few days in Brooklyn with Colin and Rory as well on my way to Akademy. So 2017 was definitely worth living through!

This is reality, and we can only see it during a total eclipse
2018 began wonderfully at the cabin. I'm looking forward to 2018 for a lot of reasons.
First, I'm so happy that soon Kubuntu will again be distributing 17.10 images next week. Right now we're in testing in preparation for that; pop into IRC if you'd like to help with the testing (#kubuntu-devel). https://kubuntu.org/getkubuntu/ next week!

Lubuntu has a nice write-up of the issues and testing procedures: http://lubuntu.me/lubuntu-17-04-eol-and-lubuntu-17-10-respins/

The other serious problems with meltdown and spectre are being handled by the Ubuntu kernel team and those updates will be rolled out as soon as testing is complete. Scary times when dealing with such a fundamental flaw in the design of our computers!

Second, in KDE we're beginning to ramp up for Google Summer of Code. Mentors are preparing the ideas page on the wiki, and Bhushan has started the organization application process. If you want to mentor or help us administer the program this year, now is the time to get in gear!

At Renton PFLAG we had our first support meeting of the year, and it was small but awesome! Our little group has had some tough times in the past, but I see us growing and thriving in this next year.

Finally, my local genealogy society is doing some great things, and I'm so happy to be involved and helping out again. My own searching is going well too. As I find more supporting evidence to the lives of my ancestors and their families, I feel my own place in the cosmos more deeply and my connection to history more strongly. I wish I could link to our website, but Rootsweb is down and until we get our new website up......

Finally, today I saw a news article about a school in India far outside the traditional education model. Called the Tamarind Tree School, it uses an open education model to offer collaborative, innovative learning solutions to rural students. They use free and open source software, and even hardware so that people can build their own devices. Read more about this: https://opensource.com/article/18/1/tamarind-tree-school-india.

Eric Hammond: Streaming AWS DeepLens Video Over SSH

Planet Ubuntu - Sht, 30/12/2017 - 6:00pd

instead of connecting to the DeepLens with HDMI micro cable, monitor, keyboard, mouse

Credit for this excellent idea goes to Ernie Kim. Thank you!

Instructions without ssh

The standard AWS DeepLens instructions recommend connecting the device to a monitor, keyboard, and mouse. The instructions provide information on how to view the video streams in this mode:

If you are connected to the DeepLens using a monitor, you can view the unprocessed device stream (raw camera video before being processed by the model) using this command on the DeepLens device:

mplayer --demuxer /opt/awscam/out/ch1_out.h264

If you are connected to the DeepLens using a monitor, you can view the project stream (video after being processed by the model on the DeepLens) using this command on the DeepLens device:

mplayer --demuxer lavf -lavfdopts format=mjpeg:probesize=32 /tmp/ssd_results.mjpeg

Instructions with ssh

You can also view the DeepLens video streams over ssh, without having a monitor connected to the device. To make this possible, you need to enable ssh access on your DeepLens. This is available as a checkbox option in the initial setup of the device. I’m working to get instructions on how to enable ssh access afterwards and will update once this is available.

To view the video streams over ssh, we take the same mplayer command options above and the same source stream files, but send the stream over ssh, and feed the result to the stdin of an mplayer process running on the local system, presumably a laptop.

All of the following commands are run on your local laptop (not on the DeepLens device).

You need to know the IP address of your DeepLens device on your local network:

ip_address=[IP ADDRESS OF DeepLens]

You will need to install the mplayer software on your local laptop. This varies with your OS, but for Ubuntu:

sudo apt-get install mplayer

You can view the unprocessed device stream (raw camera video before being processed by the model) over ssh using the command:

ssh aws_cam@$ip_address cat /opt/awscam/out/ch1_out.h264 | mplayer --demuxer -

You can view the project stream (video after being processed by the model on the DeepLens) over ssh with the command:

ssh aws_cam@$ip_address cat /tmp/ssd_results.mjpeg | mplayer --demuxer lavf -lavfdopts format=mjpeg:probesize=32 -

Benefits of using ssh to view the video streams include:

  • You don’t need to have an extra monitor, keyboard, mouse, and micro-HDMI adapter cable.

  • You don’t need to locate the DeepLens close to a monitor, keyboard, mouse.

  • You don’t need to be physically close to the DeepLens when you are viewing the video streams.

For those of us sitting on a couch with a laptop, a DeepLens across the room, and no extra micro-HDMI cable, this is great news!

Bonus

To protect the security of your sensitive DeepLens video feeds:

  • Use a long, randomly generated password for ssh on your DeepLens, even if you are only using it inside a private network.

  • I would recommend setting up .ssh/authorized_keys on the DeepLens so you can ssh in with your personal ssh key, test it, then disable password access for ssh on the DeepLens device. Don’t forget the password, because it is still needed for sudo.

  • Enable automatic updates on your DeepLens so that Ubuntu security patches are applied quickly. This is available as an option in the initial setup, and should be possible to do afterwards using the standard Ubuntu unattended-upgrades package.

Unrelated side note: It’s kind of nice having the DeepLens run a standard Ubuntu LTS release. Excellent choice!

Original article and comments: https://alestic.com/2017/12/aws-deeplens-video-stream-ssh/

The VR Show

Planet Debian - Pre, 29/12/2017 - 5:39md

One of the things I had been looking forward to, had I got the visa on time for Debconf 15 (Germany), apart from the conference itself, was the attention on VR (Virtual Reality) and AR (Augmented Reality). I had heard the hype for so many years that I wanted to experience it, and I knew that Debianites, who might be a bit better at crystal-gazing, would perhaps have more of an idea than I had then. The only VR I knew about was from Hollywood movies and some VR videos, but that doesn’t tell you anything. And while movies like Chota-Chetan and others clicked, they were far less immersive than true VR has to be.

I was glad that it didn’t happen after the fact, as in 2016, while going to the South African Debconf, I experienced VR at Qatar Airport in a Samsung showroom. I was quite surprised at how heavy the headset was and also surprised by how little content they had. Something which has been hyped for 20 odd years had not much to show for it. I was also able to trick the VR equipment, as the eye/motion tracking was not good enough: if you shook your head fast enough it couldn’t keep up with you.

I share the above as I was invited to another VR conference by a web-programmer/designer friend, Mahendra, a couple of months ago here in Pune itself. We attended the conference and were shown quite a few success stories. One of the stories liked by the geek in me was framastore’s 360 Mars VR Experience on a bus: the link shows how the framastore developers mapped Mars, or part of Mars, onto Washington D.C. streets, and how kids were able to experience how it would feel to be on Mars without any of the risks the astronauts or pioneers would have to face if we do get the money, the equipment and the technology to send people to Mars. In reality we are still decades away from making such a trip while keeping people safe on the way to Mars and back, or letting them have Mars for the rest of their lives.

If my understanding is correct, the gravity of Mars is half of earth and once people settle there they or their exoskeleton would no longer be able to support Earth’s gravity, at least a generation who is born on Mars.

An interesting take on how things might turn out is shown in ‘The Expanse’.

But this is taking away from the topic at hand. While I saw the newer generation of VR headsets, they are still a fair bit away. It will get interesting once the headset becomes similar to eye-glasses and you no longer have to be tethered to a power unit or lug around a heavy backpack full of dangerous lithium-ion batteries. The battery chemistry, or some sort of self-powered unit, would need to be much safer and lighter.

While being in the conference and seeing the various scenarios being played out between potential developers and marketeers, it crossed my mind that people were not at all thinking of safe-guarding users privacy. Right from what games or choices you make to your biometric and other body sensitive information which has a high chance of being misused by companies and individuals.

There were also questions about how Sony and other developers are asking insane amounts for use of their SDKs to develop content, while it should be free, as games and any other content are going to enhance the marketability of their own ecosystems. Both the above questions (privacy and security, asked by me) and the SDK-related questions asked by some of the potential developers were not really answered.

At the end, they also showed AR, or Augmented Reality, which to my mind has much more potential to be used for reskilling and upskilling young populations such as India’s and those of other young, populous countries. It was interesting to note that both China and the U.S. are inching towards older demographics, while India will remain a relatively young country for another 20-30 odd years. Most of the other young countries (by median age) seem to be on the African continent, and my belief (which might be a myth) is that they are young because most of those countries are still tribal-like and there are perhaps still a lot of civil wars over resources.

I was underwhelmed by what they displayed in Augmented Reality, though part of that I do understand: there may be many people or companies working on their own IP who hence didn’t want to share, or only showed very rough work, so that their ideas don’t get stolen.

I was also hoping somebody would talk about motion sickness or motion displacement, similar to what people feel when they are train-lagged or jet-lagged. I am surprised that Wikipedia still doesn’t have an article on train-lag, as millions of Indians go through the process every year. The one most pronounced on Indian Railways is motion being felt but not seen.

There are both challenges and opportunities provided by VR and AR but until costs come down both in terms of complexity, support and costs (for both the deployer and the user) it would remain a distant dream.

There are scores of ideas that could be used or realized. For instance, the whole of North India is one big palace, in the sense that there are palaces built by kings and queens which have accumulated their own myth and lore over centuries. A story-teller could take a modern story and set it in, say, something like the Chota Imambara and/or Bara Imambara, where there have been lots of stories of people getting lost in the alleyways.

Such sort of lore, myths and mysteries are all over India. The Ramayana and the Mahabharata are just two of the epics which tell how grand the tales could be spun. The History of Indus Valley Civilization till date and the modern contestations to it are others which come to my mind.

Even the humble Panchtantra can be re-born and retold to generations who have forgotten it. I can’t express the variety of stories and contrasts on offer much better than bolokids does, or than SRK did in the opening of IFFI. Even something like Khakee, which is based on true incidents and a real-life inspector, could be retold in so many ways. Even Mukti Bhavan, which I saw a few months ago, coincidentally before I became ill, tells complex stories in which each person has their own rich background that could be explored much more in VR.

Even titles such as the ever-famous Harry Potter or even the ever-beguiling RAMA could be shared and retooled for generations to come. The Shiva Trilogy is another one which comes to my mind which could be retold as well. There was another RAMA trilogy by the same author and another competing one which comes out in 2018 by an author called PJ Annan

We would need to work out the complexities of the hardware, the bandwidth and the technologies, but there is plenty of content and there are plenty of stories waiting to be developed.

Once upon a time I had the opportunity to work on, develop and understand make-believe walk-throughs (2D blueprints animated/brought to life and shown to investors/clients) for potential home owners in a housing society (this was in the heydays of growth, circa y2k). It was a 2D or 2.5D environment, the tools were a lot more complex, and I was the most inept person there as I had no idea what camera positioning or the source of light meant.

Apart from the gimmickry that was shown, I thought it would have been interesting if people had shared both the creative and the budget constraints of working in immersive technologies while bringing something good enough for the client. There was some discussion, in a ham-handed way, but not enough: there was considerable interest from youngsters in trying this new medium, but many lacked the opportunities, the knowledge, the equipment and the software stack to make it a reality.

Lastly, as far as literature goes, I have shared just bits and pieces of Indian English literature. There are 16 recognized Indian languages and all of them have a vibrant literature scene. Just to take an example, Bengal has always been a bedrock of new Bengali detective stories. I think I had shared the history of Bengali crime fiction some time back as well, but nevertheless here it is again.

So apart from games and galleries, 3D interactive visual novels with alternative endings could make for some interesting immersive experiences, provided we are able to shed the costs and overcome the technical challenges to make it a reality.


shirishag75 https://flossexperiences.wordpress.com Experiences in the community

Compute rescaling progress

Planet Debian - Pre, 29/12/2017 - 2:18md

My Lanczos rescaling compute shader for Movit is finally nearing usable performance improvements:

BM_ResampleEffectInt8/Fragment/Int8Downscale/1280/720/640/360      3149 us   69.7767M pixels/s
BM_ResampleEffectInt8/Fragment/Int8Downscale/1280/720/320/180      2720 us   20.1983M pixels/s
BM_ResampleEffectHalf/Fragment/Float16Downscale/1280/720/640/360   3777 us   58.1711M pixels/s
BM_ResampleEffectHalf/Fragment/Float16Downscale/1280/720/320/180   3269 us   16.8054M pixels/s
BM_ResampleEffectInt8/Compute/Int8Downscale/1280/720/640/360       2007 us   109.479M pixels/s  [+ 56.9%]
BM_ResampleEffectInt8/Compute/Int8Downscale/1280/720/320/180       1609 us   34.1384M pixels/s  [+ 69.0%]
BM_ResampleEffectHalf/Compute/Float16Downscale/1280/720/640/360    2057 us   106.843M pixels/s  [+ 56.7%]
BM_ResampleEffectHalf/Compute/Float16Downscale/1280/720/320/180    1633 us   33.6394M pixels/s  [+100.2%]

Some tuning and bugfixing still needed; this is on my Haswell (the NVIDIA results are somewhat different). Upscaling also on its way. :-)

Steinar H. Gunderson http://blog.sesse.net/ Steinar H. Gunderson

Jackpot

Planet Debian - Pre, 29/12/2017 - 12:11md
I have no idea whatsover of how I achieved this, but there you go. This citizen's legal draft is moving forward to the Finnish parliament. Martin-Éric noreply@blogger.com Funkyware: ITCetera

Debian Policy call for participation -- December 2017

Planet Debian - Enj, 28/12/2017 - 11:47md

Yesterday we released Debian Policy 4.1.3.0, containing patches from numerous different contributors, some of them first-time contributors. Thank you to everyone who was involved!

Please consider getting involved in preparing the next release of Debian Policy, which is likely to be uploaded sometime around the end of January.

Consensus has been reached and help is needed to write a patch

#780725 PATH used for building is not specified

#793499 The Installed-Size algorithm is out-of-date

#823256 Update maintscript arguments with dpkg >= 1.18.5

#833401 virtual packages: dbus-session-bus, dbus-default-session-bus

#835451 Building as root should be discouraged

#838777 Policy 11.8.4 for x-window-manager needs update for freedesktop menus

#845715 Please document that packages are not allowed to write outside thei…

#853779 Clarify requirements about update-rc.d and invoke-rc.d usage in mai…

#874019 Note that the ’-e’ argument to x-terminal-emulator works like ’--’

#874206 allow a trailing comma in package relationship fields

Wording proposed, awaiting review from anyone and/or seconds by DDs

#515856 remove get-orig-source

#582109 document triggers where appropriate

#610083 Remove requirement to document upstream source location in debian/c…

#645696 [copyright-format] clearer definitions and more consistent License:…

#649530 [copyright-format] clearer definitions and more consistent License:…

#662998 stripping static libraries

#682347 mark ‘editor’ virtual package name as obsolete

#737796 copyright-format: support Files: paragraph with both abbreviated na…

#742364 Document debian/missing-sources

#756835 Extension of the syntax of the Packages-List field.

#786470 [copyright-format] Add an optional “License-Grant” field

#835451 Building as root should be discouraged

#845255 Include best practices for packaging database applications

#846970 Proposal for a Build-Indep-Architecture: control file field

#864615 please update version of posix standard for scripts (section 10.4)

Sean Whitton https://spwhitton.name//blog/ Notes from the Library

Get rid of the backpack

Planet Debian - Enj, 28/12/2017 - 11:43md

In 2008 I read a blog post by Mark Pilgrim which made a profound impact on me, although I didn't realise it at the time. It was

  1. Stop buying stuff you don’t need
  2. Pay off all your credit cards
  3. Get rid of all the stuff that doesn’t fit in your house/apartment (storage lockers, etc.)
  4. Get rid of all the stuff that doesn’t fit on the first floor of your house (attic, garage, etc.)
  5. Get rid of all the stuff that doesn’t fit in one room of your house
  6. Get rid of all the stuff that doesn’t fit in a suitcase
  7. Get rid of all the stuff that doesn’t fit in a backpack
  8. Get rid of the backpack

At the time I first read it, I think I could see (and concur) with the logic behind the first few points, but not further. Revisiting it now I can agree much further along the list and I'm wondering if I'm brave enough to get to the last step, or anywhere near it.

Mark was obviously going on a journey, and another stopping-off point for him on that journey was to delete his entire online persona, which is why I've linked to the Wayback Machine copy of the blog post.

jmtd http://jmtd.net/log/ Jonathan Dowland's Weblog

Successive Heresies

Planet Debian - Enj, 28/12/2017 - 2:37md

I prefer the book The Hobbit to The Lord Of The Rings.

I much prefer the Hobbit movies to the LOTR movies.

I like the fact the Hobbit movies were extended with material not in the original book: I'm glad there are female characters. I love the additional material with Radagast the Brown. I love the singing and poems and sense of fun preserved from what was a novel for children.

I find the foreshadowing of Sauron in The Hobbit movies to more effectively convey a sense of dread and power than actual Sauron in the LOTR movies.

Whilst I am generally bored by large CGI battles, I find the skirmishes in The Hobbit movies to be less boring than the epic-scale ones in LOTR.

jmtd http://jmtd.net/log/ Jonathan Dowland's Weblog

Reproducible Builds: Weekly report #139

Planet Debian - Enj, 28/12/2017 - 1:55md

Here's what happened in the Reproducible Builds effort between Sunday December 17 and Saturday December 23 2017:

Packages reviewed and fixed, and bugs filed

Bugs filed in Debian:

Bugs filed in openSUSE:

  • Bernhard M. Wiedemann:
    • WindowMaker (merged) - use modification date of ChangeLog, upstreamable
    • ntp (merged) - drop date
    • bzflag - version upgrade to include already-upstreamed SOURCE_DATE_EPOCH patch
Reviews of unreproducible packages

20 package reviews have been added, 36 have been updated and 32 have been removed in this week, adding to our knowledge about identified issues.

Weekly QA work

During our reproducibility testing, FTBFS bugs have been detected and reported by:

  • Adrian Bunk (6)
  • Matthias Klose (8)
diffoscope development

strip-nondeterminism development

disorderfs development

reprotest development

reproducible-website development
  • Chris Lamb:
    • rws3:
      • Huge number of formatting improvements, typo fixes, capitalisation
      • Add section headings to make splitting up easier.
  • Holger Levsen:
    • rws3:
      • Add a disclaimer that this part of the website is a Work-In-Progress.
      • Split notes from each session into separate pages (6 sessions).
      • Other formatting and style fixes.
      • Link to Ludovic Courtès' notes on GNU Guix.
  • Ximin Luo:
    • rws3:
      • Format agenda.md to look like previous years', and other fixes
      • Split notes from each session into separate pages (1 session).
jenkins.debian.net development

Misc.

This week's edition was written by Ximin Luo and Bernhard M. Wiedemann & reviewed by a bunch of Reproducible Builds folks on IRC & the mailing lists.

Reproducible builds folks https://reproducible.alioth.debian.org/blog/ Reproducible builds blog

(Micro)benchmarking Linux kernel functions

Planet Debian - Enj, 28/12/2017 - 10:27pd

Usually, the performance of a Linux subsystem is measured through an external (local or remote) process stressing it. Depending on the input point used, a large portion of code may be involved. To benchmark a single function, one solution is to write a kernel module.

Minimal kernel module

Let’s suppose we want to benchmark the IPv4 route lookup function, fib_lookup(). The following kernel function executes 1,000 lookups for 8.8.8.8 and returns the average value.1 It uses the get_cycles() function to compute the execution “time.”

/* Execute a benchmark on fib_lookup() and put result into the provided buffer `buf`. */
static int do_bench(char *buf)
{
    unsigned long long t1, t2;
    unsigned long long total = 0;
    unsigned long i;
    unsigned count = 1000;
    int err = 0;
    struct fib_result res;
    struct flowi4 fl4;

    memset(&fl4, 0, sizeof(fl4));
    fl4.daddr = in_aton("8.8.8.8");

    for (i = 0; i < count; i++) {
        t1 = get_cycles();
        err |= fib_lookup(&init_net, &fl4, &res, 0);
        t2 = get_cycles();
        total += t2 - t1;
    }
    if (err != 0)
        return scnprintf(buf, PAGE_SIZE, "err=%d msg=\"lookup error\"\n", err);
    return scnprintf(buf, PAGE_SIZE, "avg=%llu\n", total / count);
}

Now, we need to embed this function in a kernel module. The following code registers a sysfs directory containing a pseudo-file run. When a user queries this file, the module runs the benchmark function and returns the result as content.

#define pr_fmt(fmt) "kbench: " fmt

#include <linux/kernel.h>
#include <linux/version.h>
#include <linux/module.h>
#include <linux/inet.h>
#include <linux/timex.h>
#include <net/ip_fib.h>

/* When a user fetches the content of the "run" file, execute the benchmark function. */
static ssize_t run_show(struct kobject *kobj,
                        struct kobj_attribute *attr,
                        char *buf)
{
    return do_bench(buf);
}

static struct kobj_attribute run_attr = __ATTR_RO(run);
static struct attribute *bench_attributes[] = {
    &run_attr.attr,
    NULL
};
static struct attribute_group bench_attr_group = {
    .attrs = bench_attributes,
};
static struct kobject *bench_kobj;

int init_module(void)
{
    int rc;
    /* ❶ Create a simple kobject named "kbench" in /sys/kernel. */
    bench_kobj = kobject_create_and_add("kbench", kernel_kobj);
    if (!bench_kobj)
        return -ENOMEM;

    /* ❷ Create the files associated with this kobject. */
    rc = sysfs_create_group(bench_kobj, &bench_attr_group);
    if (rc) {
        kobject_put(bench_kobj);
        return rc;
    }
    return 0;
}

void cleanup_module(void)
{
    kobject_put(bench_kobj);
}

/* Metadata about this module */
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Microbenchmark for fib_lookup()");

In ❶, kobject_create_and_add() creates a new kobject named kbench. A kobject is the abstraction behind the sysfs filesystem. This new kobject is visible as the /sys/kernel/kbench/ directory.

In ❷, sysfs_create_group() attaches a set of attributes to our kobject. These attributes are materialized as files inside /sys/kernel/kbench/. Currently, we declare only one of them, run, with the __ATTR_RO macro. The attribute is therefore read-only (0444) and when a user tries to fetch the content of the file, the run_show() function is invoked with a buffer of PAGE_SIZE bytes as last argument and is expected to return the number of bytes written.

For more details, you can look at the documentation in the kernel and the associated example. Beware, random posts found on the web (including this one) may be outdated.2

The following Makefile will compile this example:

# Kernel module compilation
KDIR = /lib/modules/$(shell uname -r)/build

obj-m += kbench_mod.o

kbench_mod.ko: kbench_mod.c
	make -C $(KDIR) M=$(PWD) modules

After executing make, you should get a kbench_mod.ko file:

$ modinfo kbench_mod.ko
filename:       /home/bernat/code/…/kbench_mod.ko
description:    Microbenchmark for fib_lookup()
license:        GPL
depends:
name:           kbench_mod
vermagic:       4.14.0-1-amd64 SMP mod_unload modversions

You can load it and execute the benchmark:

$ insmod ./kbench_mod.ko
$ ls -l /sys/kernel/kbench/run
-r--r--r-- 1 root root 4096 déc. 10 16:05 /sys/kernel/kbench/run
$ cat /sys/kernel/kbench/run
avg=75

The result is a number of cycles. You can get an approximate time in nanoseconds if you divide it by the frequency of your processor in gigahertz (25 ns if you have a 3 GHz processor).3

Configurable parameters

The module hard-codes two constants: the number of loops and the destination address to test. We can make these parameters user-configurable by exposing them as attributes of our kobject and defining a pair of functions to read/write them:

static unsigned long loop_count = 5000;
static u32 flow_dst_ipaddr = 0x08080808;

/* A mutex is used to ensure we are thread-safe when altering attributes. */
static DEFINE_MUTEX(kb_lock);

/* Show the current value for loop count. */
static ssize_t loop_count_show(struct kobject *kobj,
                               struct kobj_attribute *attr,
                               char *buf)
{
    ssize_t res;
    mutex_lock(&kb_lock);
    res = scnprintf(buf, PAGE_SIZE, "%lu\n", loop_count);
    mutex_unlock(&kb_lock);
    return res;
}

/* Store a new value for loop count. */
static ssize_t loop_count_store(struct kobject *kobj,
                                struct kobj_attribute *attr,
                                const char *buf,
                                size_t count)
{
    unsigned long val;
    int err = kstrtoul(buf, 0, &val);
    if (err < 0)
        return err;
    if (val < 1)
        return -EINVAL;
    mutex_lock(&kb_lock);
    loop_count = val;
    mutex_unlock(&kb_lock);
    return count;
}

/* Show the current value for destination address. */
static ssize_t flow_dst_ipaddr_show(struct kobject *kobj,
                                    struct kobj_attribute *attr,
                                    char *buf)
{
    ssize_t res;
    mutex_lock(&kb_lock);
    res = scnprintf(buf, PAGE_SIZE, "%pI4\n", &flow_dst_ipaddr);
    mutex_unlock(&kb_lock);
    return res;
}

/* Store a new value for destination address. */
static ssize_t flow_dst_ipaddr_store(struct kobject *kobj,
                                     struct kobj_attribute *attr,
                                     const char *buf,
                                     size_t count)
{
    mutex_lock(&kb_lock);
    flow_dst_ipaddr = in_aton(buf);
    mutex_unlock(&kb_lock);
    return count;
}

/* Define the new set of attributes. They are read/write attributes. */
static struct kobj_attribute loop_count_attr = __ATTR_RW(loop_count);
static struct kobj_attribute flow_dst_ipaddr_attr = __ATTR_RW(flow_dst_ipaddr);
static struct kobj_attribute run_attr = __ATTR_RO(run);
static struct attribute *bench_attributes[] = {
    &loop_count_attr.attr,
    &flow_dst_ipaddr_attr.attr,
    &run_attr.attr,
    NULL
};

The IPv4 address is stored as a 32-bit integer but displayed and parsed using the dotted quad notation. The kernel provides the appropriate helpers for this task.

After this change, we have two new files in /sys/kernel/kbench. We can read the current values and modify them:

# cd /sys/kernel/kbench
# ls -l
-rw-r--r-- 1 root root 4096 déc. 10 19:10 flow_dst_ipaddr
-rw-r--r-- 1 root root 4096 déc. 10 19:10 loop_count
-r--r--r-- 1 root root 4096 déc. 10 19:10 run
# cat loop_count
5000
# cat flow_dst_ipaddr
8.8.8.8
# echo 9.9.9.9 > flow_dst_ipaddr
# cat flow_dst_ipaddr
9.9.9.9

We still need to alter the do_bench() function to make use of these parameters:

static int do_bench(char *buf)
{
    /* … */
    mutex_lock(&kb_lock);
    count = loop_count;
    fl4.daddr = flow_dst_ipaddr;
    mutex_unlock(&kb_lock);

    for (i = 0; i < count; i++) {
        /* … */

Meaningful statistics

Currently, we only compute the average lookup time. This value is usually inadequate:

  • A small number of outliers can raise this value quite significantly. An outlier can happen because we were preempted off the CPU while executing the benchmarked function. This doesn’t happen often if the function execution time is short (less than a millisecond), but when it happens, the outliers can be off by several milliseconds, which is enough to make the average inadequate when most values are several orders of magnitude smaller. For this reason, the median usually gives a better view.

  • The distribution may be asymmetrical or have several local maxima. It’s better to keep several percentiles or even a distribution graph.

To be able to extract meaningful statistics, we store the results in an array.

static int do_bench(char *buf)
{
    unsigned long long *results;
    /* … */

    results = kmalloc(sizeof(*results) * count, GFP_KERNEL);
    if (!results)
        return scnprintf(buf, PAGE_SIZE, "msg=\"no memory\"\n");

    for (i = 0; i < count; i++) {
        t1 = get_cycles();
        err |= fib_lookup(&init_net, &fl4, &res, 0);
        t2 = get_cycles();
        results[i] = t2 - t1;
    }

    if (err != 0) {
        kfree(results);
        return scnprintf(buf, PAGE_SIZE, "err=%d msg=\"lookup error\"\n", err);
    }

    /* Compute and display statistics */
    display_statistics(buf, results, count);
    kfree(results);
    return strnlen(buf, PAGE_SIZE);
}

Then, we need a helper function to be able to compute percentiles:

static unsigned long long percentile(int p,
                                     unsigned long long *sorted,
                                     unsigned count)
{
    int index = p * count / 100;
    int index2 = index + 1;
    if (p * count % 100 == 0)
        return sorted[index];
    if (index2 >= count)
        index2 = index - 1;
    if (index2 < 0)
        index2 = index;
    return (sorted[index] + sorted[index+1]) / 2;
}

This function needs a sorted array as input. The kernel provides a heapsort function, sort(), for this purpose. Another useful value to have is the deviation from the median. Here is a function to compute the median absolute deviation:4

static unsigned long long mad(unsigned long long *sorted,
                              unsigned long long median,
                              unsigned count)
{
    unsigned long long *dmedian = kmalloc(sizeof(unsigned long long) * count,
                                          GFP_KERNEL);
    unsigned long long res;
    unsigned i;

    if (!dmedian)
        return 0;
    for (i = 0; i < count; i++) {
        if (sorted[i] > median)
            dmedian[i] = sorted[i] - median;
        else
            dmedian[i] = median - sorted[i];
    }
    sort(dmedian, count, sizeof(unsigned long long), compare_ull, NULL);
    res = percentile(50, dmedian, count);
    kfree(dmedian);
    return res;
}
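
The compare_ull() comparator passed to sort() is not shown in the excerpts above. A minimal sketch, assuming it only needs to order unsigned long long values in ascending order, could look like this:

static int compare_ull(const void *a, const void *b)
{
    const unsigned long long *ua = a, *ub = b;

    /* Return <0, 0 or >0 so that sort() orders the values ascending.
       Subtracting directly could overflow the int return type, so compare
       explicitly instead. */
    if (*ua < *ub)
        return -1;
    if (*ua > *ub)
        return 1;
    return 0;
}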

With these two functions, we can provide additional statistics:

static void display_statistics(char *buf,
                               unsigned long long *results,
                               unsigned long count)
{
    unsigned long long p95, p90, p50;

    sort(results, count, sizeof(*results), compare_ull, NULL);
    if (count == 0) {
        scnprintf(buf, PAGE_SIZE, "msg=\"no match\"\n");
        return;
    }
    p95 = percentile(95, results, count);
    p90 = percentile(90, results, count);
    p50 = percentile(50, results, count);
    scnprintf(buf, PAGE_SIZE,
              "min=%llu max=%llu count=%lu 95th=%llu 90th=%llu 50th=%llu mad=%llu\n",
              results[0], results[count - 1], count,
              p95, p90, p50,
              mad(results, p50, count));
}

We can also append a graph of the distribution function (and of the cumulative distribution function):

min=72 max=33364 count=100000 95th=154 90th=142 50th=112 mad=6
 value │                                            ┊ count
    72 │                                                 51
    77 │▒                                              3548
    82 │▒▒░░                                           4773
    87 │▒▒░░░░░                                        5918
    92 │░░░░░░░                                        1207
    97 │░░░░░░░                                         437
   102 │▒▒▒▒▒▒░░░░░░░░                                12164
   107 │▒▒▒▒▒▒▒░░░░░░░░░░░░░░                         15508
   112 │▒▒▒▒▒▒▒▒▒▒▒░░░░░░░░░░░░░░░░░░░░░░             23014
   117 │▒▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░           6297
   122 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░            905
   127 │▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░         3845
   132 │▒▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░     6687
   137 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   4884
   142 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 4133
   147 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 1015
   152 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 1123
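
The code that renders this graph is not shown in the excerpts above. As a rough illustration only, a simplified helper could bucket the sorted results and print one bar per bucket. display_graph() below is a hypothetical name; it only draws the plain distribution (not the overlaid cumulative distribution) and uses coarser, evenly-sized buckets than the output above:

/* Hypothetical helper: append a crude text histogram of the (already sorted)
 * results to the sysfs buffer. Simplified sketch, not the code that produced
 * the graph above. */
static void display_graph(char *buf, unsigned long long *sorted,
                          unsigned long count)
{
    enum { BUCKETS = 16, BAR_WIDTH = 40 };
    unsigned long histogram[BUCKETS] = { 0 };
    unsigned long long min = sorted[0], max = sorted[count - 1];
    unsigned long long width = (max - min) / BUCKETS + 1;
    unsigned long biggest = 0;
    size_t len = strnlen(buf, PAGE_SIZE);
    unsigned i, j;

    /* Count how many results fall into each bucket. */
    for (i = 0; i < count; i++)
        histogram[(sorted[i] - min) / width]++;
    for (i = 0; i < BUCKETS; i++)
        if (histogram[i] > biggest)
            biggest = histogram[i];

    /* One line per bucket: lower bound, a bar scaled to BAR_WIDTH, count. */
    for (i = 0; i < BUCKETS; i++) {
        unsigned bar = biggest ? histogram[i] * BAR_WIDTH / biggest : 0;
        len += scnprintf(buf + len, PAGE_SIZE - len, "%6llu │",
                         min + i * width);
        for (j = 0; j < bar; j++)
            len += scnprintf(buf + len, PAGE_SIZE - len, "▒");
        len += scnprintf(buf + len, PAGE_SIZE - len, " %lu\n", histogram[i]);
    }
}

Such a helper would be called from do_bench() right after display_statistics(), still writing into the same PAGE_SIZE-limited sysfs buffer.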

Benchmark validity

While the benchmark produces some figures, we may question their validity. There are several traps when writing a microbenchmark:

dead code
The compiler may optimize away our benchmark because the result is not used. In our example, we make sure to accumulate the result into a variable (err |= fib_lookup(…)) to avoid this.
warmup phase
One-time initializations may skew the benchmark results. This is less likely to happen with C code since there is no JIT; nonetheless, you may want to add a small warmup phase.
too small dataset
If the benchmark runs with the same input parameters over and over, the input data may fit entirely in the L1 cache, which makes the results look better than they would be in practice. Therefore, it is important to iterate over a large dataset.
too regular dataset
A regular dataset may still inflate the results despite its size: while the whole dataset will not fit into the L1/L2 cache, the previous iteration may have loaded most of the data needed for the current one. In the route lookup example, as route entries are organized in a tree, it’s important not to scan the address space linearly. The address space can instead be explored randomly; a simple linear congruential generator brings reproducible randomness (see the sketch after this list).
large overhead
If the benchmarked function runs in a few nanoseconds, the overhead of the benchmark infrastructure may be too high. Typically, the overhead of the method presented here is around 5 nanoseconds. get_cycles() is a thin wrapper around the RDTSC instruction: it returns the number of cycles for the current processor since the last reset. It is also virtualized with low overhead in case you run the benchmark in a virtual machine. If you want to measure a function with greater precision, you need to wrap it in a loop. However, the loop itself adds to the overhead, notably if you need to compute a large input set (in this case, the input can be prepared beforehand). Compilers also like to mess with loops. Lastly, a loop hides the result distribution.
preemption
While the benchmark is running, the thread executing it can be preempted (or when running in a virtual machine, the whole virtual machine can be preempted by the host). When the function takes less than a millisecond to execute, one can assume preemption is rare enough to be filtered out by using a percentile function.
noise
When running the benchmark, noise from unrelated processes (or from sibling hosts when benchmarking in a virtual machine) needs to be avoided as it may change from one run to another. Therefore, it is not a good idea to benchmark in a public cloud. On the other hand, adding controlled noise to the benchmark may lead to less artificial results: in our example, route lookup is only a small part of routing a packet, and measuring it alone in a tight loop makes the results look better than they would be in practice.
syncing parallel benchmarks
While it is possible (and safe) to run several benchmarks in parallel, it may be difficult to ensure they really run in parallel: some invocations may run under better conditions because the other threads are not running yet, skewing the results. Ideally, each run should execute bogus iterations and start measuring only once all runs are present. This does not seem to be a trivial addition.
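
To illustrate the random exploration suggested for the dataset-related traps above, here is a minimal sketch of a linear congruential generator feeding the lookup loop. The lcg_state variable and lcg_next() helper are hypothetical additions for illustration; they are not part of the module shown earlier:

/* Minimal LCG (constants from Numerical Recipes): cheap, reproducible
 * pseudo-random destination addresses. lcg_state is a hypothetical seed. */
static u32 lcg_state = 1;

static u32 lcg_next(void)
{
    lcg_state = lcg_state * 1664525U + 1013904223U;
    return lcg_state;
}

/* Inside the measurement loop, instead of a fixed flow_dst_ipaddr: */
for (i = 0; i < count; i++) {
    fl4.daddr = htonl(lcg_next());  /* explore the address space randomly */
    t1 = get_cycles();
    err |= fib_lookup(&init_net, &fl4, &res, 0);
    t2 = get_cycles();
    results[i] = t2 - t1;
}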

As a conclusion, the benchmark module presented here is quite primitive (notably compared to a framework like JMH for Java) but, with care, can deliver some conclusive results like in these posts: “IPv4 route lookup on Linux” and “IPv6 route lookup on Linux.”

Alternative

Use of a tracing tool is an alternative approach. For example, if we want to benchmark IPv4 route lookup times, we can use the following process:

while true; do
  ip route get $((RANDOM%100)).$((RANDOM%100)).$((RANDOM%100)).5
  sleep 0.1
done

Then, we instrument the __fib_lookup() function with eBPF (through BCC):

$ sudo funclatency-bpfcc __fib_lookup
Tracing 1 functions for "__fib_lookup"... Hit Ctrl-C to end.
^C
     nsecs               : count     distribution
         0 -> 1          : 0        |                    |
         2 -> 3          : 0        |                    |
         4 -> 7          : 0        |                    |
         8 -> 15         : 0        |                    |
        16 -> 31         : 0        |                    |
        32 -> 63         : 0        |                    |
        64 -> 127        : 0        |                    |
       128 -> 255        : 0        |                    |
       256 -> 511        : 3        |*                   |
       512 -> 1023       : 1        |                    |
      1024 -> 2047       : 2        |*                   |
      2048 -> 4095       : 13       |******              |
      4096 -> 8191       : 42       |********************|

Currently, the overhead is quite high, as a route lookup on an empty routing table takes less than 100 ns. Once Linux supports inter-event tracing, the overhead of this solution may drop enough to make it usable for such microbenchmarks.

  1. In this simple case, it may be more accurate to use:

    t1 = get_cycles();
    for (i = 0; i < count; i++) {
        err |= fib_lookup(…);
    }
    t2 = get_cycles();
    total = t2 - t1;

    However, this prevents us from computing more detailed statistics. Moreover, when you need to provide a non-constant input to the fib_lookup() function, the first approach is likely to be more accurate. 

  2. In-kernel API backward compatibility is a non-goal of the Linux kernel. 

  3. You can get the current frequency with cpupower frequency-info. As the frequency may vary (even when using the performance governor), this may not be perfectly accurate, but it still provides a more intuitive representation (results should only be compared at the same frequency). 

  4. Only integer arithmetic is available in the kernel. While it is possible to approximate a standard deviation using only integers, the median absolute deviation just reuses the percentile() function defined above. 

Vincent Bernat https://vincent.bernat.im/en Vincent Bernat

Freezing of tasks failed

Planet Debian - Enj, 28/12/2017 - 7:33pd

It is interesting how a user-space task can hinder a Linux kernel software suspend operation.

[11735.155443] PM: suspend entry (deep)
[11735.155445] PM: Syncing filesystems ... done.
[11735.215091] [drm:wait_panel_status [i915]] *ERROR* PPS state mismatch
[11735.215172] [drm:wait_panel_status [i915]] *ERROR* PPS state mismatch
[11735.558676] rfkill: input handler enabled
[11735.608859] (NULL device *): firmware: direct-loading firmware rtlwifi/rtl8723befw_36.bin
[11735.609910] (NULL device *): firmware: direct-loading firmware rtl_bt/rtl8723b_fw.bin
[11735.611871] Freezing user space processes ...
[11755.615603] Freezing of tasks failed after 20.003 seconds (1 tasks refusing to freeze, wq_busy=0):
[11755.615854] digikam D 0 13262 13245 0x00000004
[11755.615859] Call Trace:
[11755.615873] __schedule+0x28e/0x880
[11755.615878] schedule+0x2c/0x80
[11755.615889] request_wait_answer+0xa3/0x220 [fuse]
[11755.615895] ? finish_wait+0x80/0x80
[11755.615902] __fuse_request_send+0x86/0x90 [fuse]
[11755.615907] fuse_request_send+0x27/0x30 [fuse]
[11755.615914] fuse_send_readpages.isra.30+0xd1/0x120 [fuse]
[11755.615920] fuse_readpages+0xfd/0x110 [fuse]
[11755.615928] __do_page_cache_readahead+0x200/0x2d0
[11755.615936] filemap_fault+0x37b/0x640
[11755.615940] ? filemap_fault+0x37b/0x640
[11755.615944] ? filemap_map_pages+0x179/0x320
[11755.615950] __do_fault+0x1e/0xb0
[11755.615953] __handle_mm_fault+0xc8a/0x1160
[11755.615958] handle_mm_fault+0xb1/0x200
[11755.615964] __do_page_fault+0x257/0x4d0
[11755.615968] do_page_fault+0x2e/0xd0
[11755.615973] page_fault+0x22/0x30
[11755.615976] RIP: 0033:0x7f32d3c7ff90
[11755.615978] RSP: 002b:00007ffd887c9d18 EFLAGS: 00010246
[11755.615981] RAX: 00007f32d3fc9c50 RBX: 000000000275e440 RCX: 0000000000000003
[11755.615982] RDX: 0000000000000002 RSI: 00007ffd887c9f10 RDI: 000000000275e440
[11755.615984] RBP: 00007ffd887c9f10 R08: 000000000275e820 R09: 00000000018d2f40
[11755.615986] R10: 0000000000000002 R11: 0000000000000000 R12: 000000000189cbc0
[11755.615987] R13: 0000000001839dc0 R14: 000000000275e440 R15: 0000000000000000
[11755.616014] OOM killer enabled.
[11755.616015] Restarting tasks ... done.
[11755.817640] PM: suspend exit
[11755.817698] PM: suspend entry (s2idle)
[11755.817700] PM: Syncing filesystems ... done.
[11755.983156] rfkill: input handler disabled
[11756.030209] rfkill: input handler enabled
[11756.073529] Freezing user space processes ...
[11776.084309] Freezing of tasks failed after 20.010 seconds (2 tasks refusing to freeze, wq_busy=0):
[11776.084630] digikam D 0 13262 13245 0x00000004
[11776.084636] Call Trace:
[11776.084653] __schedule+0x28e/0x880
[11776.084659] schedule+0x2c/0x80
[11776.084672] request_wait_answer+0xa3/0x220 [fuse]
[11776.084680] ? finish_wait+0x80/0x80
[11776.084688] __fuse_request_send+0x86/0x90 [fuse]
[11776.084695] fuse_request_send+0x27/0x30 [fuse]
[11776.084703] fuse_send_readpages.isra.30+0xd1/0x120 [fuse]
[11776.084711] fuse_readpages+0xfd/0x110 [fuse]
[11776.084721] __do_page_cache_readahead+0x200/0x2d0
[11776.084730] filemap_fault+0x37b/0x640
[11776.084735] ? filemap_fault+0x37b/0x640
[11776.084743] ? __update_load_avg_blocked_se.isra.33+0xa1/0xf0
[11776.084749] ? filemap_map_pages+0x179/0x320
[11776.084755] __do_fault+0x1e/0xb0
[11776.084759] __handle_mm_fault+0xc8a/0x1160
[11776.084765] handle_mm_fault+0xb1/0x200
[11776.084772] __do_page_fault+0x257/0x4d0
[11776.084777] do_page_fault+0x2e/0xd0
[11776.084783] page_fault+0x22/0x30
[11776.084787] RIP: 0033:0x7f31ddf315e0
[11776.084789] RSP: 002b:00007ffd887ca068 EFLAGS: 00010202
[11776.084793] RAX: 00007f31de13c350 RBX: 00000000040be3f0 RCX: 000000000283da60
[11776.084795] RDX: 0000000000000001 RSI: 00000000040be3f0 RDI: 00000000040be3f0
[11776.084797] RBP: 00007f32d3fca1e0 R08: 0000000005679250 R09: 0000000000000020
[11776.084799] R10: 00000000058fc1b0 R11: 0000000004b9ac50 R12: 0000000000000000
[11776.084801] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
[11776.084806] QXcbEventReader D 0 13268 13245 0x00000004
[11776.084810] Call Trace:
[11776.084817] __schedule+0x28e/0x880
[11776.084823] schedule+0x2c/0x80
[11776.084827] rwsem_down_write_failed_killable+0x25a/0x490
[11776.084832] call_rwsem_down_write_failed_killable+0x17/0x30
[11776.084836] ? call_rwsem_down_write_failed_killable+0x17/0x30
[11776.084842] down_write_killable+0x2d/0x50
[11776.084848] do_mprotect_pkey+0xa9/0x2f0
[11776.084854] SyS_mprotect+0x13/0x20
[11776.084859] system_call_fast_compare_end+0xc/0x97
[11776.084861] RIP: 0033:0x7f32d1f7c057
[11776.084863] RSP: 002b:00007f32cbb8c8d8 EFLAGS: 00000206 ORIG_RAX: 000000000000000a
[11776.084867] RAX: ffffffffffffffda RBX: 00007f32c4000020 RCX: 00007f32d1f7c057
[11776.084869] RDX: 0000000000000003 RSI: 0000000000001000 RDI: 00007f32c4024000
[11776.084871] RBP: 00000000000000c5 R08: 00007f32c4000000 R09: 0000000000024000
[11776.084872] R10: 00007f32c4024000 R11: 0000000000000206 R12: 00000000000000a0
[11776.084874] R13: 00007f32c4022f60 R14: 0000000000001000 R15: 00000000000000e0
[11776.084906] OOM killer enabled.
[11776.084907] Restarting tasks ... done.
[11776.289655] PM: suspend exit
[11776.459624] IPv6: ADDRCONF(NETDEV_UP): wlp1s0: link is not ready
[11776.469521] rfkill: input handler disabled
[11776.978733] IPv6: ADDRCONF(NETDEV_UP): wlp1s0: link is not ready
[11777.038879] IPv6: ADDRCONF(NETDEV_UP): wlp1s0: link is not ready
[11778.022062] wlp1s0: authenticate with 50:8f:4c:82:4d:dd
[11778.033155] wlp1s0: send auth to 50:8f:4c:82:4d:dd (try 1/3)
[11778.038522] wlp1s0: authenticated
[11778.041511] wlp1s0: associate with 50:8f:4c:82:4d:dd (try 1/3)
[11778.059860] wlp1s0: RX AssocResp from 50:8f:4c:82:4d:dd (capab=0x431 status=0 aid=5)
[11778.060253] wlp1s0: associated
[11778.060308] IPv6: ADDRCONF(NETDEV_CHANGE): wlp1s0: link becomes ready
[11778.987669] [drm:wait_panel_status [i915]] *ERROR* PPS state mismatch
[11779.117608] [drm:wait_panel_status [i915]] *ERROR* PPS state mismatch
[11779.160930] [drm:wait_panel_status [i915]] *ERROR* PPS state mismatch
[11779.784045] [drm:wait_panel_status [i915]] *ERROR* PPS state mismatch
[11779.913668] [drm:wait_panel_status [i915]] *ERROR* PPS state mismatch
[11779.961517] [drm:wait_panel_status [i915]] *ERROR* PPS state mismatch

Ritesh Raj Sarraf https://www.researchut.com/taxonomy/term/2 RESEARCHUT - Debian-Blog

Testing Ansible Playbooks With Vagrant

Planet Debian - Enj, 28/12/2017 - 12:00pd

I use Ansible to automate the deployments of my websites (LinuxJobs.fr, Journal du hacker) and of my applications (Feed2toot, Feed2tweet). In this blog post, I’ll describe the setup I use to test my Ansible Playbooks locally on my laptop.

Why testing the Ansible Playbooks

I need a simple and fast way to test the deployment of my Ansible Playbooks locally on my laptop, especially at the beginning of writing a new Playbook, because deploying directly on the production server is both reeeeally slow… and risky for my services in production.

Instead of deploying on a remote server, I’ll deploy my Playbooks on a VirtualBox VM using Vagrant. This allows me to quickly see the result of a modification, and to iterate and fix as fast as possible.

Disclaimer: I am not a professional programmer. Better solutions may exist; I’m only describing one way of testing Ansible Playbooks that I find both easy and efficient for my own use cases.

My process
  1. Begin writing the new Ansible Playbook
  2. Launch a fresh virtual machine (VM) and deploy the playbook on this VM using Vagrant
  3. Fix the issues, either in the playbook or in the application deployed by Ansible itself
  4. Relaunch the deployment on the VM
  5. If errors remain, go back to step 3. Otherwise destroy the VM, recreate it and deploy again to test one last time with a fresh install
  6. If no error remains, tag the version of your Ansible Playbook and you’re ready to deploy in production
What you need

First, you need VirtualBox. If you use the Debian distribution, this link describes how to install it, either from the Debian repositories or from upstream.

Second, you need Vagrant. Why Vagrant? Because it’s a kind of middleware between your development environment and your virtual machine, allowing programmatically reproducible operations and an easy link between your deployments and the virtual machine. Install it with the following command:

# apt install vagrant

Setting up Vagrant

Everything about Vagrant lies in the file Vagrantfile. Here is mine:

Vagrant.require_version ">= 2.0.0"
Vagrant.configure("2") do |config|
  config.vm.box = "debian/stretch64"
  config.vm.provision "shell", inline: "apt install --yes git python3-pip"
  config.vm.provision "ansible" do |ansible|
    ansible.verbose = "v"
    ansible.playbook = "site.yml"
    ansible.vault_password_file = "vault_password_file"
  end
end

Debian, the best OS to operate your online services

  1. The 1st line defines what versions of Vagrant should execute your Vagrantfile.
  2. The configure block starts on the 2nd line; its argument ("2") is the Vagrant configuration version, and inside this block you can define the following operations for as many virtual machines as you wish (here just one).
  3. The 3rd line defines the official Vagrant image we’ll use for the virtual machine.
  4. The 4th line is really important: it installs the packages that are missing from the base image but needed on the VM. Here we install git and python3-pip with apt.
  5. The next line indicates the start of the Ansible configuration.
  6. On the 6th line, we want a verbose output of Ansible.
  7. On the 7th line, we define the entry point of your Ansible Playbook.
  8. On the 8th line, if you use Ansible Vault to encrypt some files, just define here the file with your Ansible Vault passphrase.

When Vagrant launches Ansible, it’s going to launch something like:

$ ansible-playbook --inventory-file=/home/me/ansible/test-ansible-playbook/.vagrant/provisioners/ansible/inventory -v --vault-password-file=vault_password_file site.yml

Executing Vagrant

After writing your Vagrantfile, you need to launch your VM. It’s as simple as using the following command:

$ vagrant up

That’s a slow operation, because the VM will be launched, the additionnal apps you defined in the Vagrantfile will be installed and finally your Playbook will be deployed on it. You should sparsely use it.

Ok, now we’re really ready to iterate fast. Between your different modifications, in order to test your deployments fast and on a regular basis, just use the following command:

$ vagrant provision

Once your Ansible Playbook is finally ready, usually after lots of iterations (at least that’s my case), you should test it on a fresh install, because your different iterations may have modified your virtual machine and could trigger unexpected results.

In order to test it from a fresh install, use the following command:

$ vagrant destroy && vagrant up

That’s again a slow operation. You should use it when you’re pretty sure your Ansible Playbook is almost finished. After testing your deployment on a fresh VM, you’re now ready to deploy in production.Or at least better prepared :p

Possible improvements? Let me know

I find the setup described in this blog post quite useful for my use cases. I can iterate quite fast especially when I begin writing a new playbook, not only on the playbook but sometimes on my own latest apps, not yet ready to be deployed in production. Deploying on a remote server would be both slow and dangerous for my services in production.

I could use a continuous integration (CI) server, but that’s not the topic of this blog post. As said before, the goal is to iterate as fast as possible at the beginning of writing a new Ansible Playbook.

Gitlab, offering Continuous Integration and Continuous Deployment services

Committing, pushing to your Git repository and waiting for your CI tests to run is overkill at the beginning of an Ansible Playbook, when it’s full of errors waiting to be debugged one by one. I think CI is more useful later in the life of a Playbook, especially when different people work on it and you have a set of code-quality rules to enforce. That’s only my opinion and it’s open to discussion; once more, I’m not a professional programmer.

If you have better solutions for testing Ansible Playbooks, or ways to improve the one described here, let me know by writing a comment or by contacting me through my accounts on the social networks below; I’ll be delighted to hear about your improvements.

About Me

Carl Chenet, Free Software Indie Hacker, Founder of LinuxJobs.fr, a job board for Free and Open Source Jobs in France.

Follow Me On Social Networks

 

Carl Chenet https://carlchenet.com debian – Carl Chenet's Blog
