More useful types in the linux kernel

August 5, 2016

Related Material::

  1. Simply typed lambda calculus
  2. Separation logic (which was the topic of this year's Godel Prize)
  3. sparse wiki

Additional Participants: Alexey Dobriyan, Arnd Bergmann, Bird, Timothy, David Howells, David Woodhouse, Dmitry Torokhov, Eric W. Biederman, Guenter Roeck, Hannes Reinecke, James Bottomley, Jani Nikula, Jan Kara, Jiri Kosina, Josh Triplett, Julia Lawall, Linus Walleij, Mark Brown, NeilBrown, Rob Herring, Steven Rostedt, and Vlastimil Babka.

People tagged: (none)

Eric W. Biederman suggested looking into complex types, including simply typed lambda calculus and separation logic. Eric would like some sparse-like checking in gcc, checking for reference-count leaks, checking for unpaired locks and unlocks, and handling of doubly linked lists. James Bottomley suggested type-based extensions to sparse as an alternative to defining yet another programming language. Eric replied that he is not sure how to retrofit these extensions to C's type system, given that he wants to detect memory errors, reference-counting errors, and deadlocks within the type system.

Types and Rust

Josh Triplett called out several gcc bugzillas that added more type checking, but noted that ownership-tracking type systems get complex very quickly, noting that it took Rust numerous iterations to come up with something that worked. Josh also noted the relationship between high false-positive rate and slow adoption. Eric made it clear that he wants to go far beyond what Rust does, but that he expects to arrive at something that is more expressive and simpler than Rust's type system. He expects zero false-positive rate, but fully expects that explicit type conversions will be required. Eric later amplified his position, especially with respect to Rust. Josh Triplett asked for clarification of the amplification. Eric distinguished between Rust's data lifetimes and ownership of data by code, giving a reference-counting example. For example, if code owns data, then doubly linked lists to not pose challenges due to circular ownership. Eric said that he is working on a proof of concept. Josh asked what happened in case of copying or overwriting of pointers, especially in cases were this resulted in invalidation of other data. In that time-honored tradition of software developers, Eric responded with “It depends”.

Julia Lawall suggested running time-consuming tools on just changed files, rather than on the kernel as a whole. Jan Kara and Steven Rostedt apparently consider sparse to be time consuming, so pointed out that it runs as part of 0-day testing.

Mark Brown pointed out that it would be good to increase the scope from just gcc to include llvm. There was some debate about the merits of llvm, leading to its proposal as a separate Tech Talk.

Refined Linux-Kernel Types and Coccinelle

David Howells suggested refining some Linux-kernel types, including a bits and perhaps also a bits64 type, different types for BH and non-BH spinlocks, and increased use of the type bool. David also suggested an on_list() function with the same implementation as list_empty() but with different semantics. Dmitry Torokhov suggested use of DECLARE_BITMAP() as an alternative to bits and bits64, calling out its self-sizing properties. David Howells said that the issue was unsigned long, which often wastes 32 bits on 64-bit systems. Alexey Dobriyan agreed with David, but prefers the names flags_t and flags64_t, further suggesting that functions taking bits use __builtin_choose_expr or _Generic to dispatch to the appropriate implementation. David Howells was OK with Alexey's names, other than possible confusion with the irq-disable flags argument. Alexey suggested resurrecting irq_flags_t, but was concerned about the size of the patch and the time required to get it ready for upstream, much less upstream. This led to a debate about Coccinelle's capabilities, which Julia Lawall ended by pointing out that recent versions of Coccinelle do have the needed functionality.

Dmitry Torokhov suggested u8, u16, and u32 (along with the BIT() macro) for small bigmaps and DECLARE_BITMAP() for larger ones. Guenter Roeck noted challenges wtih DECLARE_BITMAP() and static initialization.

Steven Rostedt disputed the usefulness of typing to distinguish between BH and non-BH locks, noting that it is perfectly legal to use a non-BH locking primitives on a BH lock while in a BH-disable region of code. David Howells agreed.

Jani Nikula agreed with use of bool in principle, but noted that in practice gcc sometimes generates worse code. David Woodhouse asked why no gcc bug had been filed, and Jani then requested that David file the bug.

Range Checking (And Some Coccinelle)

Hannes Reinecke would like stricter range checking for kernel functions, including use of enums and error values. Hannes would also like to use this information to help with error injection via systemtap. Julia Lawall gave an example where gcc allows mixing of different enums (and noted difficulties presented by function pointers). Hannes believes that gcc only checks enums in switch statements. Arnd Bergmann notes that passing an enum where a function expects an int and vice versa is a common bug, and that checking might be added to gcc as a default option. NeilBrown noted that sparse gives warnings on mixing enum types that are marked with __attribute((bitwise)).

Hannes also calls out the possibilty of specifying exactly which errnos a given function is permitted to return. Julia suspects that such checks might be possible. David Woodhouse is concerned that the list of permitted errnos would always be out of date, noting that accepting any errno returned is good defensive programming. He suggested IS_ERR_VALUE() as a good way to check that the integer in question might really be an errno. Hannes said that he is concerned about range collisions, where the same variable takes the return value from two functions that have different semantics for the returned value. He suggests that range restrictions, such as 0..4096, could make life easier. Alexey Dobriyan argued that even this simple restriction would require rewriting the kernel in a language with a real type system, and typed exceptions as an alternative.

Julia suggested using a script to keep lists of errno values up to date, but James Bottomley expressed concern that the work required would still overshadow the benefits, holding out the section-mismatch experience as an example. David Woodhouse thought that such scripts might be useful to check for overlaps between errno returns and valid returns, but otherwise agreed with James. Tim Bird suggested encoding line numbers into errno values to more easily determine where a given errno was generated. Rob Herring suggested use of tracepoints to identify where a given errno was generated.

Cross Compilation

David Howells raised the issue of cross-compilation, which concerns Linus Walleij as well. Packaging for cross-compilers is also a challenge.