Kees Cook: API replacement/deprecation

September 8, 2018

Related Material: None found, maybe a new topic! ;–)

Additional Participants: Alexandre Belloni, Arnd Bergmann, Dan Carpenter, Geert Uytterhoeven, Jani Nikula, Jan Kara, Johannes Berg, Julia Lawall, Linus Walleij, Mark Brown, Maxime Ripard, Stephen Rothwell, Steven Rostedt, Takashi Iwai, and Theodore Y. Ts'o.

Kees would like to deprecate APIs more sanely, putting forward a “fast” method involving spending several releases creating a large patchbomb and a “slow” method incrementally removing uses of the old API. Kees notes that the patchbomb can take a long time to create, and that incremental removal might or might not keep up with the inevitable additions. Kees knows about checkpatch.pl, but notes that it does not always get run. Stephen Rothwell seconded Kees's concern, calling out pain in -next due to quickly removed APIs. Steven Rostedt suggested a kernel.org script or a zero-day bot that runs the latest checkpatch.pl, ensuring that it always does get run. Maxime Ripard seconded the use of checkpatch.pl. Julia Lawall noted that there are Coccinelle scripts that the DRM people use to flag deprecated functions, and that new functions can easily be added, and posted an example. [ Ed. note: Some overlap between checkpatch.pl and Coccinelle, which is probably a good thing. ] She also suggested taking heart in increasing use of a new function.

Jani Nikula suggested re-introducing __deprecated, but making it opt-in, so that only builds enabling it (via preprocessor magic or gcc command-line tricks) will see the warnings. CI-like builds (0day, -next, etc.) could then enable these warnings. Jani called out the virtues of flagging the function definitions themselves, rather than a centralized [ and perhaps conflict-prone ] file. Julia, Alexandre Belloni, and Mark Brown pointed out that people adding deprecated functions might be committing copy-pasta, and thus would not notice the deprecation. Jani replied that the semantic patch could be generated from the deprecation markings. Dan Carpenter suggested a simple standalone perl script to flag deprecation using a simple one-line-per-deprecated-function data file.
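Jani's opt-in idea might look roughly like the following sketch. It is not the kernel's actual __deprecated definition: the macro name opt_in_deprecated, the switch ENABLE_DEPRECATION_WARNINGS, and the functions old_api() and new_api() are all hypothetical, chosen only for illustration. It relies on the deprecated function attribute supported by gcc and clang.

```c
/*
 * Hypothetical opt-in switch: the attribute is attached only when the
 * build defines ENABLE_DEPRECATION_WARNINGS, for example by passing
 * -DENABLE_DEPRECATION_WARNINGS on the compiler command line.
 */
#ifdef ENABLE_DEPRECATION_WARNINGS
#define opt_in_deprecated __attribute__((deprecated))
#else
#define opt_in_deprecated /* expands to nothing in ordinary builds */
#endif

/* The marking sits on the function itself, not in a central list. */
opt_in_deprecated int old_api(int x);
int new_api(int x);

/* Preferred replacement (hypothetical). */
int new_api(int x)
{
	return x * 2;
}

/* Old interface, kept as a thin wrapper until its callers are converted. */
int old_api(int x)
{
	return new_api(x);
}
```

With this arrangement an ordinary build compiles callers of old_api() silently, while a CI-style build that defines the switch gets a compiler warning at each call site.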

Kees pointed out that even mandatory checks can be ignored by maintainers, and suggested an in-tree list of functions, which, if added, would be the responsibility of the author/maintainer to remove. Kees also noted the potential for CI, including adding CI into the patch-acceptance path on the one hand and adding new commits to the remit of the 0day test robot on the other. Stephen Rothwell volunteered to do this CI function as a part of -next if provided with a list of deprecated things. Kees immediately volunteered strcpy() and strncpy() (to be replaced by strscpy()) for Stephen's list. Takashi Iwai suggested a lightweight check script for the git-commit hook that all maintainers would install and run, but Johannes Berg threw his weight behind Coccinelle and the 0-day build bot. Julia cautioned that she suspects that this build bot reports only on modified code and even then only once. Geert Uytterhoeven called out git commit hooks, checkpatch.pl, and bot-run scripts as three classes of solutions in decreasing order of agility. Takashi likes the speed and convenience of running a script locally, but agrees that 0-day bot coverage would be a good thing.
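To make the strcpy()-to-strscpy() replacement concrete, here is a userspace sketch of a bounded copy in the style of strscpy(). It is not the kernel implementation: the name sketch_strscpy() is hypothetical, and the sketch assumes only the documented behavior that the destination is always NUL-terminated when the size is nonzero and that truncation is reported as -E2BIG.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/*
 * Copy src into dst, which holds size bytes. Returns the number of
 * characters copied (not counting the NUL), or -E2BIG if size is zero
 * or src had to be truncated. Hypothetical userspace stand-in.
 */
static ssize_t sketch_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -E2BIG;

	/* Leave room for the terminating NUL. */
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	/* If the source continues past what was copied, it was truncated. */
	return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}
```

Unlike strcpy(), this cannot write past the end of dst, and unlike strncpy(), the caller can detect truncation from the return value instead of ending up with an unterminated buffer.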

Jan Kara pointed out that there is not always agreement on which functions are in fact to be deprecated, arguing that any CI-generated deprecation message should be accompanied by a good explanation of why the change is needed. Theodore Y. Ts'o (Ted) gave as an example the use of strncpy() for a non-NUL-terminated string. Kees agreed that strncpy() does have valid use cases, but argued that there should be agreement on strcpy(). Arnd Bergmann noted that NUL-padding rather than NUL-termination is sometimes required to avoid leaking stack data into filesystem metadata or over the network. Ted agreed, suggesting that there be a name for a memset() strncpy() sequence that does that job. Geert performed CI services for Ted's example code, as well as for Kees's original posting.
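The memset() strncpy() sequence that Ted wanted a name for could be wrapped along the following lines. The helper name copy_nul_padded() is hypothetical and is not an existing kernel function. Standard strncpy() already zero-fills the remainder of the destination, so the memset() here is a belt-and-braces step that makes the intent explicit.

```c
#include <stddef.h>
#include <string.h>

/*
 * Fill a fixed-size field with src, padding all unused bytes with NULs so
 * that no stale stack data leaks out when the field is written to disk or
 * sent over the network. The result is NOT guaranteed to be
 * NUL-terminated: a source of size or more characters fills the field.
 */
static void copy_nul_padded(char *dst, const char *src, size_t size)
{
	memset(dst, 0, size);		/* clear every byte of the field */
	strncpy(dst, src, size);	/* copy at most size bytes */
}
```

This suits fixed-width on-disk or on-wire fields where the consumer knows the field length. Where a C string is needed afterwards, a terminating copy such as strscpy() is the better fit.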

Linus Walleij noted that things like GPIO descriptors have similar issues.