Andy Lutomirski: Reviewing new API/ABI

May 6, 2014

Participants: Dan Carpenter, Daniel Vetter, Greg KH, Hans Verkuil, Jeff Layton, Jiri Kosina, Johannes Berg, Josh Triplett, Laurent Pinchart, Li Zefan, Michael Kerrisk, Randy Dunlap, Shuah Khan, Steven Rostedt, Wolfram Sang.

People tagged: Hans Verkuil.

Andy Lutomirski listed some unfortunate API/ABI bugs and asked for process improvements. Andy suggests that core API changes come with documentation, tests, and an acked email post to a list that deals only with API changes. Michael Kerrisk agrees with the problem, but says that his free time is already overcommitted to that task. Michael also suggests ABI-changes-noted-by: and ABI-changes-doc-acked-by: tags in kernel git commits, and notes that thorough documentation is a key part of API/ABI review. Josh Triplett seconds Andy's API/ABI concern, noting that performance enhancements are expected to come with benchmarks and are rejected out of hand otherwise. Shuah Khan also agreed, calling out the need for test frameworks. Andy agreed that other discussions of in-tree testing are relevant. Shuah clarified that she was referring specifically to the need for regression tests.

Johannes Berg asks how far this should be taken, giving the example of subsystem APIs that are to be used only by a few networking utility programs/daemons. Andy Lutomirski argued that CVE-2014-0181 is evidence that even subsystem-specific APIs need attention. Johannes argued that despite CVE-2014-0181, netlink was not going away, being used less, or being restricted. Johannes raised the concern that covering subsystem APIs might overwhelm the review process. Andy agreed that it is unclear what review process would have caught this problem, and suggested that one way to prune the review process is to focus on APIs that can be used by non-root users and APIs that represent whole new mechanisms. Josh Triplett believes that all new APIs deserve a high degree of scrutiny, but notes that improvement is welcome, and that lack of perfection should not stand in the way of improvement. Daniel Vetter described the process used to vet new drm-driver APIs, which requires that (1) full userspace be ready before kernel patches are merged, (2) detailed test cases be provided for all corner cases, and (3) kernel patches include a “Testcase:” tag linking to the corresponding test case. Daniel expects to add documentation requirements in the next year or so. Andy would love to see a similar process in place for all new syscalls. Laurent Pinchart argued that the problem was not unwillingness to review new APIs, but rather ignorance of what constitutes a good review procedure. Laurent noted that V4L also has test-suite and documentation requirements for new APIs. Laurent believes that blindly imposing these processes across the whole kernel would be too burdensome, but expects that defining best practices would be valuable. Daniel Vetter expressed interest in such a best-practices discussion, and Laurent Pinchart suggested that V4L's Hans Verkuil would be a good addition to this discussion, as did Wolfram Sang.
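
As an illustration of how a “Testcase:” requirement of the kind Daniel describes might be checked mechanically, here is a minimal Python sketch. It is not the drm process itself. It assumes that tags appear as "Key: value" lines in the final paragraph of a commit message, and the function names (parse_trailers, missing_testcase) and the sample commit message are hypothetical.

```python
import re

# Assumption: tags are "Key: value" lines in the last paragraph of the
# commit message. The "Testcase" key comes from the discussion above;
# everything else here is illustrative.
TRAILER_RE = re.compile(r"^(?P<key>[A-Za-z][A-Za-z0-9-]*):\s*(?P<value>\S.*)$")


def parse_trailers(message):
    """Return a dict mapping lowercased tag names to lists of values."""
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    trailers = {}
    if not paragraphs:
        return trailers
    for line in paragraphs[-1].splitlines():
        match = TRAILER_RE.match(line.strip())
        if match:
            key = match.group("key").lower()
            trailers.setdefault(key, []).append(match.group("value").strip())
    return trailers


def missing_testcase(message):
    """True if the commit message carries no Testcase: tag."""
    return "testcase" not in parse_trailers(message)


# Hypothetical commit message, used only to exercise the functions.
SAMPLE = (
    "drm/example: add new ioctl\n"
    "\n"
    "Explain what the ioctl does and why it is needed.\n"
    "\n"
    "Testcase: example/new_ioctl_test\n"
    "Signed-off-by: A Developer <dev@example.com>\n"
)

if __name__ == "__main__":
    print(parse_trailers(SAMPLE))
    print("missing Testcase tag:", missing_testcase(SAMPLE))
```

A check like this only confirms that a tag is present. Whether the linked test actually covers the corner cases remains a matter for human review.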

Hans Verkuil pointed out that V4L's API is very large, which makes compliance tools such as test suites indispensable. In addition, applications using V4L need test drivers in order to test the application without requiring a large array of hard-to-get hardware. Hans has a nice test driver, but it will take some work to get it in shape for upstreaming. Finally, Hans's experience is that the act of creating these tools forces you to take a very hard look at the API, exposing many ambiguities. So writing these tools is almost as valuable as running them.

Greg KH called out the linux-api email list, which should be CCed for all new API additions, but which seems to have very little traffic. Steven Rostedt said that this was the first he had heard of the list, which might help explain the low traffic. Steven did a quick git grep and found references to this list in Documentation/HOWTO and Documentation/SubmitChecklist, which led him to hypothesize that it is even harder to get people to read documentation than it is to get them to write it. Josh Triplett posted a patch adding this email list to the MAINTAINERS file. Steven Rostedt acked the patch, but noted that LKML should be CCed, and Josh responded by reposting the patch to LKML. Jiri Kosina asked that this patch be extended to cover sysfs. Jiri also wondered who, if anyone, was subscribed to linux-api. Dan Carpenter expects that no one will really CC this list on their own patches, asking why anyone would ask for such punishment, eliciting an interesting response from Randy Dunlap. Michael Kerrisk argues that persistence is the key. If everyone responds to patches that change the API/ABI by asking the submitter to CC linux-api (and to CC linux-api on that request), then the culture will change.

Li Zefan wonders whether, if sysfs is included, cgroupfs should also be included. He further wonders whether core developers should be required to subscribe to linux-api. Jiri Kosina echoes the concern over cgroupfs, adding that the number and types of interfaces between kernel and userspace continue to grow, and that it is not always clear which parts of this are considered proper ABI. Jiri also questions the well-definedness of “if we ever get a report about userspace regression because of kernel interface change”, giving userspace dependence on the contents of the kernel ringbuffer as an extreme example.

Jeff Layton described his experiences posting an API change that was not objected to until his patches were accepted into mainline. He would also like changes to be driven back into the POSIX specification. He suspects that the linux-api mailing list might have provided him the needed feedback sooner. That said, Jeff would like a balance between lots of eyes on the one hand and excessive formalism and pain on the other. Michael Kerrisk said that while posting the changes to linux-api might have helped, another issue was getting the right people reading that email list. Michael suggests Red Hat's Eric Blake as a good interface to POSIX for Red Hat employees. Finally, Michael points out that broken APIs and ABIs inflict substantial pain on large numbers of userspace developers, which indicates that we are a very long way from excessive pain being inflicted on kernel developers.