Peter Huewe: Recruitment (Reviewers, Testers, Maintainers, Hobbyists)

July 23, 2015

Related Material:

2014 ksummit-discuss: Jason Cooper: hobbyist recruiting
August 20, 2014 LWN article: Two sessions on review

Additional Participants: Bjorn Helgaas, Chris Mason, Christian Couder, Damien Lespiau, Dan Carpenter, Daniel Vetter, Dan Williams, Darren Hart, Davidlohr Bueso, David Woodhouse, Dmitry Torokhov, Fengguang Wu, Frank Rowand, Geert Uytterhoeven, Greg KH, Guenter, Guenter Roeck, Heiko Carstens, James Bottomley, Jan Kara, Jason Cooper, Jiri Kosina, Joe Perches, John W. Linville, Jonathan Cameron, Jonathan Corbet, Josh Boyer, Josh Poimboeuf, Josh Triplett, Julia Lawall, Krzysztof Koz, Laura Abbott, Laurent Pinchart, Luis R. Rodriguez, Mark Brown, Mel Gorman, Michael Ellerman, NeilBrown, Olof Johansson, Peter Huewe, Rafael J. Wysocki, Ralf Baechle, Roberto Tyley, Steven Rostedt, Sudip Mukherjee, Takashi Iwai, Theodore Ts'o, Tim Bird, Tony Luck, and Zefan Li.

People tagged by Peter Huewe: Greg KH, Jason Coooper, Little Penguin (?), Russell King, Sarah Sharp, Sebastian Hesslebarth, and Tim Bird.

Peter Huewe extends last year's hobbyist recruiting topic to recruiting in general. Peter states that we are short on reviewers, that maintainers are overloaded, and that not enough of the patches are tested. He would therefore like to discuss encouraging hobbyists to become regular contributors, reducing drop-out rates, encouraging regular contributors to become reviewers and testers, and having reviewers become co-maintainers and maintainers. Peter believes that the TPM subsystem has made good progress from essentially unmaintained to having a part-time maintainer and a pair of active reviewers/testers.

Peter quotes Jon Corbet's 4.1 development statistics noting that more than 60% of changes pass through the hands of developers working for just five companies, and asks how to motivate more companies to sponsor maintainership (or, alternatively, the Linux Foundation). Peter lists his own employer as an example of a company unwilling to sponsor maintainership.

Finally, Peter asks after the Eudyptula Challenge and Outreachy.

[ Ed. note: As of July 23rd, there are more than 300 emails in this thread. I doubt that anyone would actually read a full blow-by-blow summary written in my usual style, and I also doubt that I would finish writing it in time for Kernel Summit, so this summary even more summarized than usual. ]

I have partitioned the remainder of the discussion into the following categories:

Views From Newbies
Tricks and Hacks
Devil's Advocacy
Encouraging Reviews vs Encouraging Gaming

Views From Newbies

Sudip Mukherjee is a recent arrival. Sudip says that his problem was not the mechanics, but rather getting response from maintainers when moving beyond trivial fixes. Julia Lawall suggested looking at maintainer to-do lists and to find responsive maintainers. The ensuing discussion re-invented kernel-newbies and kernel-mentors. Steven Rostedt suggested re-pinging the maintainer with new version of full patch, sparking a discussion of the relative merits of pinging with a new version of the patch, pinging with empty emails, and also on the acceptable timing: Most seemed OK with a once-per-week ping, but others preferred a longer inter-ping interval. Sudip asked that maintainers auto-reply with their preferences, and Tim Bird countered that this information should be in the MAINTAINERS file. Mark Brown was leary of providing too much detail.

Tim Bird said: “There's a highly qualified developer at Sony I was talking to this week, who said that when he sees a bug in the mainline kernel, he usually just ignores it because the hassle-factor of submitting a patch is too high.” This prompted a number of suggestions:

Reward the bug fixer, but have someone else submit the fix upstream.
A “patch penguin” for first patches, perhaps sponsored by Linux Foundation.
Have maintainers learn to accept imperfect patches. (Which is more likely for patches that the maintainer believes to be highly valuable.) Andrew Morton's -fix approach was lauded.
Some people are too proud to simply send a bug report, feeling the need to include a fix, but not wanting to include a fix that might not go upstream. One way to address this is to make it clear that it is just fine to include a prototype patch in the bug report, as long as the bug report includes permission to use that patch.
Face-to-face meetings can help, though not everyone will be able to attend such a meeting.
Some don't believe that "people like them" are supposed to be sending patches, especially if they mostly work on proprietary code.

Tim Bird reported on a survey at his employer showing that 93 of 278 qualified developers (33%) refrain from submitting patches to LKML. 25% said they didn't know how to send a patch. Best guess rate of conversion to upstream contributor is 10%.

Better Blaming

Zefan Li noted that cosmetic changes can break "git blame", to which Julia Lawall responded that this situation indicates a need to improve “git blame” work better, to general acclaim. And to some suggestions on better use of existing tools:

Bjorn Helgaas called out “git log -p file” to find recent changes in a given file.
James Bottomley argued that it is often necessary to focus the search on a small fraction of a file, but then found the following does just that:
“git log -L<lines> -p <file>”
Josh Triplett called attention to “vim-fugitive”, which has a “:Gblame” mode in which “o” opens a commit and “P” moves to the parent commit.

Formatting of Patches

Julia Lawall suggested that a test email list be created to help newbies resolve their patch-sbumission problems. Of course, this suggestion prompted people to advocate for their favorite patch-formatting tools and procedures:

David Woodhouse suggested saving email as draft and reviewing it both manually and using local tools.
Mark Brown advocated for simple “git send-email”, which others argued was not so simple, even given the availability of the --dry-run argument. Jason Cooper suggested adding a new --kernel flag to set the usual kernel defaults and CCs, but Mark argued that auto-CC actively punishes wide-area cleanups with an eternity of spam. Josh Triplett suggested specifying the existing --git-fallback flag to get_maintainers.pl, but Mark was not satisfied.
Steven Rostedt sung the praises of quilt.
Darren Hart uses custom scripting to create and to send pull requests.
Josh Triplett used to use “git imap-send” and points to “mutt -H” for sending the prefabricated email.
The merits of “git commit --allow-empty” for cover letters was debated. The need for the --start-number flag on “git format-patch” was noted, as was the need to beware of “git rebase”.
Dan Williams forwarded a script that sends the patch series's cover letter to the union of the sets of recipients for the patches in the series.

Automating Patch Submission

Darren Hart suggested that first-patch submission could be eased via a patch-submission web form. Steven Rostedt provided a simple sample web form layout, and Laura Abbot pointed to an existing patch-submission form. Julia Lawall worried that it might be difficult to transition from the tool to the email approach, which sparked a discussion about whether or not the process should be driven from git, gitk, or something like gerrit rather than from web forms. Christian Coude called attention to Roberto Tyley's submitGit (source code), who in turn called out the difficulties in determining who and what should be CCed for a given patch. Some felt that the existing web-submission sites should be considered to be precedents rather than prototypes.

Organizing Incoming Patches

The thread ordering provided by email clients was called out as an obstacle to handling of incoming patches, in particular, spotting when a new series has obsoleted an old series. Jiri Kosina suggested use of sort_aux in mutt and thread-sorts-by-arrival in pine.

Ralf Baechle wondered why patchworks hadn't yet obsoleted all competitors, and Tim Bird responded that less than 10% of subsystems use it, based on the prevalence of the Q: tag in MAINTAINERS. There was some concern that the patchworks auto-email on state change might discourage new submitters, who might see a series of emails as their patch was handed off from maintainer to maintainer and wonder when the maintainers would stop passing the buck. Bjorn Helgaas agreed that the state changes might be discouraging, but said that he needs this to get the patch to the right place. Bjorn also suggested that the ability to annotate the email might help. Laurent Pinchart has modified patchworks to do the needed patch delegation automatically, and Michael Ellerman indicated an interest in this modification.

Guenter Roeck suspects that deferring patches to the next release might also be discouraging to new submitters.

Daniel Vetter said that auto-deprecation was in the works that would allow a new version of a patch series to obsolete the previous version. For his part, Damien Lespiau has modifications to patchworks that allows the maintainer to ignore all patches except those sent from git.

Bug-Tracking Annoyances

Josh Boyer says that use of WARN_ON() as a fixme is annoying, and Steven Rostedt agrees, advocating loud complaints to the relevant maintainers every time one occurs. Josh replied that i915 has lots of them, and asked if Steven is running headless. Steven replied that complaints have had little effect effect there, so he gave up and silenced those warnings. However, Steven suggested that it might be time to start complaining again.

Checkpatch.pl, Foul and Fair

Steven Rostedt complained that checkpatch.pl generates too much noise, so much so that he uses an older and less noisy version. He finds “But checkpatch says...” hectoring annoying. Josh Triplett suggested turning off the complaints having a high false-positive rate. Dan Carpenter responded with a distribution of warning types for Steven's event-tracing macros on the one hand and driver code on the other. Dan found that there is in fact quite a bit of noise for the deathly macros, but not so much for drivers. In both cases, the 80-character limit was responsible for most of the noise. Joe Perches suggested a re-evaluation of checkpatch complaints, perhaps moving some from ERROR or WARNING to SUGGESTION.

0day Test Robot

Some suggested that 0day Test Robot automatically check patches submitted to LKML (and other mailing lists) as well as its current checking of branches published within selected git trees. There was quite a bit of discussion of various optimizations that could be used to make this practical, but Fengguang Wu argued that it was fast enough to make such optimizations unnecessary. Fengguang also noted that the 0day Test Robot only does about 500 trees out of more than 3,000 in use.

That said, Fengguang noted the challenge of working out where to apply a given patch. James Bottomley suggested choosing based on the email list that the patch was sent to, and Josh Triplett suggested trying a series of choices until one succeeded. Fengguang decided that he should make use of the git SHA-1 index in patches generated from git.

There was some debate on the relative merits of centralized checking (as in 0day Test Robot) and of per-maintainer checking. Greg KH implicitly called out the importance of automation by pointing out that since people won't use checkpatch.pl, they probably won't use some new tool, either. Jason Cooper suggested that this sort of automation has the effect of adjusting the software/wetware boundary, and wonders if this generates real benefits as opposed to a mere shift in the type of annoyance.

Devil's Advocacy

Don't Bother Upstreaming
Standardization of Linux-Kernel Maintainership
Discourage Bad Submissions
Training vs. Selection

Don't Bother Upstreaming

Laurent Pinchart quotes Russell King, advising against trying to get each and every little thing upstream. This sparked a spirited discussion, with some arguing that this depends on the maintainer's friendliness, available time, competence, and ability to say “no”. (And yes, it was noted that those who are good at saying “no” might also be good at avoiding becoming maintainers.) Some suggested documenting maintainer style in MAINTAINERS. Others argued that too much time goes into reviewing mechanical items, while others responded that these mechanical items were also important. This naturally led to the thought that mechanical items should be checked using automated tooling, where this tooling would codify the tribal knowledge specific to a given kernel subsystem.

Of course, still others noted that some things cannot be automated, including understanding the patch, critiquing changelogs, English for non-native English speakers, and the combinatorial explosion of configurations to be tested.

Mark Brown agreed that upstreaming support for an out-of-tree SoC is difficult and time-consuming, but argues that accumulated technical debt and lack of common-code support in mainline are larger problems than is the community's unapproachability. Laurent Pinchart also pointed to lack of public documentation for many SoCs and their drivers, though this is not an issue for common code.

Standardization of Linux-Kernel Maintainership

Jon Corbet asks whether standardization of maintainership might be better than “documenting the fact that what looks like a single project is actually a collection of a hundred or so idiosyncratic fiefdoms”. Steven Rostedt argued that there already is a high degree of standardization, that the need for additional uniformity could reduced by maintainers being nicer, and proposed some standardization of the required niceness. Josh Triplett countered that additional standardization would make tree-wide fixes much easier, permitting such fixes to be merged as a unit, with Guenter Roeck seconding this thought. James Bottomley suggested that this might be managed using a “script tree” similar to the existing trivial tree. David Woodhouse noted that there have been tree-wide scripted changes, but that manual post-checking is often required. Olof Johansson suggested that selected developers act as the initial gatekeeper for new developers, though he suggested that this might have scaling difficulties.

James Bottomley noted that carefully documented process can backfire, and asked what problem we are trying to solve. Answers included that tools and processes that insist on cleanup of unimportant nits can discourage new contributors, that the community and LKML needs to improve its reputation in view of past sins, that lack of timely and self-explanatory responses to newbies is discouraging, and so on. Jon Corbet put forward the example of a would-be developer whose initials are NK as evidence of the Linux kernel community's tolerance, but noted that “local customs” are not always documented. This prompted a debate as to whether or not local customs should be globally documented on the one hand, or whether global documentation should be restricted to global customs on the other. A separate debate illustrated widely differing interpretations of and expectations from Reviewed-by.

The topic of reverse-xmas-tree ordering came up, and for those of us wondering exactly what that might mean, Chris Mason provided the following implicit definition:

We're way off in the weeds here, and whenever this topic comes up, we
end up focusing on the minor annoyances that come from submitting
new code.  Small variations in coding style, function naming and
general methods for working with a maintainer are sure to vary.

Looking at Jon's statistics, we don't have a problem bringing
new people into the kernel.  What can we do to help the new
people be more productive?  My own feeling is feedback,
testing and documentation are more important than
100% consistency between maintainers.

I love Greg's swag idea, I'm sure
more than one LF member would
be willing to help sponsor
such a thing.

-chris

Subsequent discussion made it quite clear that reverse-xmas-tree ordering was not universally welcome.

Discourage Bad Submissions

Jiri Kosina asked whether we should instead be discouraging useless submissions that are slowing maintainers down, for example, by stopping the publishing of submission statistics. Josh Triplett argued that it is the non-trivial rather than the trivial patches that are the problem, and that trivial patches should be used as trial runs to familiarize people with the mechanics of patch submission. Zefan Li agreed that the usual trivial useless submissions don't consume sufficient time, and that such mistakes won't be repeated, which resulted in objections that failing to repeat mistakes has the effect of discouraging submissions. This led to a debate on how best to teach new submitters how to work with the code base and the community, including awards and swag as encouragement.

Julia Lawall suggested that unimportant cosmetic issues could be addressed by: (1) Fixing the code and getting it over with, (2) Dropping the code from the kernel if no one is using it, or (3) :Adding a comment stating that the code is no longer being actively developed and that only bug fixes will be accepted. This led to a discussion of the downsides of cosmetic changes, including their impact on “git blame” which in turn led to a discussion of better blaming.

Training vs. Selection

Jason Cooper states that kernel hacking is hard, and with most hard jobs, “training” is more about identifying qualified people than it is about converting unqualified people into qualified people. This suggests to Jason that attracting more qualified people requires casting the net wider rather than lowering standards. Peter Huewe noted that he has seen good as well as bad people drop out, which prompted Jason to ask for specifics. This led to a discussion of what constitutes an opportunity as well as of what opportunities would be visible to various people.

John Linville gave himself as an example of a maintainer who dropped out, noting his reasons, which included having been maintainer for seven years, growing increasingly uncomfortable with his own lack of expertise on deep details of wireless operation, and the resulting lack of time to do his own development, in other words, maintainer burnout. This provoked a discussion about the relationship between code ownership and maintainership, along with the inevitable comparisons of open-source and proprietary code bases. Solutions proposed included co-maintainership, maintainer time-slicing, delegation, better hardware availability, and better hardware documentation.

Encouraging Reviews vs Encouraging Gaming

Theodore Ts'o noted that although things like checkpatch.pl should be run on patches before they are submitted, non-maintainer use of “checkpatch --file” can all too easily be abused to inflate patch counts. He also noted that any automated and transparent system can be gamed, and given incentives, will be gamed. He drew an analogy with search-engine automation, noting further that such gaming will eventually backfire once maintainers get wise to particular individuals. Guenter Roeck wondered how one would distinguish someone gaming from someone who was providing needed help across the entire system, expressing concern that any steps taken to catch gaming might also drive off developers who might otherwise provide valuable kernel clean-up help. Ted replied that although it is indeed difficult to make that distinction, he did not expect that people would be invited to the Linux Kernel Summit based on kernel cleanup. Julia Lawall suggested use of a machine-learning algorithm that takes into account other activities of the developer in question.

Rafael J. Wysocki stated that, unlike testing and reviewing, development is inherently rewarding. Rafael therefore believes that we need to actively encourage testing and review, something beyond simple encouragement such as explicit recognition. Josh Boyer suggested that patch acceptance be gated by reviews, but is concerned that this might slow down development. On the other hand, Josh suggests that this might slow down bug rates even more. Greg KH replied that the subsystems requiring reviews have not see any sort of development slowdown. Dan Carpenter agreed that SCSI's requiring reviews seemed to have actually sped up development, but wondered if there were other examples. Mark Brown believes that many subsystems informally require reviews already, but notes that measuring the effect on bug rates requires agreement on what constitutes a bug. Josh agreed, and elaborated on the bug-tracking issues that he has encountered.

Neil Brown argued that encouraging more reviews requires carrots rather than sticks, suggesting a 6-month code-review “sabbatical” with Linux Foundation, though he forsees that funding might be an issue. Peter Hüwe suggested that such sabbaticals could be very helpful in getting people up to speed on new-to-them subsystems. The ensuing discussion suggested a number of potential carrots, including top-ten lists of reviewers, recognition by the community (though it should be noted that the recognition in that case included single-malt whiskey), memorabilia, review sessions at conferences, t-shirts, coffee mugs, thumbdrives, chromecasts, and LKS invitations. Jason Cooper suggested that rewarding reviewers with hardware that is not yet supported upstream might kill two birds with one stone. Davidlohr Bueso suggested hiring consultants to do the needed revieews. Guenter Roeck noted that review effort is not always popular, in fact, in some environments is considered counterproductive: “You are holding up the release!”

Jonathan Cameron noted that reviewers are not credited, even if they do find important bugs, if they either never are happy with the patch or don't have time to repeatedly review the patch as it progresses through multiple versions. Peter Hüwe suggested addressing this issue by keeping Reviewed-by tags from reviews of prior versions of the patch.

Steven Rostedt noted that not all Reviewed-by tags are all that trustworthy, adding that he is especially suspicious of those that are given without any additional commentary. He also said that he was considering scripting an evaluation of Reviewed-by tags. This sparked a discussion, with people noting who easy a script would be to game, that it would be good to know why the reviewer liked the patch, that even a “yep looks good” could be valuable, that the perceived value of a Reviewed-by tag depends on the level of trust between the reviewer and the maintainer, that perhaps we can trust maintainers to apply meaningful Reviewed-by tags, that a Reviewed-by tag need not mean that further improvement is not possible, and that similar issues exist for Acked-by tags and commit-log messages.

Jon Corbet supplied statistics for v4.2 thus far, from which it can be seen that fewer than 25% of the patches have Reviewed-by tags.