James Bottomley: Encouraging more reviewers

May 24, 2014

Participants: Andy Lutomirski, Ben Hutchings, Dan Carpenter, Daniel Vetter, Dan Williams, Frank Rowand, Geert Uytterhoeven, Greg KH, Guenter Roeck, Jiri Kosina, John W. Linville, Jonathan Cameron, Jonathan Corbet, Josh Triplett, Laurent Pinchart, Li Zefan, Lukáš Czerner, Mark Brown, Martin K. Petersen, Mimi Zohar, Neil Brown, Olof Johansson, Paul Walmsley, Rafael J. Wysocki, Randy Dunlap, Rob Herring, Stephen Rothwell, Steven Rostedt, Thomas Gleixner, Tony Luck, Trond Myklebust, and Wolfram Sang.

People tagged: Andrew Morton.

Special mention: Paul Walmsley's maintainers per kernel entity list.

James Bottomley pointed out that reviews are not keeping up with patches, especially for reposts of large patch series. If you reviewed the initial series, it is often not easy to see what (if any) of your feedback was accepted. James suggests an addition to SubmittingPatches describing how to resend after addressing review comments. James also reluctantly suggests requiring that people who submit patches also review other patches, holding a review session instead of a hacking session, and publishing a per-subsystem to-review list. Wolfram Sang added that reviews from newbie reviewers must themselves be reviewed, and argued that enforced reviews were likely to be sloppy reviews. Wolfram also pointed out that patchwork already provides to-review lists (though Dan Williams and Rafael J. Wysocki feel that patchwork needs more work), and wished for paid reviewers/maintainers. Guenter Roeck objected that company hierarchies reward code production over code review, and that vigorous code reviewers might even be blamed for project delays. Neil Brown suggested that Guenter's objection could be addressed by having the Linux Foundation take on the role of reviewer paymaster. James Bottomley agreed that Guenter's objection might be valid at startups, but believes that more mature companies understand the value of quality as well as the value of features. In fact, James has seen situations where QA won all the prizes based on their early identification of defects in buggy submissions. James also noted that incenting QA exclusively on finding defects can also be problematic [ Ed. note: Obligatory Dilbert reference: “I am going to code me up a minivan!!!” ], but that it might be worthwhile having LF offer a small incentive for bug-finding, an idea that Guenter Roeck liked, with Rafael J. Wysocki and Wolfram Sang noting the benefits of having reviewers who are seen as being independent of any given company.

Guenter Roeck isn't so sure that all that many companies have learned to appreciate quality, asking whether prospective employers are more likely to count Signed-off-by: tags on the one hand or Reviewed-by: tags on the other. Although Guenter agrees that developers are usually blamed for buggy code, he doubts that they are blamed much for bad architecture, structure, or implementation. Guenter has also seen QA engineers be reprimanded for finding bugs without a well-defined script to reproduce them, on the grounds that such bugs were hard to reproduce. That said, Guenter does hold out some hope based on the fact that companies are now willing to employ maintainers, who by definition spend a lot of their time doing code reviews.

Lukáš Czerner questioned the concept of dedicated reviewers, arguing that you have to be a good developer to be a good reviewer. Lukáš instead suggests treating the Reviewed-by: tag with as much respect as the Signed-off-by: tag. Rafael J. Wysocki further feels that dedicated reviewers who don't write good code become less reliable over time, which means that reviewers need to be active developers as well. Rafael also argues that Reviewed-by: tags should not merely be given as much respect as Signed-off-by: tags, but actually more respect. This led Andy Lutomirski to ask whether the Signed-off-by: tag now means something different than “I affirm what the DCO says”. Rafael J. Wysocki agreed that this is all that it means formally, but that from a practical standpoint, Rafael also expects a commitment to stick around for a while to fix any related issues. Daniel Vetter agrees that Reviewed-by: tags should get as much respect as Signed-off-by: tags, but doesn't want to encourage rubber-stamp reviews or gaming of the system. Guenter Roeck suggests that people who might otherwise be tempted to carry out rubber-stamp reviews go read the “Reviewer's statement of oversight” in Documentation/SubmittingPatches, urging that people not hand out their Reviewed-by: tag lightly. Mimi Zohar noted that there are other tags available for less in-depth review, to which Guenter Roeck agreed, calling out Acked-by: as an example, as did Rafael J. Wysocki. Mimi suggested adding Commented-by: for the case where there is substantive feedback (unlike Acked-by:), but where the Reviewed-by: tag's standards have not been met. Josh Triplett suggested that these tags might be highlighted by LWN to encourage additional reviews. Jonathan Corbet replied that he had done that occasionally, but that both the numbers and interest had been low. However, Jon said that he would revisit this.

Thomas Gleixner objected that such highlighting would encourage gaming, which puts pressure on maintainers to quickly educate submitters of half-baked patches, where the submitters are more concerned about hitting their targets than about learning how to submit good patches. For some reason, Thomas does not believe that giving an infinite number of monkeys an infinite number of keyboards is an efficient way to generate good kernel patches. Josh Triplett agreed that abuse could occur, but feels that statistics-based pressure is in the noise compared to pressure to meet development milestones. That said, Josh believes that people could do a better job of breaking up large patches, though Mimi noted that this is not always possible, instead suggesting that patches in a series be related, which Randy Dunlap noted is already required by Documentation/SubmittingPatches (Mimi thanked Randy for this reminder). Josh also asked whether Thomas was most annoyed by: (1) people resubmitting the series before Thomas's review was complete, (2) patches containing fundamental problems requiring on-the-spot remedial training, or (3) email-based review of a 30-patch series being too painful (in which case, Josh suggests git-based workflows). Thomas Gleixner said that he has repeatedly seen statistics-based pressure in writing, and while he agrees that development milestones also produce pressure, neither source of pressure justifies offloading training to maintainers and reviewers. Thomas also said that he prefers reviewing patches in email over git-based tools, arguing that better review tooling isn't going to help with low-quality patches.

Dan Carpenter suggests a special tag for a review that actually found a bug. Daniel Vetter would rather see credit in the commit log, holding up d978ef14456a38034f6c0e as a good example, but also calling for more standardization. Mark Brown is concerned that the changelogs will get large and noisy, and that Daniel Vetter's approach assumes that the submitters will be conscientious, which in Mark's experience is not always the case. Mimi Zohar agreed with Mark's concern, but believes that recognition of bug-finding reviewers is important; Mark, however, makes a sharp distinction between thanking people for reviews on the one hand and including a blow-by-blow history of the patch on the other. Dan Carpenter replied that only real bug fixes should be acknowledged in the commit log, adding that nit-picking code should be its own reward.

Tony Luck pointed out that Documentation/SubmittingPatches already calls for a changelist after the --- marker, but Mark replied that Daniel wants the changelist in the git logs, which doesn't happen if it follows the --- marker. Thomas Gleixner pointed to the Link: tag, which is used to link back to the commit's earlier email discussion, which Li Zefan would like to see used more widely. Steven Rostedt went farther, arguing that reposts of patches should include a separate Link: tag for each of the previous versions of that patch. Greg KH noted that a research group has produced a tool that, given a git commit, finds the related email discussion. [ Ed. note: Perhaps it is this one. ] Li Zefan liked the tool, saying that it would have eased a recent backporting effort.
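To illustrate the convention Steven proposed, a repost's commit footer might carry one Link: trailer per prior posting, along the lines of the following sketch (the message-IDs and name are placeholders, not real archive links):

```
    Link: https://lkml.kernel.org/r/<message-id-of-v1>
    Link: https://lkml.kernel.org/r/<message-id-of-v2>
    Signed-off-by: A. Developer <dev@example.com>
```

Because these trailers live above the --- marker, they survive into the git history, so a later reader (or backporter) can walk back through each round of review discussion.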

Daniel Vetter likes “noisy” commit logs that contain blow-by-blow history, as they help him understand how the patch came to be and allow him to scold patch authors for ignoring reviewers. Laurent Pinchart agreed that checking up on whether patch authors took reviews into account can be quite time-consuming, and would welcome any automation. Steven Rostedt pointed out that this was one of the big benefits of the aforementioned Link: tag.

James Bottomley returned to the topic of patch statistics, noting that his employer tracks features rather than patches as a way of avoiding some of the gaming. James also pointed out that OpenStack avoids blizzards of trivial patches by making submission of individual patches more time-consuming. James also noted that some large patch sets are generated by semantic patch scripts, and in that case, careful review of the script itself might save quite a bit of time: a bug in the script should be fixed and the patches regenerated, avoiding wasting time reviewing large numbers of bogus patches. Greg KH believes that OpenStack's more onerous patch-submission process will eventually hurt them, and that trivial cleanup patches make it easier for people to join the Linux kernel community. Li Zefan agreed that people find submitting bug fixes easy, but has more trouble encouraging them to do code reviews, perhaps due to lack of confidence, language barriers, Asian cultural issues, lack of recognition for reviews, and being busy with internal jobs. James Bottomley argues that trivial fixes do consume scarce maintainership bandwidth, but is OK with them as long as they are accompanied by substantive changes or a willingness to take on some maintainership tasks. Greg KH agreed that large numbers of cleanup patches to old little-used code could be annoying, and suggested that James do a preemptive code-cleanup cycle to avoid such patches. James Bottomley said that his concern was instead that although cleanup patches were good for getting people into the kernel, he did not believe that it was easy to progress from that point. James instead believes that bug-fixing is the better kernel-hacking gateway drug. John W. Linville agreed with James, noting that people fixing bugs might have incentive to keep new bugs out, while people looking for patch counts via trivial fixes might be happy to let bugs slip through in order to give them something with which to meet next month's quota. Dan Carpenter uses a script to help review cleanup patches, allowing him to quickly locate (perhaps inadvertent) substantive changes in a patch full of cleanups.

Steven Rostedt noted that across-the-board changes often require large numbers of patches because each patch must CC a different set of maintainers. CCing all the maintainers on one big patch would exceed LKML's CC limit, preventing the patch from reaching the list.

Daniel Vetter is willing to cut hobbyists and interns a break, but if he sees garbage patches from a given corporation, he escalates to that corporation's management very quickly (in his experience, sharply worded emails to the engineer in question do not suffice). At least he does so when he has contacts into that corporation's management. Daniel would like to see a list of contacts for other corporations for this purpose. Greg KH noted that the Linux Foundation does have such a list, and that emailing someone (including Greg) on the LF Tech Board is the proper way to handle this sort of problem.

Tony Luck was thankful that nobody sends him 30-patch series for IA64, but would try managing the process using topic branches in git, adding patches as they met the mark. Wolfram Sang pointed out that this process is much more difficult when the reviewer does not have access to the hardware, which is the case for many device drivers. Wolfram also believes that dedicated maintainership is more practical than dedicated reviewership. Lukáš Czerner made a distinction between reviewers not having specific hardware knowledge and reviewers lacking design/code acumen. Lukáš objected to maintainers taking on the entire reviewing responsibility to the exclusion of hacking, arguing that we need developers to take on some of this burden. Wolfram Sang pointed out that, in addition to lacking acumen and lacking specific hardware knowledge, lack of time and interest were also important obstacles. Wolfram also believes that maintainership will always involve some hacking, and does not believe that developer-based review will scale sufficiently.

Trond Myklebust argued that arranging reviews should be the responsibility of the patch submitter rather than of the maintainer, though of course the maintainer can and should assist where needed. That said, Trond agrees that assuring quality of the reviews is a challenge. James Bottomley said that while he has no problem with the submitter bearing this responsibility, he feels that the maintainer is the reviewer of last resort, and, according to H. Peter Anvin, often the only reviewer available. James also feels that the division of review responsibility will vary across subsystems. Bjorn Helgaas finds that if someone submits patches that are of wide interest, reviewers will appear. However, most people are not all that interested in changes to core architecture, so Bjorn ends up with a longish patch-review queue. In addition, the core architecture tends to accumulate band-aids that fixed specific problems, and this accumulated technical debt tends to make it more difficult to review patches to the core. Bjorn likes keeping reviews in email archives, though he suspects that this might be because his reviews tend to be one-on-one exchanges with the patch author. Rafael J. Wysocki noted that deep changes to core code require deep and broad knowledge to do a good job of reviewing.

Stephen Rothwell provided the following statistics:

Li Zefan noted that some of the Acked-by: tags really corresponded to full reviews (as in Li Zefan's reviews of Tejun's cgroups patches), and that some reviews go unacknowledged. Wolfram Sang presented his own statistics [PDF] claiming that the developer-to-maintainer ratio is increasing, which could be making the review problem worse.

Dan Carpenter suggested that James Bottomley's SCSI subsystem was almost uniquely difficult, having no viable candidate for second in command. Dan expressed concern that SCSI would be in trouble should James decide to go back to Cambridge for an MBA. James Bottomley reassured Dan that an MBA is not an essential qualification for James's current CTO position. James also laid out SCSI's review strategy, which scrutinizes core code much more closely than individual per-vendor driver code. The latter gets a checkpatch.pl run and a quick scan, while core code such as transport classes and libsas is examined much more closely. Jiri Kosina asked that driver maintainers be required to contribute better commit logs, but James replied that it was hard enough to get the driver writers to contribute even the code (and Neil Brown wondered whether such under-duress changelogs would be of any value anyway). That said, James does enforce higher commit-log standards for the core code. Jiri was not amused (which caused James to point out that distro people such as Jiri were more than welcome to help with review and changelog creation), but Martin K. Petersen sympathized with James, pointing out a number of issues that could result in vestigial commit logs. However, Martin suggested that improvement was in order, suggesting a few educational techniques that might help, culminating with sending Greg KH in wielding a cluebat. Greg expressed an almost ominous level of enthusiasm for this approach, but Rafael J. Wysocki prefers the point-of-view gun featured in “Hitchhiker's Guide to the Galaxy”. Dan Williams pointed out that too much pushback on patches could cause vendors to back away from sending anything upstream.
Jiri Kosina suggested that in that case, OS vendors might decline to support the HW vendor's drivers, and expressed hope that customers might start asking hard questions of the HW vendor. Dan Williams did not share Jiri's hope, noting that customers will instead simply complain to whoever seems most willing to listen. Dan also noted that HW vendors can sometimes have difficulties with core libraries due to differing update cadences. Martin K. Petersen noted that there is a big difference between bug fixes and “value add” features that were simply too ugly to go upstream, and noted that OS vendors have some leverage in the form of nonsupport. Dan Williams argued that agreements between HW vendors and enterprise distributions nullify that leverage, but agreed that ill-conceived “value add” can be quite toxic. However, Dan argued that permitting more experimentation upstream could help bring more patches upstream. Martin replied that driver vendors are not all that likely to collaborate, even in the Linux kernel space. Dan Williams noted that if vendors put their experiments upstream, others could refactor out the common code. Ben Hutchings notes that some enterprise drivers remain out-of-tree in order to support older enterprise-distribution kernels, but feels that the situation with enterprise hardware is better than that of consumer electronics, because consumer devices are often never updated, so there is less emphasis on long-term solutions. In addition, some consumer-electronics devices require hacks to the core kernel that are not likely to ever be accepted. Finally, consumer-electronics development is usually done privately for quite some time before the device is released, so that significant work is required to make the patches ready for upstream acceptance. Martin suggests Cc: stable for support of older kernels, and agrees that new drivers can be difficult to get upstream the first time. 
However, Martin is also concerned about existing drivers that either have bitrotted or that would never be accepted given today's more strict coding standards. Mark Brown agrees that very few vendors in the consumer-electronics space care about upstream, in part because upstream is not particularly useful to them. These vendors instead use whatever Android kernel was current at the time work started on the relevant SoC. However, there is some pressure to work more with upstream, in part to better reuse code for newer versions of particular devices and in part to leverage upstream review. Frank Rowand says that portions of Sony stay very close to mainline, in some cases even starting development with a -rc1 release of the kernel. They also encourage their SoC vendors to work in mainline instead of vendor trees. Mark Brown agreed that both Frank and the relevant portions of Sony qualified as one of the “very few vendors” that Mark had called out.

Paul Walmsley suggested that reviewers be recognized with an R: tag in the MAINTAINERS file. Josh Triplett agreed, and suggested that such a tag would also be helpful to get_maintainer.pl when generating Cc: lists. Josh noted that mailing lists can also be used for this purpose, but pointed out that lists can diffuse responsibility. Rob Herring felt that anyone trusted enough to be a designated reviewer should also be considered a co-maintainer. Olof Johansson argued that there is such a thing as too many maintainers, and also argued that the expectations on a maintainer are very different from those on a reviewer, a point amplified by Paul Walmsley, who noted that maintainers are expected to set policy, while reviewers need only follow the policies set by the maintainer. Jonathan Cameron added that some employers might not be happy about their employees taking on maintainer roles, but might be comfortable with the role of designated reviewer.
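Under Paul's proposal, a MAINTAINERS entry might look something like the following sketch, which reuses the file's existing one-letter tag style (the subsystem name, addresses, and paths here are hypothetical):

```
SAMPLE SUBSYSTEM
M:	Some Maintainer <maintainer@example.com>
R:	Designated Reviewer <reviewer@example.com>
L:	linux-sample@vger.kernel.org
S:	Maintained
F:	drivers/sample/
```

With such an entry in place, scripts/get_maintainer.pl could emit the R: address alongside the maintainer's when generating Cc: lists for patches touching drivers/sample/, which is the behavior Josh was after.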