Andy Lutomirski: Lightweight per-cpu locks / restartable sequences

July 29, 2015

Related Material:

  1. 2013 Linux Plumbers Conference slideset: LPC - PerCpu Atomics
  2. May 21, 2015 Per-CPU System Call patch: percpu system call: fast userspace percpu critical sections (from Mathieu Desnoyers)
  3. July 7, 2015 LWN article: Restartable sequences

Additional Participants: Chris Mason, Christoph Lameter, Lai Jiangshan, and Peter Zijlstra.

People tagged: Jens Axboe, Jon Corbet, Mathieu Desnoyers, Paul E. McKenney, and Shaohua Li.

Andy Lutomirski suggested a discussion of a lightweight mechanism permitting user-mode code to implement per-CPU operations, calling out Paul Turner's patch, Mathieu Desnoyers's patch, and his own approach of using %gs on x86. Chris Mason said that his group has started experimenting with these patches and hopes to have performance data from production workloads soonish. Christoph Lameter applauded this and suggested that the techniques might also be applied in-kernel. Peter Zijlstra replied that in-kernel experimentation need not wait on an API, and argued that in-kernel use could rely on interrupt hooks instead of scheduler hooks. However, Peter suspects that forcing function calls for these operations will eat up much of the potential performance gain. Finally, Peter believes that %gs prefixes will have substantial performance advantages.

Christoph responded that one could avoid function-call overhead by moving the calling function into the special code region, and that some of the non-%gs approaches might avoid the implicit memory barriers that degrade performance of read-modify-write instructions on x86. Andy agreed that read-modify-write instructions can be slow, but noted that cmpxchg is pretty fast. Andy also suggested per-CPU memory mappings, which he himself described as a crazy idea. Christoph liked the per-CPU memory mappings, noting that this had been done on Itanium, but pointed out that x86 would require a separate page table for each CPU for each task.
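For readers unfamiliar with the kind of per-CPU operation under discussion, the following is a minimal sketch of a per-CPU counter updated with a compare-and-exchange retry loop. It is not taken from any of the patches mentioned above. The names percpu_counter_inc() and percpu_counter_sum(), the slot count, and the cache-line size are all assumptions made for illustration, and the CPU number is passed in by the caller rather than obtained from the kernel. Note also that C11 atomics compile to a locked cmpxchg on x86, which is the cost the proposals aim to avoid; the sketch shows only the data layout and the retry pattern.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_SLOTS   4   /* hypothetical number of CPUs for this sketch */
#define CACHE_LINE 64  /* assumed cache-line size, used to avoid false sharing */

/* One counter per CPU, each on its own cache line. */
struct percpu_counter_slot {
    _Alignas(CACHE_LINE) _Atomic uint64_t value;
};

static struct percpu_counter_slot slots[NR_SLOTS];

/*
 * Increment the slot belonging to 'cpu' (hypothetical helper).
 * The caller supplies the CPU number; a real user-space implementation
 * would need some way to learn it and to cope with migration between
 * reading it and committing the update.
 */
static void percpu_counter_inc(unsigned cpu)
{
    assert(cpu < NR_SLOTS);

    uint64_t old = atomic_load_explicit(&slots[cpu].value,
                                        memory_order_relaxed);

    /* On failure, 'old' is refreshed with the current value and we retry. */
    while (!atomic_compare_exchange_weak_explicit(&slots[cpu].value,
                                                  &old, old + 1,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed)) {
        /* retry with the updated 'old' */
    }
}

/* Sum all slots; the result is only a snapshot if writers are active. */
static uint64_t percpu_counter_sum(void)
{
    uint64_t total = 0;

    for (unsigned cpu = 0; cpu < NR_SLOTS; cpu++)
        total += atomic_load_explicit(&slots[cpu].value,
                                      memory_order_relaxed);
    return total;
}
```

Because each slot is normally touched by only one CPU, the retry loop should rarely iterate. The mechanisms discussed here go further by trying to make the update safe without the locked instruction at all.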

Lai Jiangshan called out another disadvantage of a special code region, namely that all functions in that region must avoid invoking functions outside that region. However, he agreed that this restriction simplifies the scheduler hooks. Lai also noted that in-kernel application of these techniques could simplify NMI handlers.
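The trade-off Lai describes can be illustrated with a small sketch of the check a scheduler hook might perform. This is a simplified model under stated assumptions, not code from any of the patches: struct restart_region and resume_ip_after_preempt() are hypothetical names, and the region is modeled as a single half-open address range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the special code region: [start, end). */
struct restart_region {
    uintptr_t start;  /* first instruction of the region, also the restart point */
    uintptr_t end;    /* one past the last instruction of the region */
};

/* True if the interrupted instruction pointer lies inside the region. */
static bool ip_in_region(const struct restart_region *r, uintptr_t ip)
{
    assert(r->start <= r->end);
    return ip >= r->start && ip < r->end;
}

/*
 * Hypothetical hook body: if the task was preempted inside the region,
 * resume it at the start of the region; otherwise leave it alone.
 */
static uintptr_t resume_ip_after_preempt(const struct restart_region *r,
                                         uintptr_t ip)
{
    return ip_in_region(r, ip) ? r->start : ip;
}
```

In this model the hook is a single range comparison, which is the simplification Lai acknowledges. The cost is also visible: if code in the region called a helper located outside it, a preemption during that helper would see an instruction pointer outside the range, and the sequence would not be restarted.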