Rafael J. Wysocki: Energy conservation bias interfaces

May 6, 2014

Participants: Amit Kucheria, Dave Jones, Morten Rasmussen, Paul Gortmaker, Peter Zijlstra, Preeti U Murthy, and Sundar Iyer.

People tagged: Amit Kucheria, Daniel Lezcano, Ingo Molnar, Morten Rasmussen, and Peter Zijlstra.

Rafael J. Wysocki suggested an “energy conservation bias” that subsystems and device drivers could refer to when making power/performance tradeoffs. For example, different actions might be appropriate under low-battery conditions than when plugged into AC. Paul Gortmaker posted URLs for notes and the LWN summary for last year's session. Dave Jones expressed interest, saying that he felt that exporting actual CPU frequencies from cpufreq was a big mistake, and that if he were doing it again, he would probably try something like Rafael's suggestion. Dave would also like to have per-process rather than system-wide energy conservation decisions, and would prefer policy names over physical quantities such as CPU frequencies.

Preeti U Murthy expanded on Dave's suggestion of policy, setting out an example of a tuned profile for switching between high-performance and energy-conservation modes. Preeti also argued that a global “knob” does not suffice for cases where only some of the CPUs are needed by performance-critical tasks. Rafael generally agreed, but noted that defining the profiles and deciding when to switch between them will pose challenges. Rafael is also concerned because, in his experience, userspace often does not know enough to define and switch between various profiles. This is especially the case when several applications with differing performance and energy-efficiency characteristics are running. Finally, Rafael asked whether Preeti's approach is to start with the levels that are already provided, for example, by the cpufreq governors. Preeti confirmed that this is the case, and clarified that her approach was to define powersave, balanced, and performance profiles.

Sundar Iyer does not believe that the cpufreq approach is helpful, as it does not incorporate the relationship between performance and energy efficiency. Sundar also argued that things like Wi-Fi already do power management on a per-device basis, and that there is no strong reason to recast these as energy-saving methods. Additionally, Sundar wondered whether limiting power consumption will always improve energy efficiency, given the possible longer runtimes.

Peter Zijlstra argued that per-task and per-cgroup knobs make no sense because it is the hardware that consumes the power. He is OK with per-subsystem knobs, pointing out that he wants his graphics card to take the right action regardless of exactly which graphics card is present in his system. He does not like a global knob, as he would like to (for example) disable backlight dimming but aggressively reduce compute speed. Peter is skeptical about an energy-conservation sliding scale, noting that we haven't managed to get all-or-nothing working well yet.

Morten Rasmussen suggested a single global knob, so that if any task needs high performance, the system goes into high-performance mode. Userspace code would need to continuously track its requirements in order to properly influence this global setting. Morten believes that more than two or three discrete settings would be required due to the continuous nature of the power/performance tradeoff. Morten also argued that while it is the hardware that actually consumes power, it is tasks or groups of tasks with which performance requirements are associated. Peter replied that the performance requirements are QoS, which is related to but not the same as an energy-conservation bias. Morten agreed that both QoS and bias are required.

Rafael argued that global controls are sometimes necessary due to hardware dependencies among what appear to be independent pieces of the hardware. In particular, Rafael believes that events like “we are on battery now” require a global response. Although Rafael would prefer a sliding scale, if a binary 0/1 setting is the best that can be provided, that is what will be used.
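Morten's global-knob idea can be illustrated with a minimal sketch. Everything in it is hypothetical: the 0.0 to 1.0 bias scale (0.0 meaning maximum performance, 1.0 meaning maximum energy conservation), the function name `global_bias`, and the task names are assumptions made for illustration, not an interface proposed in the discussion. The only idea taken from the thread is that the most performance-hungry task determines the system-wide setting.

```python
def global_bias(task_biases, default=1.0):
    """Return the system-wide bias implied by per-task requests.

    Hypothetical scale: 0.0 = maximum performance, 1.0 = maximum energy
    conservation. The most demanding task (lowest value) wins, so a single
    task that needs high performance pulls the whole system toward it.
    """
    for name, bias in task_biases.items():
        if not 0.0 <= bias <= 1.0:
            raise ValueError(f"bias for {name!r} out of range: {bias}")
    # With no requests at all, fall back to the default (conserve energy).
    return min(task_biases.values(), default=default)


# Hypothetical tasks and their current requests.
requests = {"video-decode": 0.2, "backup": 0.9, "editor": 0.6}
print(global_bias(requests))  # 0.2
```

Because the result follows whichever task is most demanding at the moment, each task would have to update its entry whenever its needs change, which is the continuous tracking burden on userspace mentioned in the discussion.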

Sundar Iyer asked whether energy-conservation bias isn't entirely platform-dependent. He gave the example of “race to halt”, which translates into power conservation only given certain restricted power/performance characteristics. Sundar also asked about the possibility of a “just enough performance” metric, allowing optimization of the energy consumption subject to the performance constraints. Sundar also called out an analogy to thermal throttling, where temperature rather than energy-efficiency considerations can limit performance. In response to Peter's desire to avoid backlight dimming, Sundar suggested customized preferences, noting that the backlight usually consumes more power than the CPUs.

Preeti U Murthy suggested racing to idle within a power domain, pushing control across power domains only if the load exceeds a given threshold, but pushing more aggressively as performance becomes more important. Preeti believes that all-or-nothing performance-vs.-powersave choices are insufficient, and that balancing is sometimes required. Preeti also contrasted thermal throttling, which is intended to avoid permanent damage to the system, with energy efficiency, which is instead intended to avoid the temporary damage of an empty battery. Sundar Iyer replied that race to idle can sometimes hurt energy efficiency, asked whether Preeti is referring to generic power domains extending beyond the CPU, and reiterated the platform dependence of energy efficiency.

Morten Rasmussen agreed that race to idle is not a panacea, calling out very inefficient turbo modes as a case in point. Morten agreed that thermal throttling is different from energy efficiency, but believes that they can benefit from common mechanisms. Morten also reiterated his preference for making information available to the kernel as opposed to providing single knobs or small numbers of settings. Sundar Iyer agreed on the limitations of race to idle, calling out workloads that are offloaded or accelerated as being particularly ill-suited to race to idle. Sundar also called out a number of energy-conservation tradeoffs.
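The disagreement over race to idle comes down to simple energy arithmetic, which the following sketch works through. All of the numbers are invented for illustration and do not describe any real platform. The model itself (constant power while running, constant power while idle, a fixed time window) is also a simplifying assumption.

```python
def energy_joules(active_watts, active_seconds, idle_watts, window_seconds):
    """Energy used over a fixed window: run for a while, then sit idle."""
    if active_seconds > window_seconds:
        raise ValueError("work does not finish within the window")
    idle_seconds = window_seconds - active_seconds
    return active_watts * active_seconds + idle_watts * idle_seconds


WINDOW = 10.0  # seconds; hypothetical period in which the work must finish
IDLE = 0.5     # watts while idle (hypothetical)

# The same amount of work at three hypothetical operating points.
slow = energy_joules(2.0, 8.0, IDLE, WINDOW)   # low frequency, long runtime
fast = energy_joules(3.0, 4.0, IDLE, WINDOW)   # race to idle pays off
turbo = energy_joules(8.0, 3.0, IDLE, WINDOW)  # inefficient turbo mode

print(slow, fast, turbo)  # 17.0 15.0 27.5
```

With these made-up numbers, racing to idle at the "fast" point beats running slowly, but the turbo point costs more energy than either alternative, which is the kind of case Morten and Sundar describe. Changing the idle power or the efficiency of the faster operating points can flip the outcome, which is Sundar's point about platform dependence.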

Amit Kucheria argued that energy efficiency is a multi-level problem that depends on the hardware and the workload, and that there will always be corner cases that will break carefully crafted policies. In addition, Amit questioned whether we want things like copying to and from memory sticks to slow down just because the system is on battery power. Amit also asked whether we could leverage the existing generic power domains, runtime PM, and pm-qos, and then process all this input from a centralized place, calling out Android power HAL, GNOME Power Manager, and tuned as existing middleware power managers. Finally, Amit called out permissions (who gets to set the policy?) and consistency (per-CPU? per-cluster?).

Morten Rasmussen argued that the middleware power managers should be the ones with the power-policy-change privileges. Morten also argued that it is important to distinguish between techniques that change behavior on the one hand and optimization goals on the other. Finally, Morten advocated giving the kernel enough information about hardware and workload to figure out how to optimize towards the goal, rather than mandating specific techniques in specific circumstances.