Kyle Swenson | 8d8f654 | 2021-03-15 11:02:55 -0600

Power Capping Framework
==================================

The power capping framework provides a consistent interface between the kernel
and user space that allows power capping drivers to expose their settings to
user space in a uniform way.

Terminology
=========================
The framework exposes power capping devices to user space via sysfs in the
form of a tree of objects. The objects at the root level of the tree represent
'control types', which correspond to different methods of power capping. For
example, the 'intel-rapl' control type represents the Intel "Running Average
Power Limit" (RAPL) technology, whereas the 'idle-injection' control type
corresponds to the use of idle injection for controlling power.

Power zones represent different parts of the system, which can be controlled and
monitored using the power capping method determined by the control type the
given zone belongs to. They each contain attributes for monitoring power, as
well as controls represented in the form of power constraints. If the parts of
the system represented by different power zones are hierarchical (that is, one
bigger part consists of multiple smaller parts that each have their own power
controls), those power zones may also be organized in a hierarchy with one
parent power zone containing multiple subzones and so on to reflect the power
control topology of the system. In that case, power capping can be applied to
a set of devices together through the parent power zone, and finer-grained
control can be applied through the subzones.


Example sysfs interface tree:

/sys/devices/virtual/powercap
└── intel-rapl
    ├── intel-rapl:0
    │   ├── constraint_0_name
    │   ├── constraint_0_power_limit_uw
    │   ├── constraint_0_time_window_us
    │   ├── constraint_1_name
    │   ├── constraint_1_power_limit_uw
    │   ├── constraint_1_time_window_us
    │   ├── device -> ../../intel-rapl
    │   ├── energy_uj
    │   ├── intel-rapl:0:0
    │   │   ├── constraint_0_name
    │   │   ├── constraint_0_power_limit_uw
    │   │   ├── constraint_0_time_window_us
    │   │   ├── constraint_1_name
    │   │   ├── constraint_1_power_limit_uw
    │   │   ├── constraint_1_time_window_us
    │   │   ├── device -> ../../intel-rapl:0
    │   │   ├── energy_uj
    │   │   ├── max_energy_range_uj
    │   │   ├── name
    │   │   ├── enabled
    │   │   ├── power
    │   │   │   ├── async
    │   │   │   []
    │   │   ├── subsystem -> ../../../../../../class/powercap
    │   │   └── uevent
    │   ├── intel-rapl:0:1
    │   │   ├── constraint_0_name
    │   │   ├── constraint_0_power_limit_uw
    │   │   ├── constraint_0_time_window_us
    │   │   ├── constraint_1_name
    │   │   ├── constraint_1_power_limit_uw
    │   │   ├── constraint_1_time_window_us
    │   │   ├── device -> ../../intel-rapl:0
    │   │   ├── energy_uj
    │   │   ├── max_energy_range_uj
    │   │   ├── name
    │   │   ├── enabled
    │   │   ├── power
    │   │   │   ├── async
    │   │   │   []
    │   │   ├── subsystem -> ../../../../../../class/powercap
    │   │   └── uevent
    │   ├── max_energy_range_uj
    │   ├── max_power_range_uw
    │   ├── name
    │   ├── enabled
    │   ├── power
    │   │   ├── async
    │   │   []
    │   ├── subsystem -> ../../../../../class/powercap
    │   └── uevent
    ├── intel-rapl:1
    │   ├── constraint_0_name
    │   ├── constraint_0_power_limit_uw
    │   ├── constraint_0_time_window_us
    │   ├── constraint_1_name
    │   ├── constraint_1_power_limit_uw
    │   ├── constraint_1_time_window_us
    │   ├── device -> ../../intel-rapl
    │   ├── energy_uj
    │   ├── intel-rapl:1:0
    │   │   ├── constraint_0_name
    │   │   ├── constraint_0_power_limit_uw
    │   │   ├── constraint_0_time_window_us
    │   │   ├── constraint_1_name
    │   │   ├── constraint_1_power_limit_uw
    │   │   ├── constraint_1_time_window_us
    │   │   ├── device -> ../../intel-rapl:1
    │   │   ├── energy_uj
    │   │   ├── max_energy_range_uj
    │   │   ├── name
    │   │   ├── enabled
    │   │   ├── power
    │   │   │   ├── async
    │   │   │   []
    │   │   ├── subsystem -> ../../../../../../class/powercap
    │   │   └── uevent
    │   ├── intel-rapl:1:1
    │   │   ├── constraint_0_name
    │   │   ├── constraint_0_power_limit_uw
    │   │   ├── constraint_0_time_window_us
    │   │   ├── constraint_1_name
    │   │   ├── constraint_1_power_limit_uw
    │   │   ├── constraint_1_time_window_us
    │   │   ├── device -> ../../intel-rapl:1
    │   │   ├── energy_uj
    │   │   ├── max_energy_range_uj
    │   │   ├── name
    │   │   ├── enabled
    │   │   ├── power
    │   │   │   ├── async
    │   │   │   []
    │   │   ├── subsystem -> ../../../../../../class/powercap
    │   │   └── uevent
    │   ├── max_energy_range_uj
    │   ├── max_power_range_uw
    │   ├── name
    │   ├── enabled
    │   ├── power
    │   │   ├── async
    │   │   []
    │   ├── subsystem -> ../../../../../class/powercap
    │   └── uevent
    ├── power
    │   ├── async
    │   []
    ├── subsystem -> ../../../../class/powercap
    ├── enabled
    └── uevent

The above example illustrates a case in which the Intel RAPL technology,
available in Intel® 64 and IA-32 Processor Architectures, is used. There is one
control type called intel-rapl which contains two power zones, intel-rapl:0 and
intel-rapl:1, representing CPU packages. Each of these power zones contains
two subzones, intel-rapl:j:0 and intel-rapl:j:1 (j = 0, 1), representing the
"core" and the "uncore" parts of the given CPU package, respectively. All of
the zones and subzones contain energy monitoring attributes (energy_uj,
max_energy_range_uj) and constraint attributes (constraint_*) allowing controls
to be applied. The constraints in the 'package' power zones apply to the whole
CPU packages, and the subzone constraints only apply to the respective parts of
the given package individually. Since Intel RAPL does not provide an
instantaneous power value, there is no power_uw attribute.
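
The zone names in the example encode the position of each zone in the
hierarchy. The following Python sketch splits such a name into its parts. It
assumes the "control-type:i:j" naming pattern seen in the intel-rapl example
above, which other control types are not guaranteed to follow, and the helper
name parse_zone_id is hypothetical.

```python
def parse_zone_id(zone_id):
    """Split a zone directory name such as "intel-rapl:0:1".

    Returns (control_type, index_path, parent_id). The naming pattern is
    inferred from the intel-rapl example and is an assumption, not a rule.
    """
    control_type, *indices = zone_id.split(":")
    index_path = tuple(int(i) for i in indices)
    # The parent is the same name with the last index removed; a bare
    # control type name has no parent.
    parent_id = ":".join([control_type, *indices[:-1]]) if indices else None
    return control_type, index_path, parent_id
```

For instance, parse_zone_id("intel-rapl:0:1") identifies intel-rapl:0 as the
parent zone, matching the package/subzone layout described above.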

In addition to that, each power zone contains a name attribute, allowing the
part of the system represented by that zone to be identified.
For example:

    cat /sys/class/powercap/intel-rapl/intel-rapl:0/name
    package-0
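
The same lookup can be done for every zone at once. The sketch below walks a
control type directory and collects the name attribute of each zone and
subzone. The function name list_zones is hypothetical, and the directory to
pass in (for example /sys/class/powercap/intel-rapl) is an assumption about
where the control type is exposed on a given system.

```python
import os


def list_zones(control_type_dir):
    """Map each zone path (relative to the control type) to its name."""
    control_type_dir = os.fspath(control_type_dir)
    zones = {}
    # os.walk does not follow symlinks by default, so the "device" and
    # "subsystem" links in each zone are not traversed.
    for dirpath, dirnames, filenames in os.walk(control_type_dir):
        dirnames.sort()  # visit zones in a stable order
        # Treat any subdirectory that has a "name" attribute as a power zone.
        if dirpath != control_type_dir and "name" in filenames:
            with open(os.path.join(dirpath, "name")) as f:
                rel = os.path.relpath(dirpath, control_type_dir)
                zones[rel] = f.read().strip()
    return zones
```

On the example system above, the result would contain one entry per package
zone and one per subzone, keyed by paths such as "intel-rapl:0/intel-rapl:0:0".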

The Intel RAPL technology allows two constraints, short term and long term,
with two different time windows to be applied to each power zone. Thus, for
each zone there are two attributes representing the constraint names, two
power limits, and two attributes representing the sizes of the time windows.
The constraint_j_* attributes correspond to the j-th constraint (j = 0, 1).

For example:
    constraint_0_name
    constraint_0_power_limit_uw
    constraint_0_time_window_us
    constraint_1_name
    constraint_1_power_limit_uw
    constraint_1_time_window_us
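
Because the constraint index is part of each file name, the attributes of a
zone can be grouped per constraint with a simple pattern match. The following
is a minimal sketch, not a reference implementation; the function name
read_constraints is hypothetical, and it assumes every constraint attribute
other than the name holds a plain integer.

```python
import os
import re

# Matches constraint_<j>_<field>, e.g. constraint_0_power_limit_uw.
_CONSTRAINT_RE = re.compile(r"^constraint_(\d+)_(.+)$")


def read_constraints(zone_dir):
    """Group a zone's constraint_j_* attributes by constraint index j."""
    constraints = {}
    for entry in sorted(os.listdir(zone_dir)):
        match = _CONSTRAINT_RE.match(entry)
        if not match:
            continue  # not a constraint attribute (energy_uj, name, ...)
        index, field = int(match.group(1)), match.group(2)
        with open(os.path.join(zone_dir, entry)) as f:
            value = f.read().strip()
        # Assumption: only the name is text; all other fields are integers.
        constraints.setdefault(index, {})[field] = (
            value if field == "name" else int(value)
        )
    return constraints
```

For a RAPL zone this would typically yield two entries, keyed 0 and 1, each
holding the name, power limit and time window of one constraint.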

Power Zone Attributes
=================================
Monitoring attributes
----------------------

energy_uj (rw): Current energy counter in microjoules. Write "0" to reset.
If the counter cannot be reset, then this attribute is read-only.

max_energy_range_uj (ro): Range of the above energy counter in microjoules.

power_uw (ro): Current power in microwatts.

max_power_range_uw (ro): Range of the above power value in microwatts.

name (ro): Name of this power zone.

Some domains may have both power ranges and energy counter ranges;
however, only one of the two is mandatory.
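
When a zone offers only the energy counter, as in the RAPL example, average
power over an interval can be derived from two energy_uj readings. The sketch
below shows the arithmetic. It assumes that the counter wraps back to zero
after reaching max_energy_range_uj and that it wraps at most once between the
two readings; both are assumptions rather than guarantees stated here, and the
helper names are hypothetical. Reading energy_uj may require elevated
privileges on some systems.

```python
import os


def read_energy_uj(zone_dir):
    """Read the current energy counter of a zone, in microjoules."""
    with open(os.path.join(zone_dir, "energy_uj")) as f:
        return int(f.read().strip())


def energy_delta_uj(before_uj, after_uj, max_energy_range_uj):
    """Energy consumed between two readings, allowing for one wraparound."""
    if after_uj >= before_uj:
        return after_uj - before_uj
    # Assumed wraparound: count up to the range limit, then from zero.
    return (max_energy_range_uj - before_uj) + after_uj


def average_power_uw(delta_uj, interval_s):
    """Microjoules per second are microwatts."""
    return delta_uj / interval_s
```

A caller would read energy_uj twice, a known number of seconds apart, and pass
the difference and the interval to average_power_uw().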

Constraints
----------------
constraint_X_power_limit_uw (rw): Power limit in microwatts, which should be
applicable for the time window specified by "constraint_X_time_window_us".

constraint_X_time_window_us (rw): Time window in microseconds.

constraint_X_name (ro): An optional name of the constraint.

constraint_X_max_power_uw (ro): Maximum allowed power in microwatts.

constraint_X_min_power_uw (ro): Minimum allowed power in microwatts.

constraint_X_max_time_window_us (ro): Maximum allowed time window in microseconds.

constraint_X_min_time_window_us (ro): Minimum allowed time window in microseconds.

All fields other than power_limit_uw and time_window_us are optional.
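
Since the min/max attributes are optional, user-space code that sets a limit
has to cope with their absence. The following sketch checks a requested limit
against whichever bounds exist before writing it. The function name
set_power_limit is hypothetical, the bounds check is a convenience of this
example rather than something the framework requires, and writing the
attribute normally needs root privileges.

```python
import os


def _read_optional_int(path):
    """Return the integer stored at path, or None if it is absent or unparsable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return None


def set_power_limit(zone_dir, index, limit_uw):
    """Write constraint_<index>_power_limit_uw after checking optional bounds."""
    prefix = os.path.join(zone_dir, f"constraint_{index}_")
    lowest = _read_optional_int(prefix + "min_power_uw")
    highest = _read_optional_int(prefix + "max_power_uw")
    if lowest is not None and limit_uw < lowest:
        raise ValueError(f"{limit_uw} uW is below the minimum of {lowest} uW")
    if highest is not None and limit_uw > highest:
        raise ValueError(f"{limit_uw} uW is above the maximum of {highest} uW")
    with open(prefix + "power_limit_uw", "w") as f:
        f.write(str(limit_uw))
```

The same pattern would apply to constraint_X_time_window_us and its optional
min/max time window attributes.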

Common zone and control type attributes
----------------------------------------
enabled (rw): Enable/disable controls at the zone level or for all zones using
a control type.

Power Cap Client Driver Interface
==================================
The API summary:

Call powercap_register_control_type() to register a control type object.
Call powercap_register_zone() to register a power zone (under a given
control type), either as a top-level power zone or as a subzone of another
power zone registered earlier.
The number of constraints in a power zone and the corresponding callbacks have
to be defined prior to calling powercap_register_zone() to register that zone.

To free a power zone, call powercap_unregister_zone().
To free a control type object, call powercap_unregister_control_type().
Detailed API documentation can be generated by running kernel-doc on
include/linux/powercap.h.