Merge tag 'pci-v5.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

author Linus Torvalds <torvalds@linux-foundation.org>

Tue, 16 Jul 2019 03:44:49 +0000 (20:44 -0700)

committer Linus Torvalds <torvalds@linux-foundation.org>

Tue, 16 Jul 2019 03:44:49 +0000 (20:44 -0700)
diff --cc Documentation/ABI/testing/sysfs-class-powercap
Simple merge
diff --cc Documentation/admin-guide/kernel-parameters.txt
Simple merge
diff --cc Documentation/index.rst

index 71a77feb779b6a7a104b3d59d4163f409b909ebf,4afa431d9b1f8c1301f547f21f141db08aa45b36..00a0fe4241a4239118805b85f26a4bce0a9bd40b
--- 1/Documentation/index.rst
--- 2/Documentation/index.rst
+++ b/Documentation/index.rst
@@@ -102,7 -101,7 +102,8 @@@ needed)
      filesystems/index
      vm/index
      bpf/index
+ +   usb/index
+    PCI/index
      misc-devices/index
   
   Architecture-specific documentation
diff --cc Documentation/scheduler/sched-energy.rst

index fce5858c90827caedd2804cbb7bf98ba30a5812b,0000000000000000000000000000000000000000..9580c57a52bc3316508f5cf9e68ea84f0945f46a

mode 100644,000000..100644
--- 1/Documentation/scheduler/sched-energy.rst
--- /dev/null
+++ b/Documentation/scheduler/sched-energy.rst
@@@ -1,430 -1,0 +1,430 @@@
- please refer to its documentation (see Documentation/power/energy-model.txt).
+ +=======================
+ +Energy Aware Scheduling
+ +=======================
+ +
+ +1. Introduction
+ +---------------
+ +
+ +Energy Aware Scheduling (or EAS) gives the scheduler the ability to predict
+ +the impact of its decisions on the energy consumed by CPUs. EAS relies on an
+ +Energy Model (EM) of the CPUs to select an energy efficient CPU for each task,
+ +with a minimal impact on throughput. This document aims at providing an
+ +introduction on how EAS works, what are the main design decisions behind it, and
+ +details what is needed to get it to run.
+ +
+ +Before going any further, please note that at the time of writing::
+ +
+ +   /!\ EAS does not support platforms with symmetric CPU topologies /!\
+ +
+ +EAS operates only on heterogeneous CPU topologies (such as Arm big.LITTLE)
+ +because this is where the potential for saving energy through scheduling is
+ +the highest.
+ +
+ +The actual EM used by EAS is _not_ maintained by the scheduler, but by a
+ +dedicated framework. For details about this framework and what it provides,
- per 'performance domain' in the system (see Documentation/power/energy-model.txt
++please refer to its documentation (see Documentation/power/energy-model.rst).
+ +
+ +
+ +2. Background and Terminology
+ +-----------------------------
+ +
+ +To make it clear from the start:
+ + - energy = [joule] (resource like a battery on powered devices)
+ + - power = energy/time = [joule/second] = [watt]
+ +
+ +The goal of EAS is to minimize energy, while still getting the job done. That
+ +is, we want to maximize::
+ +
+ +      performance [inst/s]
+ +      --------------------
+ +          power [W]
+ +
+ +which is equivalent to minimizing::
+ +
+ +      energy [J]
+ +      -----------
+ +      instruction
+ +
+ +while still getting 'good' performance. It is essentially an alternative
+ +optimization objective to the current performance-only objective for the
+ +scheduler. This alternative considers two objectives: energy-efficiency and
+ +performance.
+ +
+ +The idea behind introducing an EM is to allow the scheduler to evaluate the
+ +implications of its decisions rather than blindly applying energy-saving
+ +techniques that may have positive effects only on some platforms. At the same
+ +time, the EM must be as simple as possible to minimize the scheduler latency
+ +impact.
+ +
+ +In short, EAS changes the way CFS tasks are assigned to CPUs. When it is time
+ +for the scheduler to decide where a task should run (during wake-up), the EM
+ +is used to break the tie between several good CPU candidates and pick the one
+ +that is predicted to yield the best energy consumption without harming the
+ +system's throughput. The predictions made by EAS rely on specific elements of
+ +knowledge about the platform's topology, which include the 'capacity' of CPUs,
+ +and their respective energy costs.
+ +
+ +
+ +3. Topology information
+ +-----------------------
+ +
+ +EAS (as well as the rest of the scheduler) uses the notion of 'capacity' to
+ +differentiate CPUs with different computing throughput. The 'capacity' of a CPU
+ +represents the amount of work it can absorb when running at its highest
+ +frequency compared to the most capable CPU of the system. Capacity values are
+ +normalized in a 1024 range, and are comparable with the utilization signals of
+ +tasks and CPUs computed by the Per-Entity Load Tracking (PELT) mechanism. Thanks
+ +to capacity and utilization values, EAS is able to estimate how big/busy a
+ +task/CPU is, and to take this into consideration when evaluating performance vs
+ +energy trade-offs. The capacity of CPUs is provided via arch-specific code
+ +through the arch_scale_cpu_capacity() callback.
+ +
+ +The rest of platform knowledge used by EAS is directly read from the Energy
+ +Model (EM) framework. The EM of a platform is composed of a power cost table
- independent EM framework in Documentation/power/energy-model.txt.
++per 'performance domain' in the system (see Documentation/power/energy-model.rst
+ +for further details about performance domains).
+ +
+ +The scheduler manages references to the EM objects in the topology code when the
+ +scheduling domains are built, or re-built. For each root domain (rd), the
+ +scheduler maintains a singly linked list of all performance domains intersecting
+ +the current rd->span. Each node in the list contains a pointer to a struct
+ +em_perf_domain as provided by the EM framework.
+ +
+ +The lists are attached to the root domains in order to cope with exclusive
+ +cpuset configurations. Since the boundaries of exclusive cpusets do not
+ +necessarily match those of performance domains, the lists of different root
+ +domains can contain duplicate elements.
+ +
+ +Example 1.
+ +    Let us consider a platform with 12 CPUs, split in 3 performance domains
+ +    (pd0, pd4 and pd8), organized as follows::
+ +
+ +                CPUs:   0 1 2 3 4 5 6 7 8 9 10 11
+ +                PDs:   |--pd0--|--pd4--|---pd8---|
+ +                RDs:   |----rd1----|-----rd2-----|
+ +
+ +    Now, consider that userspace decided to split the system with two
+ +    exclusive cpusets, hence creating two independent root domains, each
+ +    containing 6 CPUs. The two root domains are denoted rd1 and rd2 in the
+ +    above figure. Since pd4 intersects with both rd1 and rd2, it will be
+ +    present in the linked list '->pd' attached to each of them:
+ +
+ +       * rd1->pd: pd0 -> pd4
+ +       * rd2->pd: pd4 -> pd8
+ +
+ +    Please note that the scheduler will create two duplicate list nodes for
+ +    pd4 (one for each list). However, both just hold a pointer to the same
+ +    shared data structure of the EM framework.
+ +
+ +Since the access to these lists can happen concurrently with hotplug and other
+ +things, they are protected by RCU, like the rest of topology structures
+ +manipulated by the scheduler.
+ +
+ +EAS also maintains a static key (sched_energy_present) which is enabled when at
+ +least one root domain meets all conditions for EAS to start. Those conditions
+ +are summarized in Section 6.
+ +
+ +
+ +4. Energy-Aware task placement
+ +------------------------------
+ +
+ +EAS overrides the CFS task wake-up balancing code. It uses the EM of the
+ +platform and the PELT signals to choose an energy-efficient target CPU during
+ +wake-up balance. When EAS is enabled, select_task_rq_fair() calls
+ +find_energy_efficient_cpu() to do the placement decision. This function looks
+ +for the CPU with the highest spare capacity (CPU capacity - CPU utilization) in
+ +each performance domain since it is the one which will allow us to keep the
+ +frequency the lowest. Then, the function checks if placing the task there could
+ +save energy compared to leaving it on prev_cpu, i.e. the CPU where the task ran
+ +in its previous activation.
+ +
+ +find_energy_efficient_cpu() uses compute_energy() to estimate what will be the
+ +energy consumed by the system if the waking task was migrated. compute_energy()
+ +looks at the current utilization landscape of the CPUs and adjusts it to
+ +'simulate' the task migration. The EM framework provides the em_pd_energy() API
+ +which computes the expected energy consumption of each performance domain for
+ +the given utilization landscape.
+ +
+ +An example of energy-optimized task placement decision is detailed below.
+ +
+ +Example 2.
+ +    Let us consider a (fake) platform with 2 independent performance domains
+ +    composed of two CPUs each. CPU0 and CPU1 are little CPUs; CPU2 and CPU3
+ +    are big.
+ +
+ +    The scheduler must decide where to place a task P whose util_avg = 200
+ +    and prev_cpu = 0.
+ +
+ +    The current utilization landscape of the CPUs is depicted on the graph
+ +    below. CPUs 0-3 have a util_avg of 400, 100, 600 and 500 respectively.
+ +    Each performance domain has three Operating Performance Points (OPPs).
+ +    The CPU capacity and power cost associated with each OPP is listed in
+ +    the Energy Model table. The util_avg of P is shown on the figures
+ +    below as 'PP'::
+ +
+ +     CPU util.
+ +      1024                 - - - - - - -              Energy Model
+ +                                               +-----------+-------------+
+ +                                               |  Little   |     Big     |
+ +       768                 =============       +-----+-----+------+------+
+ +                                               | Cap | Pwr | Cap  | Pwr  |
+ +                                               +-----+-----+------+------+
+ +       512  ===========    - ##- - - - -       | 170 | 50  | 512  | 400  |
+ +                             ##     ##         | 341 | 150 | 768  | 800  |
+ +       341  -PP - - - -      ##     ##         | 512 | 300 | 1024 | 1700 |
+ +             PP              ##     ##         +-----+-----+------+------+
+ +       170  -## - - - -      ##     ##
+ +             ##     ##       ##     ##
+ +           ------------    -------------
+ +            CPU0   CPU1     CPU2   CPU3
+ +
+ +      Current OPP: =====       Other OPP: - - -     util_avg (100 each): ##
+ +
+ +
+ +    find_energy_efficient_cpu() will first look for the CPUs with the
+ +    maximum spare capacity in the two performance domains. In this example,
+ +    CPU1 and CPU3. Then it will estimate the energy of the system if P was
+ +    placed on either of them, and check if that would save some energy
+ +    compared to leaving P on CPU0. EAS assumes that OPPs follow utilization
+ +    (which is coherent with the behaviour of the schedutil CPUFreq
+ +    governor, see Section 6. for more details on this topic).
+ +
+ +    **Case 1. P is migrated to CPU1**::
+ +
+ +      1024                 - - - - - - -
+ +
+ +                                            Energy calculation:
+ +       768                 =============     * CPU0: 200 / 341 * 150 = 88
+ +                                             * CPU1: 300 / 341 * 150 = 131
+ +                                             * CPU2: 600 / 768 * 800 = 625
+ +       512  - - - - - -    - ##- - - - -     * CPU3: 500 / 768 * 800 = 520
+ +                             ##     ##          => total_energy = 1364
+ +       341  ===========      ##     ##
+ +                    PP       ##     ##
+ +       170  -## - - PP-      ##     ##
+ +             ##     ##       ##     ##
+ +           ------------    -------------
+ +            CPU0   CPU1     CPU2   CPU3
+ +
+ +
+ +    **Case 2. P is migrated to CPU3**::
+ +
+ +      1024                 - - - - - - -
+ +
+ +                                            Energy calculation:
+ +       768                 =============     * CPU0: 200 / 341 * 150 = 88
+ +                                             * CPU1: 100 / 341 * 150 = 43
+ +                                    PP       * CPU2: 600 / 768 * 800 = 625
+ +       512  - - - - - -    - ##- - -PP -     * CPU3: 700 / 768 * 800 = 729
+ +                             ##     ##          => total_energy = 1485
+ +       341  ===========      ##     ##
+ +                             ##     ##
+ +       170  -## - - - -      ##     ##
+ +             ##     ##       ##     ##
+ +           ------------    -------------
+ +            CPU0   CPU1     CPU2   CPU3
+ +
+ +
+ +    **Case 3. P stays on prev_cpu / CPU 0**::
+ +
+ +      1024                 - - - - - - -
+ +
+ +                                            Energy calculation:
+ +       768                 =============     * CPU0: 400 / 512 * 300 = 234
+ +                                             * CPU1: 100 / 512 * 300 = 58
+ +                                             * CPU2: 600 / 768 * 800 = 625
+ +       512  ===========    - ##- - - - -     * CPU3: 500 / 768 * 800 = 520
+ +                             ##     ##          => total_energy = 1437
+ +       341  -PP - - - -      ##     ##
+ +             PP              ##     ##
+ +       170  -## - - - -      ##     ##
+ +             ##     ##       ##     ##
+ +           ------------    -------------
+ +            CPU0   CPU1     CPU2   CPU3
+ +
+ +
+ +    From these calculations, Case 1 has the lowest total energy. So CPU 1
+ +    is the best candidate from an energy-efficiency standpoint.
+ +
+ +Big CPUs are generally more power hungry than the little ones and are thus used
+ +mainly when a task doesn't fit the littles. However, little CPUs aren't always
+ +necessarily more energy-efficient than big CPUs. For some systems, the high OPPs
+ +of the little CPUs can be less energy-efficient than the lowest OPPs of the
+ +bigs, for example. So, if the little CPUs happen to have enough utilization at
+ +a specific point in time, a small task waking up at that moment could be better
+ +off executing on the big side in order to save energy, even though it would fit
+ +on the little side.
+ +
+ +And even in the case where all OPPs of the big CPUs are less energy-efficient
+ +than those of the little, using the big CPUs for a small task might still, under
+ +specific conditions, save energy. Indeed, placing a task on a little CPU can
+ +result in raising the OPP of the entire performance domain, and that will
+ +increase the cost of the tasks already running there. If the waking task is
+ +placed on a big CPU, its own execution cost might be higher than if it was
+ +running on a little, but it won't impact the other tasks of the little CPUs
+ +which will keep running at a lower OPP. So, when considering the total energy
+ +consumed by CPUs, the extra cost of running that one task on a big core can be
+ +smaller than the cost of raising the OPP on the little CPUs for all the other
+ +tasks.
+ +
+ +The examples above would be nearly impossible to get right in a generic way, and
+ +for all platforms, without knowing the cost of running at different OPPs on all
+ +CPUs of the system. Thanks to its EM-based design, EAS should cope with them
+ +correctly without too many troubles. However, in order to ensure a minimal
+ +impact on throughput for high-utilization scenarios, EAS also implements another
+ +mechanism called 'over-utilization'.
+ +
+ +
+ +5. Over-utilization
+ +-------------------
+ +
+ +From a general standpoint, the use-cases where EAS can help the most are those
+ +involving a light/medium CPU utilization. Whenever long CPU-bound tasks are
+ +being run, they will require all of the available CPU capacity, and there isn't
+ +much that can be done by the scheduler to save energy without severely harming
+ +throughput. In order to avoid hurting performance with EAS, CPUs are flagged as
+ +'over-utilized' as soon as they are used at more than 80% of their compute
+ +capacity. As long as no CPUs are over-utilized in a root domain, load balancing
+ +is disabled and EAS overrides the wake-up balancing code. EAS is likely to load
+ +the most energy efficient CPUs of the system more than the others if that can be
+ +done without harming throughput. So, the load-balancer is disabled to prevent
+ +it from breaking the energy-efficient task placement found by EAS. It is safe to
+ +do so when the system isn't overutilized since being below the 80% tipping point
+ +implies that:
+ +
+ +    a. there is some idle time on all CPUs, so the utilization signals used by
+ +       EAS are likely to accurately represent the 'size' of the various tasks
+ +       in the system;
+ +    b. all tasks should already be provided with enough CPU capacity,
+ +       regardless of their nice values;
+ +    c. since there is spare capacity all tasks must be blocking/sleeping
+ +       regularly and balancing at wake-up is sufficient.
+ +
+ +As soon as one CPU goes above the 80% tipping point, at least one of the three
+ +assumptions above becomes incorrect. In this scenario, the 'overutilized' flag
+ +is raised for the entire root domain, EAS is disabled, and the load-balancer is
+ +re-enabled. By doing so, the scheduler falls back onto load-based algorithms for
+ +wake-up and load balance under CPU-bound conditions. This provides a better
+ +respect of the nice values of tasks.
+ +
+ +Since the notion of overutilization largely relies on detecting whether or not
+ +there is some idle time in the system, the CPU capacity 'stolen' by higher
+ +(than CFS) scheduling classes (as well as IRQ) must be taken into account. As
+ +such, the detection of overutilization accounts for the capacity used not only
+ +by CFS tasks, but also by the other scheduling classes and IRQ.
+ +
+ +
+ +6. Dependencies and requirements for EAS
+ +----------------------------------------
+ +
+ +Energy Aware Scheduling depends on the CPUs of the system having specific
+ +hardware properties and on other features of the kernel being enabled. This
+ +section lists these dependencies and provides hints as to how they can be met.
+ +
+ +
+ +6.1 - Asymmetric CPU topology
+ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ +
+ +
+ +As mentioned in the introduction, EAS is only supported on platforms with
+ +asymmetric CPU topologies for now. This requirement is checked at run-time by
+ +looking for the presence of the SD_ASYM_CPUCAPACITY flag when the scheduling
+ +domains are built.
+ +
+ +The flag is set/cleared automatically by the scheduler topology code whenever
+ +there are CPUs with different capacities in a root domain. The capacities of
+ +CPUs are provided by arch-specific code through the arch_scale_cpu_capacity()
+ +callback. As an example, arm and arm64 share an implementation of this callback
+ +which uses a combination of CPUFreq data and device-tree bindings to compute the
+ +capacity of CPUs (see drivers/base/arch_topology.c for more details).
+ +
+ +So, in order to use EAS on your platform your architecture must implement the
+ +arch_scale_cpu_capacity() callback, and some of the CPUs must have a lower
+ +capacity than others.
+ +
+ +Please note that EAS is not fundamentally incompatible with SMP, but no
+ +significant savings on SMP platforms have been observed yet. This restriction
+ +could be amended in the future if proven otherwise.
+ +
+ +
+ +6.2 - Energy Model presence
+ +^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ +
+ +EAS uses the EM of a platform to estimate the impact of scheduling decisions on
+ +energy. So, your platform must provide power cost tables to the EM framework in
+ +order to make EAS start. To do so, please refer to documentation of the
++independent EM framework in Documentation/power/energy-model.rst.
+ +
+ +Please also note that the scheduling domains need to be re-built after the
+ +EM has been registered in order to start EAS.
+ +
+ +
+ +6.3 - Energy Model complexity
+ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ +
+ +The task wake-up path is very latency-sensitive. When the EM of a platform is
+ +too complex (too many CPUs, too many performance domains, too many performance
+ +states, ...), the cost of using it in the wake-up path can become prohibitive.
+ +The energy-aware wake-up algorithm has a complexity of:
+ +
+ +      C = Nd * (Nc + Ns)
+ +
+ +with: Nd the number of performance domains; Nc the number of CPUs; and Ns the
+ +total number of OPPs (ex: for two perf. domains with 4 OPPs each, Ns = 8).
+ +
+ +A complexity check is performed at the root domain level, when scheduling
+ +domains are built. EAS will not start on a root domain if its C happens to be
+ +higher than the completely arbitrary EM_MAX_COMPLEXITY threshold (2048 at the
+ +time of writing).
+ +
+ +If you really want to use EAS but the complexity of your platform's Energy
+ +Model is too high to be used with a single root domain, you're left with only
+ +two possible options:
+ +
+ +    1. split your system into separate, smaller, root domains using exclusive
+ +       cpusets and enable EAS locally on each of them. This option has the
+ +       benefit to work out of the box but the drawback of preventing load
+ +       balance between root domains, which can result in an unbalanced system
+ +       overall;
+ +    2. submit patches to reduce the complexity of the EAS wake-up algorithm,
+ +       hence enabling it to cope with larger EMs in reasonable time.
+ +
+ +
+ +6.4 - Schedutil governor
+ +^^^^^^^^^^^^^^^^^^^^^^^^
+ +
+ +EAS tries to predict at which OPP the CPUs will be running in the near future
+ +in order to estimate their energy consumption. To do so, it is assumed that OPPs
+ +of CPUs follow their utilization.
+ +
+ +Although it is very difficult to provide hard guarantees regarding the accuracy
+ +of this assumption in practice (because the hardware might not do what it is
+ +told to do, for example), schedutil as opposed to other CPUFreq governors at
+ +least _requests_ frequencies calculated using the utilization signals.
+ +Consequently, the only sane governor to use together with EAS is schedutil,
+ +because it is the only one providing some degree of consistency between
+ +frequency requests and energy predictions.
+ +
+ +Using EAS with any other governor than schedutil is not supported.
+ +
+ +
+ +6.5 Scale-invariant utilization signals
+ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ +
+ +In order to make accurate prediction across CPUs and for all performance
+ +states, EAS needs frequency-invariant and CPU-invariant PELT signals. These can
+ +be obtained using the architecture-defined arch_scale_{cpu,freq}_capacity()
+ +callbacks.
+ +
+ +Using EAS on a platform that doesn't implement these two callbacks is not
+ +supported.
+ +
+ +
+ +6.6 Multithreading (SMT)
+ +^^^^^^^^^^^^^^^^^^^^^^^^
+ +
+ +EAS in its current form is SMT unaware and is not able to leverage
+ +multithreaded hardware to save energy. EAS considers threads as independent
+ +CPUs, which can actually be counter-productive for both performance and energy.
+ +
+ +EAS on SMT is not supported.
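
The energy comparison in Example 2 of sched-energy.rst can be reproduced with a small user-space sketch. It is not the kernel's compute_energy() or em_pd_energy(): struct opp, struct perf_domain, pick_opp(), domain_energy() and energy_if_placed() are hypothetical names introduced here for illustration. The sketch assumes, as the example does, that every CPU of a performance domain runs at the lowest OPP whose capacity covers the busiest CPU of that domain, and it uses integer division, so totals may differ from the figures by one unit.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

/* One operating performance point: CPU capacity and power cost. */
struct opp {
	unsigned long cap;
	unsigned long power;
};

/* Hypothetical stand-in for a performance domain (not struct em_perf_domain). */
struct perf_domain {
	const struct opp *table;	/* sorted by ascending capacity */
	size_t nr_opps;
	size_t first_cpu;
	size_t nr_cpus;
};

/* Energy Model table taken from Example 2. */
static const struct opp little_opps[] = { {170, 50}, {341, 150}, {512, 300} };
static const struct opp big_opps[] = { {512, 400}, {768, 800}, {1024, 1700} };

static const struct perf_domain domains[] = {
	{ little_opps, 3, 0, 2 },	/* CPU0, CPU1 */
	{ big_opps, 3, 2, 2 },		/* CPU2, CPU3 */
};

/* Lowest OPP whose capacity covers max_util, else the highest OPP. */
static const struct opp *pick_opp(const struct perf_domain *pd,
				  unsigned long max_util)
{
	size_t i;

	assert(pd->nr_opps > 0);
	for (i = 0; i < pd->nr_opps; i++)
		if (pd->table[i].cap >= max_util)
			return &pd->table[i];
	return &pd->table[pd->nr_opps - 1];
}

/* Energy of one domain: every CPU runs at the OPP of the busiest CPU. */
static unsigned long domain_energy(const struct perf_domain *pd,
				   const unsigned long *util)
{
	unsigned long max_util = 0, energy = 0;
	const struct opp *opp;
	size_t i;

	for (i = 0; i < pd->nr_cpus; i++)
		if (util[pd->first_cpu + i] > max_util)
			max_util = util[pd->first_cpu + i];

	opp = pick_opp(pd, max_util);
	for (i = 0; i < pd->nr_cpus; i++)
		energy += util[pd->first_cpu + i] * opp->power / opp->cap;
	return energy;
}

/* Total estimated energy if a task of size task_util runs on dst_cpu. */
static unsigned long energy_if_placed(const unsigned long *base_util,
				      unsigned long task_util, size_t dst_cpu)
{
	unsigned long util[NR_CPUS], energy = 0;
	size_t i;

	assert(dst_cpu < NR_CPUS);
	for (i = 0; i < NR_CPUS; i++)
		util[i] = base_util[i] + (i == dst_cpu ? task_util : 0);
	for (i = 0; i < sizeof(domains) / sizeof(domains[0]); i++)
		energy += domain_energy(&domains[i], util);
	return energy;
}
```

With the base utilization {200, 100, 600, 500} (task P already removed from CPU0) and task_util = 200, placing P on CPU1, CPU3 and CPU0 yields 1363, 1484 and 1437 respectively. The ordering matches the three cases of the example, so CPU1 remains the cheapest candidate.
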
diff --cc Documentation/translations/zh_CN/process/submitting-drivers.rst
Simple merge
diff --cc MAINTAINERS
Simple merge
diff --cc arch/arm64/kernel/pci.c
Simple merge
diff --cc arch/x86/Kconfig
Simple merge
diff --cc drivers/acpi/pci_root.c
Simple merge
diff --cc drivers/gpu/drm/i915/intel_runtime_pm.h

index 473c4850c01d0c6dacdaff180c0f1bf4077652b4,0000000000000000000000000000000000000000..2ee8f9522e0570cb0eeb2bfb35192fc3725b33a0

mode 100644,000000..100644
--- 1/drivers/gpu/drm/i915/intel_runtime_pm.h
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.h
@@@ -1,213 -1,0 +1,213 @@@
-  * For more, read the Documentation/power/runtime_pm.txt.
+ +/* SPDX-License-Identifier: MIT */
+ +/*
+ + * Copyright © 2019 Intel Corporation
+ + */
+ +
+ +#ifndef __INTEL_RUNTIME_PM_H__
+ +#define __INTEL_RUNTIME_PM_H__
+ +
+ +#include <linux/types.h>
+ +
+ +#include "display/intel_display.h"
+ +
+ +#include "intel_wakeref.h"
+ +
+ +#include "i915_utils.h"
+ +
+ +struct device;
+ +struct drm_i915_private;
+ +struct drm_printer;
+ +
+ +enum i915_drm_suspend_mode {
+ +      I915_DRM_SUSPEND_IDLE,
+ +      I915_DRM_SUSPEND_MEM,
+ +      I915_DRM_SUSPEND_HIBERNATE,
+ +};
+ +
+ +/*
+ + * This struct helps tracking the state needed for runtime PM, which puts the
+ + * device in PCI D3 state. Notice that when this happens, nothing on the
+ + * graphics device works, even register access, so we don't get interrupts nor
+ + * anything else.
+ + *
+ + * Every piece of our code that needs to actually touch the hardware needs to
+ + * either call intel_runtime_pm_get or call intel_display_power_get with the
+ + * appropriate power domain.
+ + *
+ + * Our driver uses the autosuspend delay feature, which means we'll only really
+ + * suspend if we stay with zero refcount for a certain amount of time. The
+ + * default value is currently very conservative (see intel_runtime_pm_enable), but
+ + * it can be changed with the standard runtime PM files from sysfs.
+ + *
+ + * The irqs_disabled variable becomes true exactly after we disable the IRQs and
+ + * goes back to false exactly before we reenable the IRQs. We use this variable
+ + * to check if someone is trying to enable/disable IRQs while they're supposed
+ + * to be disabled. This shouldn't happen and we'll print some error messages in
+ + * case it happens.
+ + *
++ * For more, read the Documentation/power/runtime_pm.rst.
+ + */
+ +struct intel_runtime_pm {
+ +      atomic_t wakeref_count;
+ +      struct device *kdev; /* points to i915->drm.pdev->dev */
+ +      bool available;
+ +      bool suspended;
+ +      bool irqs_enabled;
+ +
+ +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_RUNTIME_PM)
+ +      /*
+ +       * To aid detection of wakeref leaks and general misuse, we
+ +       * track all wakeref holders. With manual markup (i.e. returning
+ +       * a cookie to each rpm_get caller which they then supply to their
+ +       * paired rpm_put) we can remove corresponding pairs and keep
+ +       * the array trimmed to active wakerefs.
+ +       */
+ +      struct intel_runtime_pm_debug {
+ +              spinlock_t lock;
+ +
+ +              depot_stack_handle_t last_acquire;
+ +              depot_stack_handle_t last_release;
+ +
+ +              depot_stack_handle_t *owners;
+ +              unsigned long count;
+ +      } debug;
+ +#endif
+ +};
+ +
+ +#define BITS_PER_WAKEREF      \
+ +      BITS_PER_TYPE(struct_member(struct intel_runtime_pm, wakeref_count))
+ +#define INTEL_RPM_WAKELOCK_SHIFT      (BITS_PER_WAKEREF / 2)
+ +#define INTEL_RPM_WAKELOCK_BIAS               (1 << INTEL_RPM_WAKELOCK_SHIFT)
+ +#define INTEL_RPM_RAW_WAKEREF_MASK    (INTEL_RPM_WAKELOCK_BIAS - 1)
+ +
+ +static inline int
+ +intel_rpm_raw_wakeref_count(int wakeref_count)
+ +{
+ +      return wakeref_count & INTEL_RPM_RAW_WAKEREF_MASK;
+ +}
+ +
+ +static inline int
+ +intel_rpm_wakelock_count(int wakeref_count)
+ +{
+ +      return wakeref_count >> INTEL_RPM_WAKELOCK_SHIFT;
+ +}
+ +
+ +static inline void
+ +assert_rpm_device_not_suspended(struct intel_runtime_pm *rpm)
+ +{
+ +      WARN_ONCE(rpm->suspended,
+ +                "Device suspended during HW access\n");
+ +}
+ +
+ +static inline void
+ +__assert_rpm_raw_wakeref_held(struct intel_runtime_pm *rpm, int wakeref_count)
+ +{
+ +      assert_rpm_device_not_suspended(rpm);
+ +      WARN_ONCE(!intel_rpm_raw_wakeref_count(wakeref_count),
+ +                "RPM raw-wakeref not held\n");
+ +}
+ +
+ +static inline void
+ +__assert_rpm_wakelock_held(struct intel_runtime_pm *rpm, int wakeref_count)
+ +{
+ +      __assert_rpm_raw_wakeref_held(rpm, wakeref_count);
+ +      WARN_ONCE(!intel_rpm_wakelock_count(wakeref_count),
+ +                "RPM wakelock ref not held during HW access\n");
+ +}
+ +
+ +static inline void
+ +assert_rpm_raw_wakeref_held(struct intel_runtime_pm *rpm)
+ +{
+ +      __assert_rpm_raw_wakeref_held(rpm, atomic_read(&rpm->wakeref_count));
+ +}
+ +
+ +static inline void
+ +assert_rpm_wakelock_held(struct intel_runtime_pm *rpm)
+ +{
+ +      __assert_rpm_wakelock_held(rpm, atomic_read(&rpm->wakeref_count));
+ +}
+ +
+ +/**
+ + * disable_rpm_wakeref_asserts - disable the RPM assert checks
+ + * @rpm: the intel_runtime_pm structure
+ + *
+ + * This function disables the asserts that check if we hold an RPM wakelock
+ + * reference, while keeping the device-not-suspended checks still enabled.
+ + * It's meant to be used only in special circumstances where our rule about
+ + * the wakelock refcount wrt. the device power state doesn't hold. According
+ + * to this rule at any point where we access the HW or want to keep the HW in
+ + * an active state we must hold an RPM wakelock reference acquired via one of
+ + * the intel_runtime_pm_get() helpers. Currently there are a few special spots
+ + * where this rule doesn't hold: the IRQ and suspend/resume handlers, the
+ + * forcewake release timer, and the GPU RPS and hangcheck works. All other
+ + * users should avoid using this function.
+ + *
+ + * Any calls to this function must have a symmetric call to
+ + * enable_rpm_wakeref_asserts().
+ + */
+ +static inline void
+ +disable_rpm_wakeref_asserts(struct intel_runtime_pm *rpm)
+ +{
+ +      atomic_add(INTEL_RPM_WAKELOCK_BIAS + 1,
+ +                 &rpm->wakeref_count);
+ +}
+ +
+ +/**
+ + * enable_rpm_wakeref_asserts - re-enable the RPM assert checks
+ + * @rpm: the intel_runtime_pm structure
+ + *
+ + * This function re-enables the RPM assert checks after disabling them with
+ + * disable_rpm_wakeref_asserts(). It's meant to be used only in special
+ + * circumstances; otherwise its use should be avoided.
+ + *
+ + * Any calls to this function must have a symmetric call to
+ + * disable_rpm_wakeref_asserts().
+ + */
+ +static inline void
+ +enable_rpm_wakeref_asserts(struct intel_runtime_pm *rpm)
+ +{
+ +      atomic_sub(INTEL_RPM_WAKELOCK_BIAS + 1,
+ +                 &rpm->wakeref_count);
+ +}
+ +
+ +void intel_runtime_pm_init_early(struct intel_runtime_pm *rpm);
+ +void intel_runtime_pm_enable(struct intel_runtime_pm *rpm);
+ +void intel_runtime_pm_disable(struct intel_runtime_pm *rpm);
+ +void intel_runtime_pm_cleanup(struct intel_runtime_pm *rpm);
+ +
+ +intel_wakeref_t intel_runtime_pm_get(struct intel_runtime_pm *rpm);
+ +intel_wakeref_t intel_runtime_pm_get_if_in_use(struct intel_runtime_pm *rpm);
+ +intel_wakeref_t intel_runtime_pm_get_noresume(struct intel_runtime_pm *rpm);
+ +intel_wakeref_t intel_runtime_pm_get_raw(struct intel_runtime_pm *rpm);
+ +
+ +#define with_intel_runtime_pm(rpm, wf) \
+ +      for ((wf) = intel_runtime_pm_get(rpm); (wf); \
+ +           intel_runtime_pm_put((rpm), (wf)), (wf) = 0)
+ +
+ +#define with_intel_runtime_pm_if_in_use(rpm, wf) \
+ +      for ((wf) = intel_runtime_pm_get_if_in_use(rpm); (wf); \
+ +           intel_runtime_pm_put((rpm), (wf)), (wf) = 0)
+ +
+ +void intel_runtime_pm_put_unchecked(struct intel_runtime_pm *rpm);
+ +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_RUNTIME_PM)
+ +void intel_runtime_pm_put(struct intel_runtime_pm *rpm, intel_wakeref_t wref);
+ +#else
+ +static inline void
+ +intel_runtime_pm_put(struct intel_runtime_pm *rpm, intel_wakeref_t wref)
+ +{
+ +      intel_runtime_pm_put_unchecked(rpm);
+ +}
+ +#endif
+ +void intel_runtime_pm_put_raw(struct intel_runtime_pm *rpm, intel_wakeref_t wref);
+ +
+ +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_RUNTIME_PM)
+ +void print_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm,
+ +                                  struct drm_printer *p);
+ +#else
+ +static inline void print_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm,
+ +                                                struct drm_printer *p)
+ +{
+ +}
+ +#endif
+ +
+ +#endif /* __INTEL_RUNTIME_PM_H__ */
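
The wakeref_count layout used by intel_runtime_pm.h packs two counters into one integer: the low half holds the raw wakeref count and the high half holds the wakelock count. The user-space sketch below illustrates that packing with a plain int instead of atomic_t. The macro and helper names are hypothetical (only the arithmetic mirrors the header), and the raw_get() step of adding 1 for a raw reference is an assumption that the header itself does not show.

```c
#include <assert.h>
#include <limits.h>

/* Same split as the header: half of the bits for each counter. */
#define WAKEREF_BITS		((int)(sizeof(int) * CHAR_BIT))
#define WAKELOCK_SHIFT		(WAKEREF_BITS / 2)
#define WAKELOCK_BIAS		(1 << WAKELOCK_SHIFT)
#define RAW_WAKEREF_MASK	(WAKELOCK_BIAS - 1)

/* Low half: number of raw wakerefs. */
static int raw_wakeref_count(int wakeref_count)
{
	return wakeref_count & RAW_WAKEREF_MASK;
}

/* High half: number of wakelocks. */
static int wakelock_count(int wakeref_count)
{
	return wakeref_count >> WAKELOCK_SHIFT;
}

/* Assumed behaviour of a raw reference: bump only the low half. */
static int raw_get(int wakeref_count)
{
	return wakeref_count + 1;
}

/* Mirrors disable_rpm_wakeref_asserts(): bump both halves at once. */
static int bias_add(int wakeref_count)
{
	return wakeref_count + WAKELOCK_BIAS + 1;
}

/* Mirrors enable_rpm_wakeref_asserts(): undo the bump on both halves. */
static int bias_sub(int wakeref_count)
{
	assert(wakelock_count(wakeref_count) > 0);
	assert(raw_wakeref_count(wakeref_count) > 0);
	return wakeref_count - (WAKELOCK_BIAS + 1);
}
```

Adding WAKELOCK_BIAS + 1 makes both the raw-wakeref check and the wakelock check pass until the matching subtraction. That is why the header requires every disable_rpm_wakeref_asserts() call to be paired with enable_rpm_wakeref_asserts().
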
diff --cc drivers/opp/Kconfig
Simple merge
diff --cc drivers/pci/p2pdma.c
Simple merge
diff --cc drivers/pci/pci-driver.c
Simple merge
diff --cc drivers/pci/pci.c
Simple merge
diff --cc drivers/pci/pci.h
Simple merge
diff --cc drivers/pci/probe.c
Simple merge
diff --cc drivers/power/supply/power_supply_core.c
Simple merge
diff --cc drivers/soc/tegra/pmc.c
Simple merge
diff --cc include/linux/mod_devicetable.h
Simple merge
diff --cc include/linux/pci.h
Simple merge
diff --cc include/linux/pm.h
Simple merge
diff --cc kernel/power/Kconfig
Simple merge
diff --cc net/wireless/Kconfig
Simple merge
diff --cc tools/pci/pcitest.c
Simple merge