Merge "docs(psci): expound runtime instrumentation docs" into integration

Manish Pandey 2023-05-11 13:41:35 +02:00 committed by TrustedFirmware Code Review
commit dcf430656c
3 changed files with 175 additions and 1 deletions


@ -5,10 +5,12 @@ Performance & Testing
:maxdepth: 1
:caption: Contents
psci-performance-instr
psci-performance-juno
psci-performance-methodology
tsp
performance-monitoring-unit
--------------
*Copyright (c) 2019-2020, Arm Limited. All rights reserved.*
*Copyright (c) 2019-2023, Arm Limited. All rights reserved.*


@ -0,0 +1,117 @@
PSCI Performance Measurement
============================
TF-A provides two instrumentation tools for performing analysis of the PSCI
implementation:
* PSCI STAT
* Runtime Instrumentation
This page explains how they may be enabled and used to analyse the performance
of the PSCI implementation.
Performance Measurement Framework
---------------------------------
The Performance Measurement Framework `PMF`_ is a framework that provides
mechanisms for collecting and retrieving timestamps at runtime from the
Performance Monitoring Unit (`PMU`_). The PMU is a generalized abstraction for
accessing CPU hardware registers used to measure hardware events. This means,
for instance, that the PMU might be used to place instrumentation points at
logical locations in code for tracing purposes.
TF-A utilises the PMF as a backend for the two instrumentation services it
provides: PSCI Statistics and Runtime Instrumentation. The PMF is used by
these services to facilitate collection and retrieval of timestamps. For
instance, the PSCI Statistics service registers the PMF service
``psci_svc`` to track its residency statistics.
The PMF reserves a unique ID, a name, and space in memory for this service. The
framework provides a convenient interface for PSCI Statistics to retrieve
values from ``psci_svc`` at runtime. Alternatively, the service may be
configured such that the PMF dumps those values to the console. A platform may
choose to expose SMCs that allow retrieval of these timestamps from the
service.
This feature is enabled with the Boolean flag ``ENABLE_PMF``.
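
The idea of a service that owns an ID, a name, and a reserved block of
timestamp slots can be pictured with the toy model below. It is only a sketch:
every name and size in it is hypothetical, and it does not reproduce the PMF
API or its memory layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for this sketch. */
#define TOY_NUM_CPUS       4U
#define TOY_NUM_TIMESTAMPS 6U

/* A toy service: an ID, a name, and reserved per-CPU timestamp slots. */
struct toy_pmf_service {
    uint32_t id;
    const char *name;
    uint64_t slots[TOY_NUM_CPUS][TOY_NUM_TIMESTAMPS];
};

/* Store a timestamp for (cpu, index). Returns 0 on success, -1 if the
 * service pointer is NULL or the request is out of range. */
static int toy_pmf_capture(struct toy_pmf_service *svc, unsigned int cpu,
                           unsigned int index, uint64_t timestamp)
{
    if ((svc == NULL) || (cpu >= TOY_NUM_CPUS) ||
        (index >= TOY_NUM_TIMESTAMPS)) {
        return -1;
    }
    svc->slots[cpu][index] = timestamp;
    return 0;
}

/* Read back a stored timestamp. Returns 0 for an invalid request. */
static uint64_t toy_pmf_get(const struct toy_pmf_service *svc,
                            unsigned int cpu, unsigned int index)
{
    if ((svc == NULL) || (cpu >= TOY_NUM_CPUS) ||
        (index >= TOY_NUM_TIMESTAMPS)) {
        return 0U;
    }
    return svc->slots[cpu][index];
}
```

In this model a caller captures a value with ``toy_pmf_capture()`` at an
instrumentation point and reads it back later with ``toy_pmf_get()``, which
mirrors the collect-then-retrieve pattern described above.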
PSCI Statistics
---------------
PSCI Statistics is a runtime service that provides residency statistics for
power states used by the platform. The service tracks residency time and
entry count. Residency time is the total time spent in a particular power
state by a PE. The entry count is the number of times the PE has entered
the power state. PSCI Statistics implements the optional functions
``PSCI_STAT_RESIDENCY`` and ``PSCI_STAT_COUNT`` from the `PSCI`_
specification.
.. c:macro:: PSCI_STAT_RESIDENCY

    :param target_cpu: Contains a copy of the affinity fields in the MPIDR
      register for identifying the target core (see section 5.1.4 of the
      `PSCI`_ specification for more details).
    :param power_state: Identifier for a specific local
      state. Generally, this parameter takes the same form as the power_state
      parameter described for CPU_SUSPEND in section 5.4.2.

    :returns: Time spent in ``power_state``, in microseconds, by ``target_cpu``
      and the highest level expressed in ``power_state``.

.. c:macro:: PSCI_STAT_COUNT

    :param target_cpu: Follows the same format as ``PSCI_STAT_RESIDENCY``.
    :param power_state: Follows the same format as ``PSCI_STAT_RESIDENCY``.

    :returns: Number of times the state expressed in ``power_state`` has been
      used by ``target_cpu`` and the highest level expressed in
      ``power_state``.
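
As a rough illustration of the parameters above, the following C sketch
prepares the arguments for an SMC64 ``PSCI_STAT_RESIDENCY`` or
``PSCI_STAT_COUNT`` call. The function IDs and the affinity field positions
follow the PSCI and Arm architecture specifications. The macro, struct and
helper names are hypothetical and not part of TF-A, and the SMC itself is not
issued here.

```c
#include <stdint.h>

/* SMC64 function IDs as listed in the PSCI specification. */
#define STAT_RESIDENCY_SMC64_FID 0xC4000010U
#define STAT_COUNT_SMC64_FID     0xC4000011U

/* MPIDR affinity fields: Aff3 in bits [39:32], Aff2..Aff0 in bits [23:0]. */
#define MPIDR_AFFINITY_FIELDS_MASK 0x000000FF00FFFFFFULL

/* Hypothetical holder for the values that would be placed in x0..x2. */
struct psci_stat_args {
    uint64_t function_id;
    uint64_t target_cpu;
    uint64_t power_state;
};

/* Keep only the affinity fields of an MPIDR value to form target_cpu. */
static uint64_t mpidr_to_target_cpu(uint64_t mpidr)
{
    return mpidr & MPIDR_AFFINITY_FIELDS_MASK;
}

/* Assemble the argument set for either statistics call. */
static struct psci_stat_args make_stat_args(uint32_t function_id,
                                            uint64_t mpidr,
                                            uint32_t power_state)
{
    struct psci_stat_args args;

    args.function_id = function_id;
    args.target_cpu = mpidr_to_target_cpu(mpidr);
    args.power_state = power_state;
    return args;
}
```

The same ``target_cpu`` and ``power_state`` values can be reused for both
calls; only the function ID differs.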
The implementation provides residency statistics only for low power states,
and does this regardless of the entry mechanism into those states. The
statistics it collects are set to 0 during shutdown or reset.
PSCI Statistics is enabled with the Boolean build flag
``ENABLE_PSCI_STAT``. All Arm platforms utilise the PMF unless another
collection backend is provided (``ENABLE_PMF`` is implicitly enabled).
Runtime Instrumentation
-----------------------
The Runtime Instrumentation Service is an instrumentation tool that wraps
around the PMF to provide timestamp data. Although the service is not
restricted to PSCI, it is used primarily in TF-A to quantify the total time
spent in the PSCI implementation. The tool can be used to instrument other
components in TF-A as well. It is enabled with the Boolean flag
``ENABLE_RUNTIME_INSTRUMENTATION``, and as with PSCI STAT, requires PMF to
be enabled.
In PSCI, this service provides instrumentation points in the
following code paths:
* Entry into the PSCI SMC handler
* Exit from the PSCI SMC handler
* Entry to low power state
* Exit from low power state
* Entry into cache maintenance operations in PSCI
* Exit from cache maintenance operations in PSCI
The service captures the cycle count, which allows for the time spent in the
implementation to be calculated, given the frequency of the counter.
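
The conversion from a pair of captured counts to elapsed time can be sketched
as follows. This is a minimal example rather than TF-A code: the function name
is hypothetical, and the counter frequency is assumed to be supplied by the
caller (for example, as read from the generic counter frequency register).

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Convert the difference between two counter readings to nanoseconds.
 * Unsigned subtraction gives the right delta even if the counter wrapped
 * once between the readings. Returns 0 if freq_hz is 0.
 * Assumes freq_hz is below roughly 18 GHz so the remainder term cannot
 * overflow 64 bits.
 */
static uint64_t ticks_to_ns(uint64_t start, uint64_t end, uint64_t freq_hz)
{
    uint64_t ticks;
    uint64_t whole_secs;
    uint64_t rem_ticks;

    if (freq_hz == 0U) {
        return 0U;
    }

    ticks = end - start;
    /* Split into whole seconds and a remainder to avoid overflow. */
    whole_secs = ticks / freq_hz;
    rem_ticks = ticks % freq_hz;

    return (whole_secs * NSEC_PER_SEC) +
           ((rem_ticks * NSEC_PER_SEC) / freq_hz);
}
```

For a counter running at 50 MHz, each tick is 20 ns, so a difference of 50
ticks corresponds to 1000 ns (1 us).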
PSCI SMC Handler Instrumentation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The timestamp during entry into the handler is captured as early as possible
during the runtime exception, prior to entry into the handler itself. All
timestamps are stored in memory for later retrieval. The exit timestamp is
captured after normal return from the PSCI SMC handler, or, if a low power state
was requested, it is captured in the warm boot path.
*Copyright (c) 2023, Arm Limited. All rights reserved.*
.. _PMF: ../design/firmware-design.html#performance-measurement-framework
.. _PMU: performance-monitoring-unit.html
.. _PSCI: https://developer.arm.com/documentation/den0022/latest/


@ -0,0 +1,55 @@
Runtime Instrumentation Methodology
===================================
This document outlines steps for undertaking performance measurements of key
operations in the Trusted Firmware-A Power State Coordination Interface (PSCI)
implementation, using the in-built Performance Measurement Framework (PMF) and
runtime instrumentation timestamps.
Framework
~~~~~~~~~
The tests are based on the ``runtime-instrumentation`` test suite provided by
the Trusted Firmware Test Framework (TFTF). The release build of this framework
was used because the results in the debug build became skewed; the console
output prevented some of the tests from executing in parallel.
The tests consist of both parallel and sequential tests, which are broadly
described as follows:
- **Parallel Tests** This type of test powers on all the non-lead CPUs and
brings them and the lead CPU to a common synchronization point. The lead CPU
then initiates the test on all CPUs in parallel.
- **Sequential Tests** This type of test powers on each non-lead CPU in
sequence. The lead CPU initiates the test on a non-lead CPU then waits for the
test to complete before proceeding to the next non-lead CPU. The lead CPU then
executes the test on itself.
Note that there is very little variance observed in the values given (~1 us),
although the values for each CPU are sometimes interchanged, depending on the
order in which locks are acquired. Also, there is very little variance observed
between executing the tests sequentially in a single boot and rebooting between
tests.
Given that runtime instrumentation using PMF is invasive, there is a small
(unquantified) overhead on the results. PMF uses the generic counter for
timestamps, which runs at 50 MHz on Juno.
Metrics
~~~~~~~
.. glossary::

   Powerdown Latency
        Time taken from entering the TF PSCI implementation to the point the
        hardware enters the low power state (WFI). Referring to the TF runtime
        instrumentation points, this corresponds to:
        ``(RT_INSTR_ENTER_HW_LOW_PWR - RT_INSTR_ENTER_PSCI)``.

   Wakeup Latency
        Time taken from the point the hardware exits the low power state to
        exiting the TF PSCI implementation. This corresponds to:
        ``(RT_INSTR_EXIT_PSCI - RT_INSTR_EXIT_HW_LOW_PWR)``.

   Cache Flush Latency
        Time taken to flush the caches during powerdown. This corresponds to:
        ``(RT_INSTR_EXIT_CFLUSH - RT_INSTR_ENTER_CFLUSH)``.
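
The three metrics above reduce to simple differences between captured
timestamps. The sketch below shows that arithmetic in C. The struct and
function names are hypothetical; only the correspondence between fields and
the ``RT_INSTR_*`` instrumentation points comes from the definitions above.
Results are in counter ticks and still need converting to time using the
counter frequency.

```c
#include <stdint.h>

/* One CPU's timestamps, one field per RT_INSTR_* instrumentation point. */
struct rt_instr_timestamps {
    uint64_t enter_psci;        /* RT_INSTR_ENTER_PSCI */
    uint64_t exit_psci;         /* RT_INSTR_EXIT_PSCI */
    uint64_t enter_hw_low_pwr;  /* RT_INSTR_ENTER_HW_LOW_PWR */
    uint64_t exit_hw_low_pwr;   /* RT_INSTR_EXIT_HW_LOW_PWR */
    uint64_t enter_cflush;      /* RT_INSTR_ENTER_CFLUSH */
    uint64_t exit_cflush;       /* RT_INSTR_EXIT_CFLUSH */
};

/* Powerdown latency in ticks: PSCI entry to hardware low power entry. */
static uint64_t powerdown_latency(const struct rt_instr_timestamps *ts)
{
    return ts->enter_hw_low_pwr - ts->enter_psci;
}

/* Wakeup latency in ticks: hardware low power exit to PSCI exit. */
static uint64_t wakeup_latency(const struct rt_instr_timestamps *ts)
{
    return ts->exit_psci - ts->exit_hw_low_pwr;
}

/* Cache flush latency in ticks: start to end of cache maintenance. */
static uint64_t cache_flush_latency(const struct rt_instr_timestamps *ts)
{
    return ts->exit_cflush - ts->enter_cflush;
}
```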