Expands the caching documentation to include a high-level overview of caching strategies and the features available in Zephyr. Signed-off-by: Dane Wagner <dane.wagner@gmail.com>
183 lines
7.7 KiB
ReStructuredText
183 lines
7.7 KiB
ReStructuredText
.. _cache_guide:
|
|
|
|
Caching Basics
|
|
##############
|
|
|
|
This section discusses the basics of cache coherency and under what situations a
|
|
user needs to explicitly deal with caching. For more detailed info on Zephyr's
|
|
caching tools, see :ref:`cache_config` for Zephyr Kconfig options or
|
|
:ref:`cache_api` for the API reference. This section primarily focuses on the
|
|
data cache though there is typically also an instruction cache for systems with
|
|
cache support.
|
|
|
|
.. note::
|
|
|
|
The information here assumes that the architecture-specific MPU support is
|
|
enabled. See the architecture-specific documentation for details.
|
|
|
|
.. note::
|
|
|
|
While cache coherence can be a concern for data shared between SMP cores, Zephyr
|
|
in general ensures that memory will be seen in a coherent state from multiple
|
|
cores. Most applications will only need to use the cache APIs for interaction
|
|
with external hardware like DMA controllers or foreign CPUs running a
|
|
different OS image. For more information on cache coherence between SMP cores,
|
|
see :kconfig:option:`CONFIG_KERNEL_COHERENCE`.
|
|
|
|
When dealing with memory shared between a processor core and other bus masters,
|
|
cache coherency needs to be considered. Typically processor caches exist as
|
|
close to each processor core as possible to maximize performance gain. Because
|
|
of this, data moved into and out of memory by DMA engines will be stale in the
|
|
processor's cache, resulting in what appears to be corrupt data. If you are
|
|
moving data using DMA and the processor doesn't see the data you expect, cache
|
|
coherency may be the issue.
|
|
|
|
There are multiple approaches to ensuring that the data seen by the processor
|
|
core and peripherals is coherent. The simplest is just to disable caching, but
|
|
this defeats the purpose of having a hardware cache in the first place and
|
|
results in a significant performance hit. Many architectures provide methods for
|
|
disabling caching for only a portion of memory. This can be useful when cache
|
|
coherence is more important than performance, such as when using DMA with SPI.
|
|
Finally, there is the option to flush or invalidate the cache for regions of
|
|
memory at runtime.
|
|
|
|
Globally Disabling the Data Cache
|
|
---------------------------------
|
|
|
|
As mentioned above, globally disabling data caching can have a significant
|
|
performance impact but can be useful for debugging.
|
|
|
|
Requirements:
|
|
|
|
* :kconfig:option:`CONFIG_DCACHE`: DCACHE control enabled in Zephyr.
|
|
|
|
* :kconfig:option:`CONFIG_CACHE_MANAGEMENT`: cache API enabled.
|
|
|
|
* Call :c:func:`sys_cache_data_disable()` to globally disable the data cache.
|
|
|
|
Disabling Caching for a Memory Region
|
|
-------------------------------------
|
|
|
|
Disabling caching for only a portion of memory can be a good performance
|
|
compromise if performance on the uncached memory is not critical to the
|
|
application. This is a good option if the application requires many small
|
|
unrelated buffers that are smaller than a cache line.
|
|
|
|
Requirements:
|
|
|
|
* :kconfig:option:`CONFIG_DCACHE`: DCACHE control enabled in Zephyr.
|
|
|
|
* :kconfig:option:`CONFIG_MEM_ATTR`: enable the ``mem-attr`` library for
|
|
handling memory attributes in the device tree.
|
|
|
|
* Annotate your device tree according to :ref:`mem_mgmt_api`.
|
|
|
|
Assuming the MPU driver is enabled, it will configure the specified regions
|
|
according to the memory attributes specified during kernel initialization. When
|
|
using a dedicated uncached region of memory, the linker needs to be instructed
|
|
to place buffers into that region. This can be accomplished by specifying the
|
|
memory region explicitly using ``Z_GENERIC_SECTION``:
|
|
|
|
.. code-block:: c
|
|
|
|
/* SRAM4 marked as uncached in device tree */
|
|
uint8_t buffer[BUF_SIZE] Z_GENERIC_SECTION("SRAM4");
|
|
|
|
.. note::
|
|
|
|
Configuring a distinct memory region with separate caching rules requires the
|
|
use of an MPU region which may be a limited resource on some architectures.
|
|
MPU regions may be needed by other memory protection features such as
|
|
:ref:`userspace <mpu_userspace>`, :ref:`stack protection <mpu_stack_objects>`,
|
|
or :ref:`memory domains<memory_domain>`.
|
|
|
|
Automatically Disabling Caching by Variable
|
|
-------------------------------------------
|
|
|
|
Zephyr has the ability to automatically define an uncached region in memory and
|
|
allocate variables to it using ``__nocache``. Any variables marked with this
|
|
attribute will be placed in a special ``nocache`` linker region in memory. This
|
|
region will be configured as uncached by the MPU driver during initialization.
|
|
This is a simpler option than explicitly declaring a region of memory uncached
|
|
but provides less control over the placement of these variables, as the linker
|
|
may allocate this region anywhere in RAM.
|
|
|
|
Requirements:
|
|
|
|
* :kconfig:option:`CONFIG_DCACHE`: DCACHE control enabled in Zephyr.
|
|
|
|
* :kconfig:option:`CONFIG_NOCACHE_MEMORY`: enable allocation of the ``nocache``
|
|
linker region and configure it as uncached.
|
|
|
|
* Add the ``__nocache`` attribute at the end of any uncached buffer definition:
|
|
|
|
.. code-block:: c
|
|
|
|
uint8_t buffer[BUF_SIZE] __nocache;
|
|
|
|
.. note::
|
|
|
|
See note above regarding possible limitations on MPU regions. The ``nocache``
|
|
region is still a distinct MPU region even though it is automatically created
|
|
by Zephyr instead of being explicitly defined by the user.
|
|
|
|
Runtime Cache Control
|
|
---------------------
|
|
|
|
The most performant but most complex option is to control data caching at
|
|
runtime. The two most relevant cache operations in this case are **flushing**
|
|
and **invalidating**. Both of these operations operate on the smallest unit of
|
|
cacheable memory, the cache line. Data cache lines are typically 16 to 128
|
|
bytes. See :kconfig:option:`CONFIG_DCACHE_LINE_SIZE`. Cache line sizes are
|
|
typically fixed in hardware and not configurable, but Zephyr does need to know
|
|
the size of cache lines in order to correctly and efficiently manage the cache.
|
|
If the buffers in question are smaller than the data cache line size, it may be
|
|
more efficient to place them in an uncached region, as unrelated data packed
|
|
into the same cache line may be destroyed when invalidating.
|
|
|
|
Flushing the cache involves writing all modified cache lines in a specified
|
|
region back to shared memory. Flush the cache associated with a buffer after the
|
|
processor has written to it and before a remote bus master reads from that
|
|
region.
|
|
|
|
.. note::
|
|
|
|
Some architectures support a cache configuration called **write-through**
|
|
caching in which data writes from the processor core propagate through to
|
|
shared memory. While this solves the cache coherence problem for CPU writes,
|
|
it also results in more traffic to main memory which may result in performance
|
|
degradation.
|
|
|
|
Invalidating the cache works similarly but in the other direction. It marks
|
|
cache lines in the specified region as stale, ensuring that the cache line will
|
|
be refreshed from main memory when the processor next reads from the specified
|
|
region. Invalidate the data cache of a buffer that a peripheral has written to
|
|
before reading from that region.
|
|
|
|
In some cases, the same buffer may be reused for e.g. DMA reads and DMA writes.
|
|
In that case it is possible to first flush the cache associated with a buffer
|
|
and then invalidate it, ensuring that the cache will be refreshed the next time
|
|
the processor reads from the buffer.
|
|
|
|
Requirements:
|
|
|
|
* :kconfig:option:`CONFIG_DCACHE`: DCACHE control enabled in Zephyr.
|
|
|
|
* :kconfig:option:`CONFIG_CACHE_MANAGEMENT`: cache API enabled.
|
|
|
|
* Call :c:func:`sys_cache_data_flush_range()` to flush a memory region.
|
|
|
|
* Call :c:func:`sys_cache_data_invd_range()` to invalidate a memory region.
|
|
|
|
* Call :c:func:`sys_cache_data_flush_and_invd_range()` to flush and invalidate.
|
|
|
|
Alignment
|
|
---------
|
|
|
|
As mentioned in :c:func:`sys_cache_data_invd_range()` and associated functions,
|
|
buffers should be aligned to the cache line size. This can be accomplished by
|
|
using ``__aligned``:
|
|
|
|
.. code-block:: c
|
|
|
|
uint8_t buffer[BUF_SIZE] __aligned(CONFIG_DCACHE_LINE_SIZE);
|