# Libc mem* benchmarks
This framework has been designed to evaluate and compare the relative performance of memory function implementations on a particular host.

It will also be used to track implementation performance over time.
## Quick start

### Setup
**Python 2** being deprecated, it is advised to use **Python 3**.

Then make sure to have `matplotlib`, `scipy` and `numpy` set up correctly:
```shell
apt-get install python3-pip
pip3 install matplotlib scipy numpy
```
You may need `python3-gtk` or a similar package to display benchmark results.
To get good reproducibility it is important to make sure that the system runs in `performance` mode. This is achieved by running:

```shell
cpupower frequency-set --governor performance
```
### Run and display the `memcpy` benchmark
The following commands will run the benchmark and display a 95th percentile confidence interval curve of **time per copied byte**. The graph also shows host information and the benchmarking configuration.
```shell
cd llvm-project
cmake -B/tmp/build -Sllvm -DLLVM_ENABLE_PROJECTS='clang;clang-tools-extra;libc' -DCMAKE_BUILD_TYPE=Release -G Ninja
ninja -C /tmp/build display-libc-memcpy-benchmark-small
```
The `display` target will attempt to open a window on the machine where you're running the benchmark. If that does not work for you, you may want to use the `render` or `run` targets instead, as detailed below.
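The confidence-interval statistics are computed by the framework itself; the sketch below only illustrates, with made-up samples and Python's standard library, what a 95% confidence interval over time-per-byte measurements means (this is not the framework's actual code):

```python
# Illustration only (not the framework's code): a normal-approximation 95%
# confidence interval on the mean time-per-byte, from made-up samples.
import statistics

samples_ns_per_byte = [0.52, 0.49, 0.55, 0.51, 0.50, 0.53, 0.48, 0.54]
mean = statistics.mean(samples_ns_per_byte)
stderr = statistics.stdev(samples_ns_per_byte) / len(samples_ns_per_byte) ** 0.5
z = 1.96  # two-sided 95% normal quantile
low, high = mean - z * stderr, mean + z * stderr
print(f"{mean:.3f} ns/byte, 95% CI [{low:.3f}, {high:.3f}]")
```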
## Benchmarking targets
The benchmarking process occurs in two steps:
1. Benchmark the functions and produce a `json` file
2. Display (or render) the `json` file
Targets are of the form `<action>-libc-<function>-benchmark-<configuration>`:

- `action` is one of:
  - `run`: runs the benchmark and writes the `json` file
  - `display`: displays the graph on screen
  - `render`: renders the graph on disk as a `png` file
- `function` is one of: `memcpy`, `memcmp`, `memset`
- `configuration` is one of: `small`, `big`
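The naming scheme above can be composed mechanically. For example (the variable names below are just for illustration; running the resulting target still requires the build set up earlier):

```shell
# Compose a target name from its three parts per the scheme above.
action=render
function=memcmp
configuration=small
target="${action}-libc-${function}-benchmark-${configuration}"
echo "$target"   # render-libc-memcmp-benchmark-small
# ninja -C /tmp/build "$target"   # would render the graph as a png
```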
## Benchmarking regimes
Using a profiler to observe size distributions for calls into libc functions, it was found that most operations act on a small number of bytes.
| Function | % of calls with size ≤ 128 | % of calls with size ≤ 1024 |
|---|---|---|
| memcpy | 96% | 99% |
| memset | 91% | 99.9% |
| memcmp¹ | 99.5% | ~100% |
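The percentages in the table are cumulative shares of calls at or below a size threshold. A hypothetical sketch of how such shares could be computed from profiler-collected call sizes (the sizes below are made-up sample data, not the measurements behind the table):

```python
# Hypothetical: summarize profiler-collected call sizes as cumulative shares,
# in the style of the table above. The sizes list is made-up sample data.
sizes = [8, 16, 32, 64, 100, 200, 512, 900, 2048, 40]
for threshold in (128, 1024):
    share = sum(s <= threshold for s in sizes) / len(sizes)
    print(f"size <= {threshold}: {share:.0%}")
```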
Benchmarking configurations come in two flavors:

- `small`
  - Exercises sizes up to `1KiB`, representative of normal usage
  - The data is kept in the `L1` cache to prevent measuring the memory subsystem
- `big`
  - Exercises sizes up to `32MiB` to test large operations
  - Caching effects can show up here, which prevents comparing different hosts

¹ The size refers to the size of the buffers to compare, not the number of bytes until the first difference.
## Superposing curves
It is possible to merge several `json` files into a single graph. This is useful to compare implementations.
In the following example we superpose the curves for `memcpy`, `memset` and `memcmp`:

```shell
> make -C /tmp/build run-libc-memcpy-benchmark-small run-libc-memcmp-benchmark-small run-libc-memset-benchmark-small
> python libc/utils/benchmarks/render.py3 /tmp/last-libc-memcpy-benchmark-small.json /tmp/last-libc-memcmp-benchmark-small.json /tmp/last-libc-memset-benchmark-small.json
```
## Useful `render.py3` flags

- `--output=/tmp/benchmark_curve.png` saves the produced graph to disk.
- `--headless` prevents the graph from appearing on the screen.
## Under the hood

To learn more about the design decisions behind the benchmarking framework, have a look at the [RATIONALE.md](RATIONALE.md) file.