# Introduction

The ArmNN Delegate can be found within the ArmNN repository, but it is a standalone piece of software. It does,
however, make use of the ArmNN library. For this reason we have added two options to build the delegate: the first
option builds the delegate together with the ArmNN library, the second option is a standalone build
of the delegate.

This tutorial uses an AArch64 machine with Ubuntu 18.04 installed that can build all components
natively (no cross-compilation required). This is to keep this guide simple.

1. [Dependencies](#dependencies)
   * [Build Tensorflow for C++](#build-tensorflow-for-c)
   * [Build Flatbuffers](#build-flatbuffers)
   * [Build the Arm Compute Library](#build-the-arm-compute-library)
   * [Build the ArmNN Library](#build-the-armnn-library)
2. [Build the TfLite Delegate (Stand-Alone)](#build-the-tflite-delegate-stand-alone)
3. [Build the Delegate together with ArmNN](#build-the-delegate-together-with-armnn)
4. [Integrate the ArmNN TfLite Delegate into your project](#integrate-the-armnn-tflite-delegate-into-your-project)

# Dependencies

Build Dependencies:
 * Tensorflow and Tensorflow Lite version 2.3.1
 * Flatbuffers 1.12.0
 * ArmNN 20.11 or higher

Required Tools:
 * Git
 * pip
 * wget
 * zip
 * unzip
 * CMake 3.7.0 or higher
 * SCons
 * Bazel 3.1.0

Our first step is to build all the build dependencies mentioned above. We will have to create quite a few
directories, so to make navigation a bit easier, define a base directory for the project. At this stage we can also
install all the tools that are required during the build.
```bash
export BASEDIR=/home
cd $BASEDIR
# python3-pip provides the pip3 that is used in the next section
apt-get update && apt-get install git wget unzip zip python3 python3-pip cmake scons
```

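Package versions depend on your distribution's feeds, so it is worth confirming that the installed tools meet the minimum versions listed above, in particular cmake 3.7.0:
```bash
cmake --version # must report 3.7.0 or higher
scons --version
git --version
```
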
## Build Tensorflow for C++

Tensorflow has a few dependencies of its own. It requires the python packages pip3, numpy, wheel and keras_preprocessing,
and also bazel, which is used to compile Tensorflow. A description of how to build bazel can be
found [here](https://docs.bazel.build/versions/master/install-compile-source.html). There are multiple ways.
I decided to compile from source because that should work for any platform and therefore adds the most value
to this guide. Depending on your operating system and architecture there might be an easier way.
```bash
# Install the python packages
pip3 install -U pip numpy wheel
pip3 install -U keras_preprocessing --no-deps

# Bazel has a dependency on JDK
apt-get install openjdk-11-jdk
# Build Bazel
wget -O bazel-3.1.0-dist.zip https://github.com/bazelbuild/bazel/releases/download/3.1.0/bazel-3.1.0-dist.zip
unzip -d bazel bazel-3.1.0-dist.zip
cd bazel
env EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk" bash ./compile.sh
# This creates an "output" directory where the bazel binary can be found

# Download Tensorflow
cd $BASEDIR
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow/
git checkout tags/v2.3.1 # Minimum version required for the delegate
```

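At this point it can be worth checking that the freshly built bazel binary runs and reports the expected version:
```bash
$BASEDIR/bazel/output/bazel --version # should report "bazel 3.1.0"
```
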
Before Tensorflow can be built, targets need to be defined in the `BUILD` file that can be
found in the root directory of the Tensorflow repository. Append the following two targets to the file:
```
cc_binary(
    name = "libtensorflow_all.so",
    linkshared = 1,
    deps = [
        "//tensorflow/core:framework",
        "//tensorflow/core:tensorflow",
        "//tensorflow/cc:cc_ops",
        "//tensorflow/cc:client_session",
        "//tensorflow/cc:scope",
        "//tensorflow/c:c_api",
    ],
)

cc_binary(
    name = "libtensorflow_lite_all.so",
    linkshared = 1,
    deps = [
        "//tensorflow/lite:framework",
        "//tensorflow/lite/kernels:builtin_ops",
    ],
)
```
Now the build process can be started. When calling "configure", as below, a dialog shows up that asks the
user to specify additional options. If you don't have any particular needs for your build, decline all
additional options and choose the default values. Building `libtensorflow_all.so` requires quite some time.
This might be a good time to get yourself another drink and take a break.
```bash
PATH="$BASEDIR/bazel/output:$PATH" ./configure
$BASEDIR/bazel/output/bazel build --define=grpc_no_ares=true --config=opt --config=monolithic --strip=always --config=noaws libtensorflow_all.so
$BASEDIR/bazel/output/bazel build --config=opt --config=monolithic --strip=always libtensorflow_lite_all.so
```

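If both builds succeed, the two shared libraries end up under `bazel-bin` in the Tensorflow repository. A quick way to check, assuming bazel's default output layout:
```bash
ls -lh $BASEDIR/tensorflow/bazel-bin/libtensorflow_all.so \
       $BASEDIR/tensorflow/bazel-bin/libtensorflow_lite_all.so
```
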
## Build Flatbuffers

Flatbuffers is a memory-efficient cross-platform serialization library, as
described [here](https://google.github.io/flatbuffers/). It is used by TfLite to store models and is also a dependency
of the delegate. After downloading the right version it can be built and installed using cmake.
```bash
cd $BASEDIR
wget -O flatbuffers-1.12.0.zip https://github.com/google/flatbuffers/archive/v1.12.0.zip
unzip -d . flatbuffers-1.12.0.zip
cd flatbuffers-1.12.0
mkdir install && mkdir build && cd build
# I'm using a separate install directory but that is not required
cmake .. -DCMAKE_INSTALL_PREFIX:PATH=$BASEDIR/flatbuffers-1.12.0/install
make install
```

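The install step places the headers, the `flatc` compiler and the library under the chosen prefix. To sanity-check the install (assuming the default Flatbuffers build options, which include `flatc`):
```bash
$BASEDIR/flatbuffers-1.12.0/install/bin/flatc --version # should report 1.12.0
ls $BASEDIR/flatbuffers-1.12.0/install/include/flatbuffers
```
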
## Build the Arm Compute Library

The ArmNN library depends on the Arm Compute Library (ACL), which provides a set of functions that are optimized for
both Arm CPUs and GPUs. ArmNN uses ACL directly to run machine learning workloads on Arm CPUs and GPUs.

It is important to have matching versions of ACL and ArmNN. Luckily, ArmNN and ACL are developed
very closely together and released together: if you would like to use ArmNN version "20.11", you can use the same "20.11"
version for ACL too.

To build the Arm Compute Library on your platform, download the Arm Compute Library, check out the branch
that contains the version you want to use, and build it using `scons`.
```bash
cd $BASEDIR
git clone https://review.mlplatform.org/ml/ComputeLibrary
cd ComputeLibrary/
git checkout <branch_name> # e.g. branches/arm_compute_20_11
# The machine used for this guide only has a Neon-capable CPU, which is why only "neon=1" is set. If
# your machine has an Arm GPU you can enable OpenCL support by adding `opencl=1 embed_kernels=1` to the command below.
scons arch=arm64-v8a neon=1 extra_cxx_flags="-fPIC" benchmark_tests=0 validation_tests=0
```

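By default `scons` places its output in the `build` subdirectory of the ComputeLibrary checkout. If the build finished successfully the compute libraries should be found there (the exact set of files depends on the build options chosen):
```bash
ls $BASEDIR/ComputeLibrary/build/libarm_compute*
```
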
## Build the ArmNN Library

After building ACL we can now continue with ArmNN. To do so, download the repository and check out the same
version as you did for ACL. Then create a build directory and use cmake to build it.
```bash
cd $BASEDIR
git clone "https://review.mlplatform.org/ml/armnn"
cd armnn
git checkout <branch_name> # e.g. branches/armnn_20_11
mkdir build && cd build
# if you've got an Arm GPU add `-DARMCOMPUTECL=1` to the command below
cmake .. -DARMCOMPUTE_ROOT=$BASEDIR/ComputeLibrary -DARMCOMPUTENEON=1 -DBUILD_UNIT_TESTS=0
make
```

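Once `make` finishes, the ArmNN shared library should sit in the build directory; a quick check:
```bash
ls $BASEDIR/armnn/build/libarmnn.so
```
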
# Build the TfLite Delegate (Stand-Alone)

The delegate, like ArmNN, is built using cmake. Create a build directory as usual and build the delegate
with the additional cmake arguments shown below.
```bash
cd $BASEDIR/armnn/delegate && mkdir build && cd build
# TENSORFLOW_LIB_DIR: directory where the tensorflow libraries can be found
# TENSORFLOW_ROOT:    the top directory of the tensorflow repository
# TFLITE_LIB_ROOT:    in our case the same as TENSORFLOW_LIB_DIR
# FLATBUFFERS_ROOT:   the flatbuffers install directory
# Armnn_DIR:          directory where the built ArmNN library can be found
# ARMNN_SOURCE_DIR:   the top directory of the ArmNN repository; required for the ArmNN includes
cmake .. -DTENSORFLOW_LIB_DIR=$BASEDIR/tensorflow/bazel-bin \
         -DTENSORFLOW_ROOT=$BASEDIR/tensorflow \
         -DTFLITE_LIB_ROOT=$BASEDIR/tensorflow/bazel-bin \
         -DFLATBUFFERS_ROOT=$BASEDIR/flatbuffers-1.12.0/install \
         -DArmnn_DIR=$BASEDIR/armnn/build \
         -DARMNN_SOURCE_DIR=$BASEDIR/armnn
make
```

To ensure that the build was successful you can run the unit tests for the delegate, which can be found in
the delegate's build directory. [Doctest](https://github.com/onqtam/doctest) was used to create those tests. Using test filters you can
exclude tests that your build is not configured for. In this case, because ArmNN was only built for CPU
acceleration (CpuAcc), we filter for all test suites that have `CpuAcc` in their name.
```bash
cd $BASEDIR/armnn/delegate/build
./DelegateUnitTests --test-suite=*CpuAcc*
```
If you have built for GPU acceleration as well you might want to change your test-suite filter:
```bash
./DelegateUnitTests --test-suite=*CpuAcc*,*GpuAcc*
```

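Doctest can also list the test suites compiled into the binary, which is handy for working out what your filter can match:
```bash
./DelegateUnitTests --list-test-suites
```
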
# Build the Delegate together with ArmNN

In the introduction it was mentioned that there is a way to integrate the delegate build into ArmNN. This is
pretty straightforward: the cmake arguments that were previously used for the delegate have to be added
to the ArmNN cmake arguments, and one additional argument, `BUILD_ARMNN_TFLITE_DELEGATE`, is needed to
instruct ArmNN to build the delegate as well. The new commands to build ArmNN are as follows:
```bash
cd $BASEDIR
git clone "https://review.mlplatform.org/ml/armnn"
cd armnn
git checkout <branch_name> # e.g. branches/armnn_20_11
mkdir build && cd build
# if you've got an Arm GPU add `-DARMCOMPUTECL=1` to the command below
cmake .. -DARMCOMPUTE_ROOT=$BASEDIR/ComputeLibrary \
         -DARMCOMPUTENEON=1 \
         -DBUILD_UNIT_TESTS=0 \
         -DBUILD_ARMNN_TFLITE_DELEGATE=1 \
         -DTENSORFLOW_LIB_DIR=$BASEDIR/tensorflow/bazel-bin \
         -DTENSORFLOW_ROOT=$BASEDIR/tensorflow \
         -DTFLITE_LIB_ROOT=$BASEDIR/tensorflow/bazel-bin \
         -DFLATBUFFERS_ROOT=$BASEDIR/flatbuffers-1.12.0/install
make
```
The delegate library can then be found in `build/armnn/delegate`.

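As before, a quick `ls` confirms that the delegate library was produced (assuming the location mentioned above):
```bash
ls $BASEDIR/armnn/build/armnn/delegate/libarmnnDelegate.so
```
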
# Integrate the ArmNN TfLite Delegate into your project

The delegate can be integrated into your C++ project by creating a TfLite Interpreter and
instructing it to use the ArmNN delegate for the graph execution. This should look similar
to the following code snippet.
```cpp
// Assumed headers (include paths may differ in your project):
#include <armnn_delegate.hpp>
#include <tensorflow/lite/interpreter.h>
#include <tensorflow/lite/kernels/register.h>
#include <tensorflow/lite/model.h>

// tfLiteModel (the loaded TfLite model) and backends (the list of ArmNN
// backends to use, e.g. CpuAcc) are assumed to be defined at this point.

// Create TfLite Interpreter
std::unique_ptr<tflite::Interpreter> armnnDelegateInterpreter;
tflite::InterpreterBuilder(tfLiteModel, tflite::ops::builtin::BuiltinOpResolver())
    (&armnnDelegateInterpreter);

// Create the ArmNN Delegate
armnnDelegate::DelegateOptions delegateOptions(backends);
std::unique_ptr<TfLiteDelegate, decltype(&armnnDelegate::TfLiteArmnnDelegateDelete)>
    theArmnnDelegate(armnnDelegate::TfLiteArmnnDelegateCreate(delegateOptions),
                     armnnDelegate::TfLiteArmnnDelegateDelete);

// Instruct the Interpreter to use the armnnDelegate
armnnDelegateInterpreter->ModifyGraphWithDelegate(theArmnnDelegate.get());
```

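To give an idea of what building against the delegate looks like, here is one possible compile line for a program containing the snippet above. It assumes the stand-alone delegate build from earlier; the include paths, library locations and the source file name (`main.cpp`) are illustrative and will likely need adjusting for your project:
```bash
g++ -std=c++14 main.cpp \
    -I $BASEDIR/armnn/include \
    -I $BASEDIR/armnn/delegate/include \
    -I $BASEDIR/tensorflow \
    -I $BASEDIR/flatbuffers-1.12.0/install/include \
    -L $BASEDIR/armnn/build -larmnn \
    -L $BASEDIR/armnn/delegate/build -larmnnDelegate \
    -L $BASEDIR/tensorflow/bazel-bin -ltensorflow_lite_all \
    -o tflite_armnn_example
```
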
For further information on using TfLite Delegates,
please visit the [tensorflow website](https://www.tensorflow.org/lite/guide).