ARM-on-ARM: Leveraging Virtualization Extensions for Fast Virtual Platforms
- Jünger, L., Bölke, J., Tobies, S., Hoffmann, A., Leupers, R.
- Book Title: Proceedings of the Conference on Design, Automation & Test in Europe (DATE)
Abstract: Virtual Platforms (VPs) are an essential enabling
technology in the System-on-a-Chip (SoC) development cycle.
They are used for early software development and hardware/software codesign. However, since virtual prototyping is limited by
simulation performance, improving the simulation speed of VPs
has been an active research topic for years. Different strategies
have been proposed, such as fast instruction set simulation using
Dynamic Binary Translation (DBT). But even fast simulators
do not reach native execution speed. They do, however, allow
executing rich Operating System (OS) kernels, which is typically
infeasible when another OS is already running.
Executing multiple OSs on shared physical hardware is typically accomplished by using virtualization, which has a long
history on x86 hardware. It enables encapsulated, native code
execution on the host processor and has been extensively used
in data centers, where many users share hardware resources.
When it comes to embedded systems, virtualization has only
recently become available. For ARM processors, virtualization
was introduced with the ARM Virtualization Extensions for the
ARMv7 architecture. Since virtualization allows native guest code
execution, near-native execution speeds can be reached.
In this work we present a VP containing a novel ARMv8
SystemC Transaction Level Modeling 2.0 (TLM) compatible
processor model. The model leverages the ARM Virtualization
Extensions (VE) via the Linux Kernel-based Virtual Machine
(KVM) to execute the target software natively on an ARMv8
host. To enable the integration of the processor model into a
loosely-timed VP, we developed an accurate instruction counting
mechanism using the ARM Performance Monitors Extension
(PMU). The requirements for integrating the processor model
into a VP and the integration process are detailed in this work.
Our evaluations show that speedups of up to 2.57x over
a state-of-the-art DBT-based simulator can be achieved using our
processor model on ARMv8 hardware.
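
The abstract does not spell out how an instruction count feeds a loosely-timed VP. As a rough illustration only, and not the authors' implementation, the sketch below assumes a fixed clock frequency, a fixed cycles-per-instruction value, and a fixed time quantum, and converts between retired instructions and simulated time. All names here (`instructions_per_quantum`, `elapsed_ns`, `advance`, `run_guest`) are hypothetical; `run_guest` stands in for a natively executing guest whose retired instructions would be counted by the PMU.

```python
NS_PER_S = 1_000_000_000


def instructions_per_quantum(clock_hz, quantum_ns, cpi=1):
    """Instruction budget for one quantum (assumes a constant CPI)."""
    cycles = clock_hz * quantum_ns // NS_PER_S
    return cycles // cpi


def elapsed_ns(instructions, clock_hz, cpi=1):
    """Simulated time consumed by a number of retired instructions."""
    return instructions * cpi * NS_PER_S // clock_hz


def advance(run_guest, clock_hz, quantum_ns, total_ns, cpi=1):
    """Run the guest in quantum-sized slices until total_ns has elapsed.

    run_guest(budget) is a hypothetical callback that executes the guest
    and returns how many instructions it actually retired.
    Returns (simulated time in ns, total retired instructions).
    """
    now_ns = 0
    retired_total = 0
    budget = instructions_per_quantum(clock_hz, quantum_ns, cpi)
    while now_ns < total_ns:
        retired = run_guest(budget)
        retired_total += retired
        if retired == 0:
            # Assumption: an idle guest still consumes a full quantum.
            now_ns += quantum_ns
        else:
            now_ns += elapsed_ns(retired, clock_hz, cpi)
    return now_ns, retired_total
```

With an assumed 1 GHz clock and a 1 ms quantum, each slice would carry a budget of one million instructions; the rest of the VP would be synchronized once per slice.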