Institute for Communication Technologies and Embedded Systems

NEWROMAP: mapping CNNs to NoC-interconnected self-contained data-flow accelerators for edge-AI

Authors:
Joseph, J. M.; Baloglu, M. S.; Pan, Y.; Leupers, R.; Bamberg, L.
Book Title:
NOCS '21: Proceedings of the 15th IEEE/ACM International Symposium on Networks-on-Chip
Pages:
p. 15–20
Date:
Oct. 2021
DOI:
10.1145/3479876.3481591
hsb:
RWTH-2021-11817
Language:
English
Abstract:
Conventional AI accelerators are limited by the von Neumann bottleneck for edge workloads. Domain-specific accelerators (often neuromorphic) solve this by applying near/in-memory computing, NoC-interconnected massively multicore setups, and data-flow computation. This requires an effective mapping of neural networks (i.e., an assignment of network layers to cores) to balance memory resources, computation, and NoC traffic. Here, we introduce a mapping strategy called Snake for the predominant convolutional neural networks (CNNs). It exploits the feed-forward nature of CNNs by folding layers onto spatially adjacent cores. We achieve a total NoC bandwidth improvement of up to 3.8x for MobileNet and ResNet compared to random mappings. Furthermore, we propose NEWROMAP, which further optimizes the Snake mapping through a meta-heuristic; it also simulates the NoC traffic and works with TensorFlow models. Simulations show that communication is optimized further, with a latency improvement of up to 22.52% compared to the pure Snake mapping.
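To illustrate the snake-style placement described in the abstract, the following is a minimal sketch (not the authors' NEWROMAP implementation): consecutive CNN layers are assigned to spatially adjacent cores of a 2D mesh NoC by traversing the mesh row by row in alternating (boustrophedon) order. The layer names, mesh size, one-layer-per-core assumption, and the Manhattan hop-count metric are illustrative assumptions, not details taken from the paper.

```python
# Sketch of a snake (boustrophedon) layer-to-core mapping on a 2D mesh NoC.
# Assumptions: one layer per core, XY routing, hop count as a traffic proxy.

from typing import Dict, List, Tuple

Coord = Tuple[int, int]  # (row, column) of a core in the mesh


def snake_order(rows: int, cols: int) -> List[Coord]:
    """Enumerate mesh cores row by row, reversing every other row,
    so that consecutive positions are always one hop apart."""
    order: List[Coord] = []
    for r in range(rows):
        cs = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((r, c) for c in cs)
    return order


def map_layers_snake(layers: List[str], rows: int, cols: int) -> Dict[str, Coord]:
    """Assign each layer (in feed-forward order) to the next core in snake order."""
    cores = snake_order(rows, cols)
    if len(layers) > len(cores):
        raise ValueError("more layers than cores in this one-layer-per-core sketch")
    return {layer: cores[i] for i, layer in enumerate(layers)}


def total_hops(mapping: Dict[str, Coord], layers: List[str]) -> int:
    """Sum of Manhattan hop counts between consecutive layers,
    a rough proxy for the NoC traffic caused by inter-layer feature maps."""
    hops = 0
    for a, b in zip(layers, layers[1:]):
        (r1, c1), (r2, c2) = mapping[a], mapping[b]
        hops += abs(r1 - r2) + abs(c1 - c2)
    return hops


if __name__ == "__main__":
    layers = [f"conv{i}" for i in range(1, 13)]          # hypothetical 12-layer CNN
    mapping = map_layers_snake(layers, rows=4, cols=4)   # hypothetical 4x4 mesh
    print(mapping)
    print("total hops:", total_hops(mapping, layers))    # 11: every transfer is 1 hop
```

With this placement every inter-layer transfer travels exactly one hop, whereas a random assignment of the same 12 layers to a 4x4 mesh typically incurs several hops per transfer; this is the intuition behind the bandwidth improvement reported in the abstract, while NEWROMAP's meta-heuristic and NoC simulation refine such an initial mapping further.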