- Authors:
- Joseph, J. M.; Baloglu, M. S.; Pan, Y.; Leupers, R.; Bamberg, L.
- Book Title:
- NOCS '21: Proceedings of the 15th IEEE/ACM International Symposium on Networks-on-Chip
- Pages:
- p. 15–20
- Date:
- Oct. 2021
- DOI:
- 10.1145/3479876.3481591
- hsb:
- RWTH-2021-11817
- Language:
- English
Abstract
Conventional AI accelerators are limited by von Neumann bottlenecks for edge workloads. Domain-specific accelerators (often neuromorphic) solve this by applying near/in-memory computing, NoC-interconnected massive-multicore setups, and data-flow computation. This requires an effective mapping of neural networks (i.e., an assignment of network layers to cores) to balance resources/memory, computation, and NoC traffic. Here, we introduce a mapping called Snake for the predominant convolutional neural networks (CNNs). It exploits the feed-forward nature of CNNs by folding layers onto spatially adjacent cores. We achieve a total NoC bandwidth improvement of up to 3.8× for MobileNet and ResNet vs. random mappings. Furthermore, we propose NEWROMAP, which further optimizes the Snake mapping through a meta-heuristic; it also simulates the NoC traffic and can work with TensorFlow models. In simulations, it improves communication latency by up to 22.52% vs. pure Snake mapping.
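
The idea of folding consecutive layers onto spatially adjacent cores can be illustrated with a minimal sketch. It assumes a 2D mesh NoC and a serpentine (boustrophedon) walk over the cores, which may differ from the paper's actual Snake mapping; the function names `snake_order` and `map_layers` and the per-layer core counts are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Coord = Tuple[int, int]  # (row, column) of a core in the mesh


def snake_order(rows: int, cols: int) -> List[Coord]:
    """Walk a rows x cols mesh in serpentine order.

    Even rows are visited left to right, odd rows right to left, so any
    two consecutive cores in the returned list are mesh neighbours.
    """
    order: List[Coord] = []
    for r in range(rows):
        columns = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((r, c) for c in columns)
    return order


def map_layers(cores_per_layer: List[int], rows: int, cols: int) -> List[List[Coord]]:
    """Assign each layer a contiguous run of cores along the serpentine walk.

    cores_per_layer[i] is an assumed input: the number of cores layer i needs.
    Because CNN layers feed forward, layer i and layer i+1 end up on
    neighbouring stretches of the walk.
    """
    order = snake_order(rows, cols)
    if sum(cores_per_layer) > len(order):
        raise ValueError("mesh has too few cores for the requested layers")
    mapping: List[List[Coord]] = []
    start = 0
    for count in cores_per_layer:
        mapping.append(order[start:start + count])
        start += count
    return mapping


if __name__ == "__main__":
    # Three hypothetical layers needing 2, 3 and 1 cores on a 2 x 3 mesh.
    for layer, cores in enumerate(map_layers([2, 3, 1], rows=2, cols=3)):
        print(f"layer {layer}: {cores}")
```

In this sketch the last core of one layer is always a direct mesh neighbour of the first core of the next layer, which is the property that keeps inter-layer traffic local. A meta-heuristic such as the one the abstract attributes to NEWROMAP could start from a placement like this and refine it.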