A Novel Automate Python Edge-to-Edge: From Automated Generation on Cloud to User Application Deployment on Edge of Deep Neural Networks for Low Power IoT Systems FPGA-Based Acceleration

Tarek Belabed; Vitor Ramos Gomes da Silva; Alexandre Quenon; Carlos Valderamma; Chokri Souani

doi:10.3390/s21186050

Sensors (Basel). 2021 Sep 9;21(18):6050. doi: 10.3390/s21186050.

Authors

Tarek Belabed¹ ² ³, Vitor Ramos Gomes da Silva¹, Alexandre Quenon¹, Carlos Valderamma¹, Chokri Souani⁴

Affiliations

¹ Electronics and Microelectronics Unit (SEMi), University of Mons, 7000 Mons, Belgium.
² Ecole Nationale d'Ingénieurs de Sousse, Université de Sousse, Sousse 4000, Tunisia.
³ Laboratoire de Microélectronique et Instrumentation, Faculté des Sciences de Monastir, Université de Monastir, Monastir 5019, Tunisia.
⁴ Institut Supérieur des Sciences Appliquées et de Technologie de Sousse, Université de Sousse, Sousse 4003, Tunisia.

Abstract

Deploying Deep Neural Networks (DNNs) in IoT Edge applications requires strong hardware and software skills. This paper proposes a novel, fully automated design framework that performs such deployments on System-on-Chips (SoCs). Built on a high-level Python interface that mimics the leading Deep Learning software frameworks, it offers an easy way to implement a hardware-accelerated DNN on an FPGA. The design methodology covers three main phases: (a) customization, where the user specifies the optimizations needed for each DNN layer; (b) generation, where the framework generates on the Cloud the binaries needed for both the FPGA and software parts; and (c) deployment, where the SoC on the Edge receives the resulting files, which are used to program the FPGA, together with the related Python libraries for user applications. Among the case studies, an optimized DNN for the MNIST database runs more than 60× faster than a software version on the ZYNQ 7020 SoC while consuming less than 0.43 W. A comparison with state-of-the-art frameworks shows that the methodology offers the best trade-off between throughput, power consumption, and system cost.
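
The three phases described above can be pictured with a minimal Python sketch. It is an illustration only, not the framework's actual interface: every name in it (`LayerSpec`, `customize`, `generate`, `deploy`, the `unroll_factor` option, and the file names) is a hypothetical placeholder, and the generation step merely names its outputs instead of running any synthesis.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LayerSpec:
    """One DNN layer plus user-chosen optimizations (hypothetical fields)."""
    name: str
    units: int
    optimizations: Dict[str, int] = field(default_factory=dict)


def customize() -> List[LayerSpec]:
    # Phase (a): the user describes each layer and the optimizations it needs.
    # "unroll_factor" is an assumed option name, used here only as an example.
    return [
        LayerSpec("dense_1", units=64, optimizations={"unroll_factor": 8}),
        LayerSpec("dense_2", units=10, optimizations={"unroll_factor": 2}),
    ]


def generate(layers: List[LayerSpec], project: str) -> Dict[str, str]:
    # Phase (b): stand-in for the Cloud-side build. A real flow would run
    # synthesis; this sketch only names the artifacts it would produce.
    return {
        "bitstream": f"{project}.bit",
        "python_library": f"{project}_driver.py",
        "layer_report": "; ".join(
            f"{layer.name}={layer.optimizations}" for layer in layers
        ),
    }


def deploy(artifacts: Dict[str, str]) -> List[str]:
    # Phase (c): the Edge SoC receives the file that programs the FPGA
    # and the Python library used by the user application.
    return [artifacts["bitstream"], artifacts["python_library"]]


layers = customize()
artifacts = generate(layers, project="mnist_dnn")
deployed_files = deploy(artifacts)
print(deployed_files)
```

The sketch keeps the phases as separate functions so that the Cloud-side step and the Edge-side step exchange nothing but a small set of named files, which mirrors the split the abstract describes.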

Keywords: Python framework; cloud computing; deep neural networks (DNNs); edge computing; field programmable gate array (FPGA); hardware acceleration; high-level synthesis (HLS) tools; internet of things (IoT); low-cost; low-power.

MeSH terms

Acceleration
Computers
Neural Networks, Computer*
Software*