A Review of State-of-the-Art Mixed-Precision Neural Network Frameworks

Mariam Rakka; Mohammed E Fouda; Pramod Khargonekar; Fadi Kurdahi

doi:10.1109/TPAMI.2024.3394390

IEEE Trans Pattern Anal Mach Intell. 2024 Apr 29:PP. doi: 10.1109/TPAMI.2024.3394390. Online ahead of print.

PMID: 38683716

Abstract

Mixed-precision Deep Neural Networks (DNNs) provide an efficient solution for hardware deployment, especially under resource constraints, while maintaining model accuracy. Identifying the ideal bit precision for each layer, however, remains a challenge given the vast array of models, datasets, and quantization schemes, leading to an expansive search space. Recent literature has addressed this challenge, resulting in several promising frameworks. This paper offers a comprehensive overview of the standard quantization classifications prevalent in existing studies. A detailed survey of current mixed-precision frameworks is provided, with an in-depth comparative analysis highlighting their respective merits and limitations. The paper concludes with insights into potential avenues for future research in this domain.
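To make the search-space remark concrete, here is a minimal sketch that is not taken from the paper. It assumes a symmetric uniform quantizer and a hypothetical two-layer model with made-up layer names and bit-widths. Under these assumptions, a network with 50 layers and 4 candidate bit-widths per layer already admits 4^50 (about 1.27e30) distinct per-layer assignments, which is why exhaustive search is impractical.

```python
import random


def quantize_uniform(weights, bits):
    """Symmetric uniform quantization to a signed `bits`-bit grid.

    This scheme is an illustrative assumption, not one prescribed by the paper.
    """
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits")
    qmax = 2 ** (bits - 1) - 1                 # largest integer level, e.g. 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)                   # all-zero tensor: nothing to quantize
    scale = max_abs / qmax                     # step size between adjacent levels
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def search_space_size(num_layers, num_choices):
    """Number of per-layer bit-width assignments: num_choices ** num_layers."""
    return num_choices ** num_layers


# Hypothetical model: flattened weights for two layers (names are illustrative).
random.seed(0)
layers = {
    "conv1": [random.gauss(0.0, 1.0) for _ in range(256)],
    "fc": [random.gauss(0.0, 1.0) for _ in range(256)],
}

# One candidate mixed-precision assignment: a different bit-width per layer.
bit_config = {"conv1": 8, "fc": 4}

# Per-layer quantization error under this assignment.
errors = {
    name: mse(w, quantize_uniform(w, bit_config[name]))
    for name, w in layers.items()
}

for name, err in errors.items():
    print(f"{name}: {bit_config[name]} bits, MSE = {err:.6f}")
print("assignments for 50 layers, 4 choices:", search_space_size(50, 4))
```

A real framework would score each assignment by task accuracy and hardware cost rather than weight MSE. The error metric here only stands in for that objective to keep the sketch self-contained.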