Journal of System Simulation

Abstract

Abstract: Deep neural network models are difficult to deploy effectively on embedded terminals because of their excessive number of parameters, and one solution is model miniaturization (such as model quantization and knowledge distillation). To address this problem, a quantization training algorithm based on adaptive learning of quantization scale factors with BN folding (referred to as the LSQ-BN algorithm) is proposed. A single CNN (convolutional neural network) is used to construct BN folding and fuse BN with the CNN. During quantization training, the quantization scale factors are set as model parameters. An adaptive initialization scheme for the quantization scale factors is proposed to solve the problem that these factors are difficult to initialize. The experimental results show that the accuracy of the quantized model is almost the same as that of the FP32 pre-trained model when both the weights and the activations are quantized to 8 bits. When the weights are quantized to 4 bits and the activations to 8 bits, the accuracy loss of the quantized model is within 3%. These results indicate that the proposed LSQ-BN is an effective model quantization algorithm.
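The two building blocks named in the abstract, BN folding and a quantization scale factor, can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names `fold_bn`, `fake_quantize`, and `init_scale_max_abs` are hypothetical, the folding uses the standard BN-into-convolution identity, the quantizer is assumed to be symmetric and uniform, and the max-abs initialization is only a common baseline, not the adaptive initialization scheme the paper proposes. Learning the scale factor by gradient descent is omitted.

```python
import numpy as np

def fold_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm statistics into the preceding convolution.

    w: conv weights of shape (out_ch, in_ch, kh, kw); b: conv bias (out_ch,).
    gamma, beta, mean, var: per-output-channel BN parameters (out_ch,).
    Standard identity, assumed here: BN(conv(x)) == conv_folded(x).
    """
    factor = gamma / np.sqrt(var + eps)
    w_folded = w * factor[:, None, None, None]
    b_folded = (b - mean) * factor + beta
    return w_folded, b_folded

def fake_quantize(x, scale, num_bits=8):
    """Symmetric uniform fake quantization (assumed form).

    Values are scaled, rounded, clipped to the signed integer range,
    then rescaled back to floating point.
    """
    qmin = -(2 ** (num_bits - 1))
    qmax = 2 ** (num_bits - 1) - 1
    q = np.clip(np.round(x / scale), qmin, qmax)
    return q * scale

def init_scale_max_abs(x, num_bits=8):
    """Baseline scale initialization (hypothetical, not the paper's scheme)."""
    qmax = 2 ** (num_bits - 1) - 1
    return float(np.max(np.abs(x))) / qmax
```

In a quantization training setup of the kind the abstract describes, `scale` would be a trainable parameter applied to the folded weights, with a straight-through estimator typically used for the rounding step; both are left out of this sketch.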

First Page

1639

Revised Date

2021-06-09

Last Page

1650

CLC

TP391

Recommended Citation

Hui Nie, Kangshun Li, Yang Su. A Quantization Training Algorithm of Adaptive Learning Quantization Scale Factors[J]. Journal of System Simulation, 2022, 34(7): 1639-1650.

Corresponding Author

Kangshun Li, likangshun@sina.com

DOI

10.16182/j.issn1004731x.joss.21-0175
