DefQ: Defensive Quantization Against Inference Slow-Down Attack for Edge Computing
15 February 2023
Novel multi-exit deep neural network (DNN) architectures provide a new optimization solution for efficient model inference in edge systems. Inference for most samples can be completed within the first few layers on an edge device, without transmitting them to a remote server. This significantly increases inference speed and system throughput, which is particularly beneficial in resource-constrained scenarios. Unfortunately, researchers have proposed an inference slow-down attack against this technique, in which an external adversary adds imperceptible perturbations to clean samples to invalidate the multi-exit mechanism. In this article, we propose a defensive quantization (DefQ) method as the first defense against the inference slow-down attack. It is designed to be lightweight and can be easily implemented in off-the-shelf camera sensors. Specifically, DefQ introduces a novel quantization operation to preprocess the input images. It removes the perturbations from malicious samples while preserving the correct inference exit points and prediction accuracy, and it has little impact on clean samples. Extensive evaluations show that DefQ effectively defeats the inference slow-down attack and preserves the efficiency of edge systems.
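The abstract does not specify DefQ's exact quantization operation, but the general idea of quantization-based input preprocessing can be sketched as follows: pixel values are mapped to a small number of discrete levels, so that imperceptible adversarial perturbations collapse back onto the same level as the clean pixel. The function name, the `bits` parameter, and the uniform-quantization scheme below are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def quantize_input(image, bits=4):
    """Illustrative defensive preprocessing: uniform bit-depth reduction.

    Maps 8-bit pixel values onto 2**bits discrete levels, so small
    perturbations within a quantization bin are washed out.
    NOTE: this is a generic sketch, not the actual DefQ operation.
    """
    levels = 2 ** bits
    # Normalize to [0, 1], snap to the nearest of `levels` values,
    # then rescale back to the 8-bit range.
    img = np.asarray(image, dtype=np.float32) / 255.0
    quantized = np.round(img * (levels - 1)) / (levels - 1)
    return (quantized * 255.0).astype(np.uint8)
```

Under this sketch, a clean pixel and a slightly perturbed copy of it typically fall into the same quantization bin and thus produce identical preprocessed values, which is the intuition behind removing the attack's perturbations while leaving clean samples largely unchanged.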