Why should we add early exits to neural networks?, Scardapane et al., 2020

Paper, Tags: #nlp, #frameworks

DNNs have historically been designed as a stack of differentiable layers, where a prediction is obtained only after running the full stack. Recently, several contributions have proposed techniques to endow networks with early exits, allowing predictions to be obtained at intermediate points of the stack. Advantages:

  • reduction in inference time
  • reduced tendency to overfitting and vanishing gradients
  • capability of being distributed over multi-tier computation platforms

The paper describes in a unified fashion how these architectures can be designed, trained, and actually deployed in time-constrained scenarios.

In early-exit networks, each intermediate point is equipped with an auxiliary classifier/regressor, implemented as a small auxiliary NN of 1-2 layers.
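
A minimal sketch (not taken from the paper) of such an architecture in PyTorch: a shared backbone of blocks with a small auxiliary classifier attached after each block. Layer sizes, the number of blocks, and the use of a plain MLP backbone are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiExitMLP(nn.Module):
    def __init__(self, in_dim=784, hidden_dim=256, num_classes=10, num_blocks=3):
        super().__init__()
        # Backbone: a stack of simple fully connected blocks.
        self.blocks = nn.ModuleList(
            [nn.Sequential(nn.Linear(in_dim if i == 0 else hidden_dim, hidden_dim),
                           nn.ReLU())
             for i in range(num_blocks)]
        )
        # One small auxiliary classifier (early exit) per block.
        self.exits = nn.ModuleList(
            [nn.Linear(hidden_dim, num_classes) for _ in range(num_blocks)]
        )

    def forward(self, x):
        # Return the logits produced at every exit, from earliest to last.
        outputs = []
        for block, exit_head in zip(self.blocks, self.exits):
            x = block(x)
            outputs.append(exit_head(x))
        return outputs
```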

Training NNs with early exits
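
One common strategy is joint training: all exits are optimized together by minimizing a weighted sum of the per-exit losses. The sketch below (reusing the hypothetical MultiExitMLP from above) assumes uniform exit weights and a standard Adam optimizer; it is not the paper's exact recipe.

```python
import torch
import torch.nn as nn

model = MultiExitMLP()                       # from the sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
exit_weights = [1.0, 1.0, 1.0]               # one weight per exit (assumed uniform)

def training_step(x, y):
    optimizer.zero_grad()
    outputs = model(x)                       # list of logits, one per exit
    # Joint loss: weighted sum of the losses at every exit.
    loss = sum(w * criterion(out, y) for w, out in zip(exit_weights, outputs))
    loss.backward()
    optimizer.step()
    return loss.item()
```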

Inference in multi-exit networks
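
A hedged sketch of early-exit inference: exits are evaluated in order and computation stops at the first exit whose prediction is confident enough. Using the maximum softmax probability as the confidence measure and a fixed threshold are assumptions; other criteria (e.g. entropy) are possible.

```python
import torch

@torch.no_grad()
def early_exit_predict(model, x, threshold=0.9):
    # x is assumed to be a single example with batch dimension 1.
    for block, exit_head in zip(model.blocks, model.exits):
        x = block(x)
        probs = torch.softmax(exit_head(x), dim=-1)
        conf, pred = probs.max(dim=-1)
        # Stop at the first exit whose confidence exceeds the threshold.
        if conf.item() >= threshold:
            return pred, conf
    # Otherwise fall back to the final exit's prediction.
    return pred, conf
```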

Biological plausibility of multi-exit networks

Backpropagation in classical neural networks is generally considered biologically implausible for multiple reasons, e.g.: (i) the need to store the output of each intermediate layer f_i for re-use in the backward pass; (ii) the symmetry of the weight matrices between the forward and backward passes; and (iii) the perfect synchronization required by the process.

Distributed implementation of multi-exit networks
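
A hedged sketch of one possible distributed deployment: the first blocks and their exits run on a local device; if no local exit is confident enough, the intermediate activation is offloaded to a remote tier that runs the remaining blocks. `send_to_remote` is a hypothetical placeholder for the actual communication layer.

```python
import torch

@torch.no_grad()
def device_side_inference(model, x, local_exits=1, threshold=0.9):
    # Run only the first `local_exits` blocks on the device.
    for i in range(local_exits):
        x = model.blocks[i](x)
        probs = torch.softmax(model.exits[i](x), dim=-1)
        conf, pred = probs.max(dim=-1)
        if conf.item() >= threshold:
            return pred                  # answered locally, no communication
    return send_to_remote(x)             # hypothetical: offload the activation

def send_to_remote(activation):
    # Placeholder: a real system would transmit `activation` to a server
    # running the remaining blocks and return its prediction.
    raise NotImplementedError
```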

Open research challenges

  • How powerful are multi-output NNs?
  • Full integration in fog computing / IoT (FC/IoT) environments
  • What comes after having multiple exits?