Achieving High TinyML Accuracy through Selective Cloud Interactions
Anil Kag · Igor Fedorov · Aditya Gangrade · Paul Whatmough · Venkatesh Saligrama

Fri Jul 22 02:00 PM -- 02:15 PM (PDT)
Edge devices provide inference on predictive tasks to many end-users. However, deploying neural networks that achieve state-of-the-art accuracy on the edge is infeasible due to resource constraints. Cloud-only processing is also problematic, since uploading large amounts of data imposes severe communication bottlenecks. We propose a novel end-to-end hybrid learning framework that allows the edge to selectively query only those hard examples that the cloud classifies correctly. The framework trains the edge predictor, the cloud predictor, and the router to maximize accuracy while minimizing latency. Training a hybrid learner is difficult because we lack annotations of which examples are hard for the edge. We introduce a novel proxy supervision in this context and show that our method adapts near-optimally across different latency regimes. On the ImageNet dataset, our proposed method deployed on a micro-controller unit achieves a $25\%$ reduction in latency compared to cloud-only processing while suffering no excess loss.
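
The routing idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `proxy_routing_labels` and `hybrid_predict`, the threshold rule, and the specific labeling rule (send an example to the cloud when the edge is wrong and the cloud is right) are assumptions made for illustration, chosen to match the abstract's description of querying "only those hard examples that the cloud classifies correctly."

```python
from typing import List, Sequence, Tuple


def proxy_routing_labels(
    edge_preds: Sequence[int],
    cloud_preds: Sequence[int],
    targets: Sequence[int],
) -> List[int]:
    """Hypothetical proxy supervision for a router.

    Assumed rule: label an example 1 (query the cloud) only when the
    edge prediction is wrong and the cloud prediction is right.
    """
    return [
        int(e != y and c == y)
        for e, c, y in zip(edge_preds, cloud_preds, targets)
    ]


def hybrid_predict(
    edge_preds: Sequence[int],
    cloud_preds: Sequence[int],
    router_scores: Sequence[float],
    threshold: float = 0.5,
) -> Tuple[List[int], float]:
    """Combine edge and cloud predictions using router scores.

    Examples whose router score exceeds `threshold` use the cloud
    prediction; all others stay on the edge. Returns the combined
    predictions and the fraction of examples sent to the cloud.
    """
    preds: List[int] = []
    n_cloud = 0
    for e, c, s in zip(edge_preds, cloud_preds, router_scores):
        if s > threshold:
            preds.append(c)
            n_cloud += 1
        else:
            preds.append(e)
    cloud_fraction = n_cloud / len(preds) if preds else 0.0
    return preds, cloud_fraction


# Toy usage with made-up predictions (not data from the paper).
targets = [0, 1, 2, 3]
edge = [0, 1, 0, 0]
cloud = [0, 1, 2, 1]
labels = proxy_routing_labels(edge, cloud, targets)
combined, cloud_fraction = hybrid_predict(edge, cloud, [float(v) for v in labels])
```

In this toy case a router that reproduces the proxy labels exactly queries the cloud for a single example, the one the edge gets wrong and the cloud gets right. Raising or lowering `threshold` is one simple way to trade accuracy against the fraction of cloud queries, which stands in here for the latency budget.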