Workshop

ML on a budget: IoT, Mobile and other tiny-ML applications

Manik Varma · Venkatesh Saligrama · Prateek Jain

Abstract

We routinely encounter scenarios where we must predict on a budget at test time. Feature costs in Internet, healthcare, and surveillance applications arise from feature extraction time and feature/sensor acquisition costs~\cite{trapeznikov:2013b}. Data analytics applications on mobile devices are often run on remote cloud services because of the limited device capabilities, which impose memory and prediction-time costs. In these settings, one needs to carefully understand the trade-off between accuracy and prediction cost. Uncertainty in the observations, which is typical in such scenarios, adds further to the complexity of the task and requires a careful understanding of both the uncertainty and the accuracy-cost trade-offs.

In this workshop, we aim to bring together researchers from various domains to discuss the key aspects of this emerging and critical topic. The goal is to provide a platform where ML/statistics/optimization researchers can interact closely with domain experts who need to deploy ML models in resource-constrained settings (such as IoT device makers), and to chart out the foundational problems in the area and the key tools that can be used to solve them.

Motivation
===================
Prediction under budget constraints is a critical problem that arises in several settings, such as medical diagnosis, search engines, and surveillance. In these applications, budget constraints arise from limits on computational cost, time, network throughput, and power consumption. For instance, in search engines, CPU cost at prediction time must be budgeted to enable business models such as online advertising. Search engines also face time constraints at prediction time, as users are known to abandon the service if the response time is not within a few tens of milliseconds. In another example, modern passenger screening systems impose constraints on throughput.

An extreme version of these problems appears in the Internet of Things (IoT) setting, where one requires prediction on tiny IoT devices that might have at most 2KB of RAM and no floating-point unit. IoT is considered to be the next multi-billion-dollar industry, with “smart” devices being designed for production lines, cars, retail stores, and even toothbrushes and spoons. Given that IoT-based solutions seem destined to permeate our day-to-day lives, ML-based prediction on the device becomes critical for several reasons, including privacy, battery life, and latency.

Learning under resource constraints departs from the traditional machine learning setting and introduces exciting new challenges. For instance, features are accompanied by costs (e.g., extraction time in search engines or true monetary values in medical diagnosis), and their amortized sum is constrained at test time. Also, different search strategies in prediction can have widely varying computational costs (e.g., binary search, linear search, dynamic programming). In other settings, a system must maintain a throughput constraint to keep pace with arriving traffic. In the IoT setting, the model itself has to be deployed in 2-16KB of RAM, which poses an extremely challenging constraint on the algorithm.
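
As a rough illustration of the feature-cost constraint described above, the following is a minimal sketch and not a method proposed by the workshop. It assumes a hypothetical linear scorer whose features are acquired cheapest-first, stopping once the running score is confident enough or the next feature would exceed the budget. The names `Feature`, `predict_on_budget`, `budget`, and `margin`, as well as all costs and weights, are invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Feature:
    name: str
    cost: float    # hypothetical acquisition cost (time, money, or energy)
    weight: float  # hypothetical weight in a linear scorer


def predict_on_budget(
    features: List[Feature],
    acquire: Callable[[str], float],
    budget: float,
    margin: float,
) -> Tuple[int, float, List[str]]:
    """Acquire features cheapest-first until the score is confident
    (|score| >= margin) or the next feature would exceed the budget.

    Returns (predicted label in {-1, +1}, total cost spent, features used).
    """
    score, spent, used = 0.0, 0.0, []
    for f in sorted(features, key=lambda f: f.cost):
        if abs(score) >= margin:
            break  # confident enough: stop paying for more features
        if spent + f.cost > budget:
            break  # features are sorted by cost, so none of the rest fit either
        score += f.weight * acquire(f.name)
        spent += f.cost
        used.append(f.name)
    label = 1 if score >= 0 else -1
    return label, spent, used


# Illustrative, made-up feature set for a diagnosis-style task.
features = [
    Feature("temperature", cost=1.0, weight=0.5),
    Feature("blood_test", cost=5.0, weight=2.0),
    Feature("mri", cost=50.0, weight=3.0),
]
readings = {"temperature": 1.0, "blood_test": -1.0, "mri": 1.0}

print(predict_on_budget(features, readings.__getitem__, budget=10.0, margin=1.0))
print(predict_on_budget(features, readings.__getitem__, budget=3.0, margin=1.0))
```

With the larger budget, the sketch stops after the blood test because the score is already confident; with the smaller budget, it must predict from the temperature reading alone. This is the accuracy-cost trade-off in miniature: a tighter budget forces a prediction from less evidence.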

The common aspect of all of these settings is that we must seek trade-offs between prediction accuracy and prediction cost. Studying this trade-off is an inherent challenge that needs to be investigated in a principled fashion in order to invent practically relevant machine learning algorithms. This problem lies at the intersection of ML, statistics, stochastic control, and information theory. We aim to draw researchers working on foundational, algorithmic, and application problems within these areas. We plan to organize a demo session that will showcase ML algorithms running live on various resource-constrained devices, demonstrating their effectiveness on challenging real-world tasks. In addition, we plan to invite Ofer Dekel from Microsoft Research to present a new platform for deploying ML on tiny devices, which should provide an easy way to deploy and compare various ML techniques on realistic devices and spur further research directions in this area.
