Improve Model Inference Cost with Image Gridding
Shreyas Krishnaswamy · Lisa Dunlap · Lingjiao Chen · Matei Zaharia · James Zou · Joseph Gonzalez

The success of AI has spurred the rise of Machine Learning as a Service (MLaaS), in which companies develop, maintain, and serve general-purpose models such as object detectors and image classifiers for users who pay a fixed rate per inference. As more organizations rely on AI, the MLaaS market is set to expand, making cost optimization for these services increasingly important. We explore a simple yet effective way to increase model efficiency: aggregating multiple images into a grid before inference. This can significantly reduce the number of inferences required to process a batch of images, at the cost of varying drops in accuracy. Experiments on open-source and commercial models show that image gridding reduces the number of inferences by 50% while having a low impact on mean average precision (mAP) on the Pascal VOC object detection task.
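To make the idea concrete, the following is a minimal sketch of the bookkeeping that gridding implies, not the authors' implementation. It assumes every image is resized to a common cell size and tiled in row-major order, and that a detection is assigned to the cell containing its center; the function names (`inferences_needed`, `cell_offsets`, `unmap_box`) and these conventions are hypothetical choices made for illustration.

```python
import math
from typing import List, Tuple

# (x_min, y_min, x_max, y_max) in pixels
Box = Tuple[float, float, float, float]


def inferences_needed(num_images: int, rows: int, cols: int) -> int:
    """Number of model calls when rows * cols images share one grid."""
    return math.ceil(num_images / (rows * cols))


def cell_offsets(rows: int, cols: int, cell_w: int, cell_h: int) -> List[Tuple[int, int]]:
    """Top-left pixel offset of each grid cell, in row-major order."""
    return [(c * cell_w, r * cell_h) for r in range(rows) for c in range(cols)]


def unmap_box(box: Box, rows: int, cols: int, cell_w: int, cell_h: int) -> Tuple[int, Box]:
    """Map a detection on the grid image back to one source image.

    Assumption: the box belongs to the cell that contains its center.
    Returns (cell index, box in that cell's coordinates), clipped to the cell.
    """
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    col = max(0, min(int(cx // cell_w), cols - 1))
    row = max(0, min(int(cy // cell_h), rows - 1))
    ox, oy = col * cell_w, row * cell_h

    def clip(value: float, upper: int) -> float:
        return max(0.0, min(float(upper), value))

    local = (
        clip(x0 - ox, cell_w),
        clip(y0 - oy, cell_h),
        clip(x1 - ox, cell_w),
        clip(y1 - oy, cell_h),
    )
    return row * cols + col, local
```

Under these assumptions, a grid of two images halves the number of paid inferences for a batch, and each detection returned for the grid is translated back into the coordinate frame of the image it came from before accuracy is scored.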

Author Information

Shreyas Krishnaswamy (University of California, Berkeley)

Shreyas Krishnaswamy is an MS student at UC Berkeley working on ML/Systems to improve the efficiency of serving and querying ML models in the cloud.

Lisa Dunlap (UC Berkeley)
Lingjiao Chen (University of Wisconsin-Madison)
Matei Zaharia (Stanford and Databricks)
James Zou (Stanford University)
Joseph Gonzalez (UC Berkeley)
