Reducing Model Inference Cost with Image Gridding
Abstract
The success of AI has spurred the rise of Machine Learning as a Service (MLaaS), where companies develop, maintain, and serve general-purpose models such as object detectors and image classifiers for users who pay a fixed rate per inference. As more organizations rely on AI, the MLaaS market is set to expand, making cost optimization essential for these services. We explore a simple yet effective method for increasing model efficiency: aggregating multiple images into a grid before a single inference call. This can significantly reduce the number of inferences required to process a batch of images, at the cost of some accuracy. Experiments on open-source and commercial models show that image gridding reduces the number of inferences by 50% while incurring only a small drop in mean average precision (mAP) on the Pascal VOC object detection task.
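The core operation the abstract describes, tiling several images into one composite before inference, can be sketched as follows. This is a minimal illustration, not the paper's implementation; the helper name `make_grid` and the 2x2 layout are assumptions for the example.

```python
import numpy as np

def make_grid(images, rows=2, cols=2):
    """Tile equal-shaped H x W x C images into a rows x cols composite,
    so one model inference covers rows * cols images.
    Hypothetical helper for illustration, not the paper's code."""
    assert len(images) == rows * cols, "need exactly rows * cols images"
    # Concatenate each row of images horizontally, then stack rows vertically.
    strips = [np.concatenate(images[r * cols:(r + 1) * cols], axis=1)
              for r in range(rows)]
    return np.concatenate(strips, axis=0)

# Four 100x100 RGB images become one 200x200 grid:
# a batch of 4 now costs 1 inference instead of 4.
imgs = [np.zeros((100, 100, 3), dtype=np.uint8) for _ in range(4)]
grid = make_grid(imgs)
print(grid.shape)  # (200, 200, 3)
```

Detections on the grid would then need to be mapped back to the originating cell; the accuracy drop the abstract mentions comes from objects shrinking relative to the model's input resolution.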