Poster Mon, Jul 6, 2026 • 6:30 PM – 8:15 PM PDT HALL A #900

SCalDA: Semantics-Calibrated and Diffusion-Enhanced Data Augmentation

Shibo Lv ⋅ Jianmin Jiang

Abstract

With the rapid development of deep learning, data scarcity has become an increasingly prominent problem, prompting growing interest in data augmentation techniques in recent years. However, our literature survey indicates that existing efforts often suffer from two forms of semantic infidelity: (i) visual semantic infidelity, such as visual artifacts, manifold intrusion, and unnatural blending boundaries, and (ii) label semantic infidelity, where augmented images do not match their original labels, introducing additional label noise. To address these issues, we propose a Semantics-Calibrated and Diffusion-Enhanced Data Augmentation (SCalDA) scheme that achieves accurate semantics calibration across the image, label, and feature domains. Compared with existing approaches, the proposed method features precise guidance in the label domain, semantics-driven synthesis across the three domains (image, label, and feature), and semantics-aware metric learning. Extensive experiments on multiple datasets demonstrate that SCalDA yields consistent and significant performance improvements on both fine-grained and general classification tasks, validating the effectiveness and broad applicability of the proposed method.

Lay Summary

Data augmentation is a common way to improve deep learning models when training data is limited. However, many existing methods suffer from a basic problem: the new images they create do not faithfully preserve semantics. Some introduce visible artifacts or unnatural boundaries, or distort the natural image structure, while others assign labels that no longer match what the image actually contains, adding noise during training. To address this, we propose SCalDA, a data augmentation method that better aligns image semantics, label semantics, and feature semantics. It combines more precise label guidance, diffusion-based semantics-driven image synthesis, and semantics-aware metric learning to improve the quality of augmented samples. Experiments on multiple datasets show that SCalDA brings consistent and significant gains on both fine-grained and general classification tasks.