In this paper we propose a unified framework for structured prediction with hidden variables which includes hidden conditional random fields and latent structural support vector machines as special cases. We describe an approximation for this general structured prediction formulation using duality, which is based on a local entropy approximation. We then derive an efficient message passing algorithm that is guaranteed to converge, and demonstrate its effectiveness in the tasks of image segmentation as well as 3D indoor scene understanding from single images, showing that our approach is superior to latent structural support vector machines and hidden conditional random fields.