FakeWorld 1.0: An Omni-modal Benchmark for Fake Media and Content
Abstract
The accelerating realism of AI-generated content has amplified the spread of deceptive information and eroded public trust. Prior work typically splits the problem into two tracks: media authenticity, which concerns whether content is real or AI-generated, and content veracity, which concerns semantic and factual correctness. This split misses their joint effects in practice. We present FakeWorld 1.0, a unified omni-modal benchmark that deeply fuses these two orthogonal axes. Along the media axis, FakeWorld spans text, audio, image, and video synthesis; along the content axis, it instantiates cross-modal semantic inconsistencies and factual errors. These axes are jointly instantiated within realistic web-based and streaming-style presentation scenarios, reflecting how multimodal deception is composed and delivered in the wild. FakeWorld provides explainable annotations in the form of per-instance rationales, enabling transparent, evidence-based diagnosis. Under a unified protocol, our evaluation of open- and closed-source MLLMs exposes their capability limits and highlights FakeWorld's effectiveness at surfacing mixed-source, high-fidelity deception. Beyond the benchmark, we introduce OmniCheck, a unified omni-modal agentic workflow that performs explainable detection across both axes and outputs evidence-backed reports. We intend FakeWorld 1.0 to serve as a realistic stress test and a practical foundation for building future systems that enable scalable, explainable detection of fake multimodal content.