PATCHCODE: Discrete Latent Predictive Learning for EEG Foundation Model
Abstract
EEG foundation models aim to learn transferable representations, yet EEG recordings are dominated by high-frequency noise and large cross-subject variability. Existing pretraining strategies such as masked autoencoding or autoregressive modeling often treat waveform reconstruction as the learning signal, making the objective sensitive to stochastic fluctuations rather than consistent neurophysiological structure. To address this mismatch, we propose \textbf{PATCHCODE}, a region-aware discrete predictive learning framework that keeps the encoder input continuous while introducing region-aware discrete codes as stable supervision targets. We pretrain a masked predictive encoder on continuous EEG patches with dual-granularity learning: it predicts missing patch-level representations to preserve fine spatiotemporal structure, while aligning them to discretized code targets from a frozen tokenizer to anchor robust semantics. Extensive experiments across ten downstream datasets spanning emotion recognition, motor imagery, sleep staging, and seizure detection demonstrate that PATCHCODE achieves competitive performance compared to state-of-the-art baselines, with notable gains in data efficiency under limited labels. Our code is available at https://anonymous.4open.science/r/PATCHCODE-323D/.
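To make the dual-granularity objective concrete, the following is a minimal NumPy sketch, not the authors' implementation: all shapes, the nearest-neighbor tokenizer, and the loss weighting are illustrative assumptions. Masked patch positions receive (1) a regression loss toward their continuous representations and (2) a classification loss toward discrete codes produced by a frozen codebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: B patches with D-dim features, a frozen
# tokenizer codebook of K discrete codes (all sizes are illustrative).
B, D, K = 8, 16, 32
patches = rng.normal(size=(B, D))     # continuous EEG patch features
codebook = rng.normal(size=(K, D))    # frozen tokenizer codebook

# Tokenize: each patch's target code is its nearest codebook entry
# (a stand-in for the paper's region-aware discrete tokenizer).
dists = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
code_targets = dists.argmin(axis=1)   # (B,) discrete supervision targets

# Mask half the patches; the encoder must predict their representations.
mask = np.arange(B) % 2 == 0
pred = patches + 0.1 * rng.normal(size=(B, D))  # stand-in encoder output

# (1) Patch-level regression: preserve fine spatiotemporal structure.
reg_loss = ((pred[mask] - patches[mask]) ** 2).mean()

# (2) Code-level classification: align predictions with discrete codes.
logits = -((pred[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
m = logits.max(axis=1, keepdims=True)
log_probs = logits - m - np.log(np.exp(logits - m).sum(1, keepdims=True))
cls_loss = -log_probs[mask, code_targets[mask]].mean()

# Equal weighting here is an assumption, not the paper's setting.
total_loss = reg_loss + cls_loss
```

In this sketch the regression term anchors fine-grained structure while the classification term supervises against stable discrete semantics, mirroring the abstract's description of continuous inputs with discrete targets.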