Differentiable Weightless Controllers: Learning Logic Circuits for Continuous Control
Abstract
Controlling autonomous systems under real-world conditions often requires policies that can be evaluated with low latency and low energy requirements. Unfortunately, these conditions are at odds with the use of high-precision deep networks as controllers. In this work, we introduce Differentiable Weightless Controllers (DWCs), a symbolic-differentiable architecture that allows learning flexible non-linear yet highly efficient control policies. DWCs can be trained end-to-end by gradient-based techniques, yet compile directly into FPGA-compatible circuits with few- or even single-clock-cycle latency and nanojoule-level energy cost per action for the core computation. Across five MuJoCo benchmarks, including high-dimensional Humanoid, DWCs achieve returns competitive with standard deep policies (full precision or quantized neural networks). Furthermore, DWCs exhibit structurally sparse and interpretable connectivity patterns, enabling a direct inspection of which input values influence control decisions.