High-value payments systems (HVPSs) settle transactions between large financial institutions and are considered core national financial infrastructure. In collaboration with the Bank of Canada, we have been exploring the use of reinforcement learning techniques to understand the behaviour of banks participating in the Canadian HVPS. This understanding could help regulators design policies that ensure the safety and efficiency of these systems.