We study learning algorithms for revenue optimization in repeated second-price auctions with reserve, where a seller interacts with multiple strategic bidders, each of whom holds a fixed private valuation for a good and seeks to maximize his expected future cumulative discounted surplus.
We propose a novel algorithm that has a strategic regret upper bound of $O(\log\log T)$ for worst-case valuations.
This pricing algorithm is based on a novel transformation that upgrades an algorithm designed for the single-buyer setup to the multi-buyer case.
We provide theoretical guarantees on the ability of a transformed algorithm to learn the valuation of a strategic buyer who is uncertain about the future due to the presence of rivals.