arXiv:2110.03906v5 Announce Type: replace-cross Abstract: The convergence properties of learning dynamics in repeated auctions is a timely and important question, with numerous applications in, e.g., online advertising markets. This work focuses on repeated first-price auctions where bidders with fixed values learn to bid using mean-based algorithms -- a large class of online learning algorithms that include popular no-regret algorithms such as Multiplicative Weights Update and Follow the Perturbed Leader. We completely characterize the learning dynamics of mean-based algorithms, under two notions of convergence: (1) time-average: the fraction of rounds where bidders play a Nash equilibrium converges to 1; (2) last-iterate: the mixed strategy profile of bidders converges to a Nash equilibrium. Specifically, the results depend on the number of bidders with the highest value: - If the number is at least three, the dynamics almost surely converges to a Nash equilibrium of the auction, in both time-average and last-iterate. - If the number is two, the dynamics almost surely converges to a Nash equilibrium in time-average but not necessarily last-iterate. - If the number is one, the dynamics may not converge to a Nash equilibrium in time-average or last-iterate. Our discovery opens up new possibilities in the study of the convergence of learning dynamics.
