We propose a family of optimization methods that achieve linear convergence using first-order gradient information and constant step sizes on a class of convex functions much larger than the smooth and strongly convex ones. This larger class includes functions whose second derivatives may be singular or unbounded at their minima. Our methods are discretizations of conformal Hamiltonian dynamics, which generalize the classical momentum method to model the motion of a particle with non-standard kinetic energy exposed to a dissipative force and the gradient field of the function of interest. They are first-order in the sense that they require only gradient computation. Yet, crucially the kinetic gradient map can be designed to incorporate weak information about the convex conjugate in a fashion that allows for linear convergence on convex functions that may be non-smooth or non-strongly convex. We study in detail one implicit and two explicit methods. For one explicit method, we provide conditions under which it converges to stationary points of non-convex functions. For all, we provide conditions on the convex function and kinetic energy pair that guarantee linear convergence, and show that these conditions can be satisfied by functions with power growth. In sum, these methods expand the class of convex functions on which linear convergence is possible with first-order computation.
Chris Maddison is a PhD student in the Statistical Machine Learning Group in the Department of Statistics at the University of Oxford supervised by Yee Whye Teh and Arnaud Doucet. He is an Open Philanthropy AI Fellow and spends two days a week as a Research Scientist at DeepMind. Chris is interested in the tools used for inference and optimization in scalable and expressive models. He aims to expand the range of such models by expanding the toolbox needed to work with them. Chris received his MSc. from the University of Toronto, working with Geoffrey Hinton. He received a NIPS Best Paper Award in 2014 and was one of the founding members of the AlphaGo project.