### 2018 Papers

Stochastic model-based minimization of weakly convex functions.
Damek Davis, Dmitriy Drusvyatskiy.
arXiv abs/1803.06523.
This work proves a converge rate of $$O(k^{-1/4})$$ for a wide class of algorithms that are based on sampling one sided models of the function. Primary examples are the stochastic subgradient method and the stochastic prox-linear algorithm for minimizing a composition of a convex function with a smooth map.

Stochastic subgradient method converges on tame functions.
Damek Davis, Dmitriy Drusvyatskiy, Sham Kakade, Jason D. Lee.
arxiv abs/1804.07795.
We prove that the stochastic subgradient method, on any Whitney stratifiable'' locally Lipschitz function, produces limit points that are all first-order stationary. The class of Whitney stratifiable functions is virtually exhaustive in data science, and in particular, includes all popular deep learning architectures.

Global Convergence of Policy Gradient Methods for Linearized Control Problems.
Maryam Fazel, Rong Ge, Sham M. Kakade, Mehran Mesbahi.
arXiv abs/1801.05039.
This work shows that (model free) policy gradient methods globally converge to the optimal solution and are efficient with regards to their sample and computational complexities.

Stochastic subgradient method converges at the rate $$O(k^{−1/4})$$ on weakly convex functions.
Damek Davis, Dmitriy Drusvyatskiy.
arXiv abs/1802.02988.
This work proves that the proximal stochastic subgradient method converges at a rate $$O(k^{-1/4})$$ on weakly convex problems. In particular, it resolves the long-standing open question on the rate of convergence of the proximal stochastic gradient method (without batching) for minimizing a sum of a smooth function and a proximable convex function.

Competitive Online Algorithms for Resource Allocation over the Positive Semidefinite Cone.
Reza Eghbali, James Saunderson, Maryam Fazel.
arXiv abs/1802.01312.
Competitive Online Algorithms for Resource Allocation over the Positive Semidefinite Cone

An optimal first order method based on optimal quadratic averaging.
Dmitriy Drusvyatskiy, Maryam Fazel, Scott Roy.
arXiv abs/1604.06543.
An optimal first order method based on optimal quadratic averaging

Variational Gram Functions- Convex Analysis and Optimization.
Amin Jalali, Maryam Fazel, Lin Xiao.
arXiv abs/1507.04734.
This paper considers a new class of convex penalty functions, called variational Gram functions (VGFs), that can promote pairwise relations, such as orthogonality, among a set of vectors in a vector space.

### 2017 Survey

The Many Faces of Degeneracy in Conic Optimization.
Dmitriy Drusvyatskiy, Henry Wolkowicz.
Foundations and Trends in Optimization, Vol. 3, No. 2, pp 77-170, 2017..
This work surveys the roles of Slater's condition and the facial reduction technique in conic optimization.

The proximal point method revisited.
Dmitriy Drusvyatskiy.
Survey submitted to SIAG/OPT Views and News.
This short survey revisits the role of the proximal point method in large scale optimization, focusing on three recent examples: a proximally guided stochastic subgradient method, the prox-linear algorithm, and Catalyst generic acceleration.

### 2017 Papers

Catalyst Acceleration for First-order Convex Optimization: from Theory to Practice.
Hongzhou Lin, Julien Mairal, Zaid Harchaoui.
arXiv abs/1712.05654.
The paper presents the comprehensive theory and practice of the Catalyst acceleration scheme for accelerating gradient-based convex optimization methods.
Catalyst applies to a large class of optimization algorithms, including gradient descent, block coordinate descent, incremental algorithms such as SAG, SAGA, SDCA, SVRG, Finito/MISO, and their proximal variants.

Invariances and Data Augmentation for Supervised Music Transcription.
John Thickstun, Zaid Harchaoui, Dean Foster, Sham M. Kakade.
arXiv abs/1711.04845.
This paper presents the winning entry to the 2017 MIREX Multiple Fundamental Frequency Estimation evaluation. The method is based on a translation-invariant convolutional architecture.

The nonsmooth landscape of phase retrieval.
Damek Davis, Dmitriy Drusvyatskiy, Courtney Paquette.
arXiv abs/1711.03247.
This paper shows that the standard Polyak subgradient method converges linearly for the phase retrieval problem, when initialized within a constant relative distance of the underlying signal. The second part of the paper analyzes concentration of stationary points of the subsampled objective.

An homotopy method for $$\ell_p$$ regression provably beyond self-concordance and in input-sparsity time.
Sébastien Bubeck, Michael B. Cohen, Yin Tat Lee, Yuanzhi Li.
arXiv abs/1711.01328.
Solving two long open problems about $\ell_p$ ball:
1. The existence of input sparsity time algorithm for lp regression.
2. The inexistence of good self-concordance barrier.

k-server via multiscale entropic regularization.
Sébastien Bubeck, Michael B. Cohen, James R. Lee, Yin Tat Lee, Aleksander Madry.
arXiv abs/1711.01085. [Blog]
A breakthough in the k-server problem. The first o(k)-competitive algorithm for hierarchically separated trees.

Convergence Rate of Riemannian Hamiltonian Monte Carlo and Faster Polytope Volume Computation.
Yin Tat Lee, Santosh S. Vempala.
arXiv abs/1710.06261.
A quadratic improvement of polytope volume computation!

The Paulsen Problem, Continuous Operator Scaling, and Smoothed Analysis.
Tsz Chiu Kwok, Lap Chi Lau, Yin Tat Lee, Akshay Ramachandran.
arXiv abs/1710.02587.
Resolving the Paulsen problem - a basic open problem in frame theory with applications from signal processing, communication theory and theoretical computer science.

Catalyst Acceleration for Gradient-Based Non-Convex Optimization.
Hongzhou Lin, Julien Mairal, Zaid Harchaoui.
arXiv abs/1703.10993.
The paper presents the extension of the Catalyst acceleration scheme for gradient non-convex optimization. When the objective is convex, 4WD-Catalyst enjoys the same properties as the regular Catalyst. When the objective is nonconvex, it achieves the best known convergence rate to stationary points for first-order methods.