Policy Optimization in Control & Filtering

Policy optimization, the PL condition, and constellations: ACC 2026 plenary (in preparation)

Constellations and Synchronization

M. Mesbahi and M. Egerstedt, Graph-theoretic Methods in Multi-agent Networks, Princeton University Press, 2010.
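The consensus dynamics central to this monograph can be illustrated in a few lines of NumPy: states evolve by x_dot = -Lx, where L is the graph Laplacian, and converge to the average of the initial conditions on a connected graph. A minimal sketch (the path graph, initial states, and step size are arbitrary illustrative choices):

```python
import numpy as np

# Path graph on 4 nodes: Laplacian L = D - A (degree matrix minus adjacency)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

x = np.array([1.0, -2.0, 4.0, 0.0])   # initial node states (average = 0.75)
dt = 0.1                               # Euler step; stable since dt < 2 / lambda_max(L)
for _ in range(500):
    x = x - dt * (L @ x)               # discretized consensus dynamics x_dot = -L x

print(x)                               # all entries approach the initial average
```

Since the all-ones vector spans the kernel of L for a connected graph, the average of x is invariant along the flow and every state converges to it.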

PL condition

B. T. Polyak, Gradient Methods for the Minimization of Functionals, USSR Computational Mathematics and Mathematical Physics, 3 (4): 864–78, 1963.

Hamed Karimi, Julie Nutini, and Mark Schmidt, Linear Convergence of Gradient and Proximal-Gradient Methods under the Polyak–Łojasiewicz Condition, arXiv. https://doi.org/10.48550/arXiv.1608.04636, 2016.

Guoyin Li and Ting Kei Pong, Calculus of the Exponent of Kurdyka–Łojasiewicz Inequality and Its Applications to Linear Convergence of First-Order Methods, arXiv. https://doi.org/10.48550/arXiv.1602.02915, 2016.
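A concrete instance of the PL condition from the Karimi–Nutini–Schmidt paper: least squares with a rank-deficient matrix is not strongly convex (its minimizers form a whole affine subspace), yet it satisfies PL with constant equal to the smallest nonzero squared singular value, so gradient descent still converges linearly in function value. A minimal NumPy sketch (dimensions and random data are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
# A = G1 @ G2 has rank <= 3 while acting on 8 variables: PL holds, strong convexity fails
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 8))
b = rng.standard_normal(5)

sig = np.linalg.svd(A, compute_uv=False)
Lsm = sig[0] ** 2                       # smoothness constant of f
mu = sig[sig > 1e-10][-1] ** 2          # PL constant: smallest NONZERO singular value^2

f = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
fstar = f(np.linalg.lstsq(A, b, rcond=None)[0])   # optimal value via least squares

x = np.zeros(8)
gaps = []
for _ in range(200):
    x = x - (1.0 / Lsm) * (A.T @ (A @ x - b))     # gradient step with step size 1/L
    gaps.append(f(x) - fstar)

# PL theory predicts the geometric rate f(x_k) - f* <= (1 - mu/L)^k (f(x_0) - f*)
print(gaps[-1])
```

The same argument is why the PL framework transfers to policy optimization: gradient dominance gives linear rates without convexity of the objective.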

Policy Optimization and Filtering

Shahriar Talebi, Amirhossein Taghvaei, and Mehran Mesbahi, Data-driven Optimal Filtering for Linear Systems with Unknown Noise Covariances, NeurIPS 2023.

Policy Optimization and LQR

Jingjing Bu, Afshin Mesbahi, and Mehran Mesbahi. LQR via First Order Flows, American Control Conference, 2020.

Jingjing Bu, Afshin Mesbahi, and Mehran Mesbahi. On Topological and Metrical Properties of Stabilizing Feedback Gains: The MIMO Case, arXiv. http://arxiv.org/abs/1904.02737, 2019.

Jingjing Bu, Afshin Mesbahi, and Mehran Mesbahi. Policy Gradient-Based Algorithms for Continuous-Time Linear Quadratic Control, arXiv. http://arxiv.org/abs/2006.09178, 2020.

H. T. Toivonen, A globally convergent algorithm for the optimal constant output feedback problem, International Journal of Control, 1985. https://www.tandfonline.com/doi/abs/10.1080/0020718508961217

Jingjing Bu, Afshin Mesbahi, Maryam Fazel, and Mehran Mesbahi. LQR through the Lens of First Order Methods: Discrete-Time Case, arXiv. http://arxiv.org/abs/1907.08921, 2019.
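The discrete-time LQR papers above run gradient descent directly on the feedback gain K, using the cost C(K) = trace(P_K Sigma0) and the gradient formula grad C(K) = 2[(R + B'P_K B)K - B'P_K A] Sigma_K standard in this line of work. A small NumPy/SciPy sketch (the system matrices, step size, and iteration count are illustrative choices, not taken from any of the papers):

```python
import numpy as np
from scipy.linalg import solve_discrete_are, solve_discrete_lyapunov

# Illustrative discrete-time system; K = 0 is stabilizing (spectral radius of A is 0.9)
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)
Sigma0 = np.eye(2)   # second moment of the random initial state

def cost_and_grad(K):
    Acl = A - B @ K
    # Value matrix: P = Acl' P Acl + Q + K' R K
    P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)
    # Accumulated state correlation: Sig = Acl Sig Acl' + Sigma0
    Sig = solve_discrete_lyapunov(Acl, Sigma0)
    E = (R + B.T @ P @ B) @ K - B.T @ P @ A
    return np.trace(P @ Sigma0), 2.0 * E @ Sig

K = np.zeros((1, 2))          # stabilizing initial gain
for _ in range(10000):
    _, g = cost_and_grad(K)
    K = K - 2e-3 * g          # plain policy gradient step on the LQR cost

# Riccati baseline for comparison
P_star = solve_discrete_are(A, B, Q, R)
K_star = np.linalg.solve(R + B.T @ P_star @ B, B.T @ P_star @ A)
print(K, K_star)
```

The iterates stay inside the set of stabilizing gains for a small enough step, and by gradient dominance of the LQR cost the gain converges to the Riccati solution.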

Policy optimization for output feedback and distributed control

William S. Levine, Optimal Output-Feedback Controllers for Linear Systems, https://ntrs.nasa.gov/api/citations/19690011727/downloads/19690011727.pdf, 1969.

Dennis Bernstein, The Optimal Projection Equations for Static and Dynamic Output Feedback: The Singular Case. IEEE Transactions on Automatic Control 32 (12): 1139–43, 1987.

K. Mårtensson, and A. Rantzer. Gradient Methods for Iterative Distributed Control Synthesis. IEEE Conference on Decision and Control, 2009.

M. Mesbahi, M. G. Safonov, and G. P. Papavassilopoulos, Bilinearity and complementarity in robust control, in Advances in Linear Matrix Inequality Methods in Control, SIAM, 2000.

M. Mesbahi and G. P. Papavassilopoulos, Least elements and minimal rank matrices, in Recent Advances in Complementarity Theory - State of the Art, SIAM, 1997.

Riemannian Policy Optimization

Shahriar Talebi, Constrained Policy Synthesis: Riemannian Flows, Online Regulation, and Distributed Games, PhD Dissertation, University of Washington, 2023.

Shahriar Talebi and Mehran Mesbahi, Riemannian Constrained Policy Optimization via Geometric Stability Certificates. IEEE Conference on Decision and Control, 2022.

Shahriar Talebi and Mehran Mesbahi, Policy Optimization over Submanifolds for Linearly Constrained Feedback Synthesis, IEEE Transactions on Automatic Control 69 (5): 3024–39, 2024.

Shahriar Talebi, Yang Zheng, Spencer Kraisler, Na Li, and Mehran Mesbahi, Policy Optimization in Control: Geometry and Algorithmic Implications, arXiv. http://arxiv.org/abs/2406.04243, 2024.

Dynamic Policy Optimization

Policy optimization and games

Jingjing Bu, Lillian J. Ratliff, and Mehran Mesbahi. Global Convergence of Policy Gradient for Sequential Zero-Sum Linear Quadratic Dynamic Games, arXiv. http://arxiv.org/abs/1911.04672, 2019.

Some newer works related to the ACC 2026 plenary