Global Convergence of Policy Gradient Methods for Output Feedback Linear Quadratic Control

Feiran Zhao, Xingyun Fu, Keyou You

Submitted on 8 November 2022


The optimization landscape of policy gradient methods has recently been investigated for partially observable linear systems, for both dynamical controllers and static output feedback, but existing results guarantee convergence only to stationary points. In this paper, we propose a new parameterization of the policy, which uses a past input-output trajectory of finite length as the feedback. We show that the solution set to the parameterized optimization problem is a matrix space, which is invariant to similarity transformation. By proving a gradient dominance property, we show the global convergence of policy gradient methods. Moreover, we observe that the gradient is orthogonal to the solution set, revealing an explicit relation between the resulting solution and the initial policy. Finally, we perform simulations to validate our theoretical results.
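
As an illustration only, a policy that feeds back a finite-length past input-output trajectory can be sketched as a linear map applied to the stacked history. The abstract does not give the exact form, so the function name `history_feedback`, the stacking order (outputs first, then inputs), and all dimensions below are assumptions rather than the authors' construction.

```python
import numpy as np


def history_feedback(K, past_outputs, past_inputs):
    """Return u = K z, where z stacks the last p outputs and the last p inputs.

    Assumption: outputs are stacked before inputs; the paper may order them differently.
    """
    z = np.concatenate([np.ravel(past_outputs), np.ravel(past_inputs)])
    return K @ z


# Hypothetical sizes: history length p, input dimension m, output dimension d.
p, m, d = 3, 1, 2
rng = np.random.default_rng(0)

K = rng.standard_normal((m, p * (d + m)))  # policy parameter, one row per input channel
ys = rng.standard_normal((p, d))           # past outputs y_{t-p}, ..., y_{t-1}
us = rng.standard_normal((p, m))           # past inputs  u_{t-p}, ..., u_{t-1}

u = history_feedback(K, ys, us)            # control input at time t, shape (m,)
```

Under this sketch, a policy gradient method would optimize the entries of `K` directly, which is the sense in which the policy is parameterized by a single matrix.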


Comment: Submitted to IFAC World Congress 2023

Subjects: Mathematics - Optimization and Control; Electrical Engineering and Systems Science - Systems and Control