Publications

Stabilizing LTI Systems under Partial Observability: Sample Complexity and Fundamental Limits

Published in NeurIPS, 2025

We study the problem of stabilizing an unknown partially observable linear time-invariant (LTI) system. For fully observable systems, the state-of-the-art sample complexity, obtained through an unstable/stable subspace decomposition, is independent of the system dimension n and scales only with the dimension of the unstable subspace. However, it remains open whether such sample complexity can be achieved for partially observable systems, because such systems do not admit a uniquely identifiable unstable subspace. In this paper, we propose LTS-P, a novel technique that applies a compressed singular value decomposition (SVD) to the "lifted" Hankel matrix to estimate the unstable subsystem up to an unknown transformation. We then design a stabilizing controller that combines a robust stabilizing controller for the unstable modes with a small-gain-type assumption on the stable subspace. We show that LTS-P stabilizes unknown partially observable LTI systems with state-of-the-art sample complexity that is dimension-free and scales only with the number of unstable modes, which significantly reduces data requirements for high-dimensional systems with many stable modes.
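
For readers unfamiliar with Hankel-based identification, the block below recalls the textbook Hankel matrix of an LTI system and why input-output data determines a realization only up to a change of coordinates. It is a standard fact stated for orientation, not the "lifted" construction of the paper; the block sizes p and q and the symbols O_p, Q_q, and T are notation assumed here.

```latex
% Standard (non-lifted) Hankel matrix of x_{t+1} = A x_t + B u_t, y_t = C x_t.
\[
H_{p,q} =
\begin{bmatrix}
CB & CAB & \cdots & CA^{q-1}B \\
CAB & CA^{2}B & \cdots & CA^{q}B \\
\vdots & \vdots & \ddots & \vdots \\
CA^{p-1}B & CA^{p}B & \cdots & CA^{p+q-2}B
\end{bmatrix}
= O_p \, Q_q ,
\qquad
O_p = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{p-1} \end{bmatrix},
\quad
Q_q = \begin{bmatrix} B & AB & \cdots & A^{q-1}B \end{bmatrix}.
\]
% The factorization is not unique: for any invertible T,
\[
H_{p,q} = (O_p T)\,(T^{-1} Q_q),
\]
% which corresponds to the equivalent realization (T^{-1} A T,\; T^{-1} B,\; C T).
```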

Download here

Predictive Control and Regret Analysis of Non-Stationary MDP with Look-ahead Information

Published in TMLR, 2025

Policy design in non-stationary Markov Decision Processes (MDPs) is inherently challenging because time-varying transitions and rewards make it difficult for learners to determine the optimal actions for maximizing cumulative future rewards. Fortunately, in many practical applications, such as energy systems, look-ahead predictions are available, including forecasts for renewable energy generation and demand. In this paper, we leverage these look-ahead predictions and propose an algorithm that achieves low regret in non-stationary MDPs. Our theoretical analysis demonstrates that, under certain assumptions, the regret decreases exponentially as the look-ahead window expands. When the system prediction is subject to error, the regret does not explode even if the prediction error grows sub-exponentially as a function of the prediction horizon. We validate our approach through simulations, confirming the efficacy of our algorithm in non-stationary environments.
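
To make the role of the look-ahead window concrete, here is a minimal sketch of receding-horizon planning on a toy MDP with time-varying rewards. It is not the algorithm from the paper: the function name lookahead_action, the deterministic transition table, and the reward numbers are all assumptions chosen for illustration, and the value beyond the window is simply taken to be zero.

```python
def lookahead_action(state, t, k, next_state, rewards):
    """Return the first action of a plan that is optimal over the next k steps.

    next_state[s][a] is the (deterministic, hypothetical) successor state.
    rewards[t][s][a] is the predicted reward at time t in state s for action a.
    """
    horizon = min(k, len(rewards) - t)   # do not plan past the available forecasts
    n_states = len(next_state)
    n_actions = len(next_state[0])
    value = [0.0] * n_states             # assumption: zero value beyond the window
    best = [0] * n_states
    # Backward induction over the window, from its last step down to time t.
    for step in range(t + horizon - 1, t - 1, -1):
        new_value = [0.0] * n_states
        for s in range(n_states):
            q = [rewards[step][s][a] + value[next_state[s][a]]
                 for a in range(n_actions)]
            best[s] = max(range(n_actions), key=q.__getitem__)
            new_value[s] = q[best[s]]
        value = new_value
    return best[state]


# Toy instance: action a moves the system to state a.
next_state = [[0, 1], [0, 1]]
rewards = [
    [[1.0, 0.0], [0.0, 0.0]],  # t = 0: staying in state 0 pays 1 immediately
    [[0.0, 0.0], [5.0, 5.0]],  # t = 1: being in state 1 pays 5
]

myopic = lookahead_action(0, 0, 1, next_state, rewards)      # sees only t = 0
farsighted = lookahead_action(0, 0, 2, next_state, rewards)  # also sees t = 1
```

With a one-step window the planner takes the immediate reward, while a two-step window lets it give up that reward to reach the better-paying state, which is the kind of gain a longer look-ahead window is meant to capture.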

Download here

Fast Bandit-based Policy Adaptation in Diverse Environments

Published in ACC, 2025

Autonomous systems must be able to adapt quickly to various situations. However, adaptation methods often require strong assumptions about system structures, environmental homogeneity, and multiple rollouts. In this work, we integrate multi-armed bandits and model-based RL to design a fast adaptation algorithm on a single trajectory. Our approach achieves sublinear regret, and the performance guarantee does not require homogeneity of the environment. This regret bound is achieved using a novel prediction error metric that is minimized in the ground-truth MDP. To the best of our knowledge, all existing results with provable guarantees depend on the Bregman divergence between the optimal policies of the MDPs. We show by simulation that our algorithm performs well in puzzle navigation and quadcopter path-tracking.
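
As a rough illustration of treating candidate models as bandit arms on a single trajectory, the sketch below runs the standard UCB1 rule, swapped in here for illustration and not the algorithm from the paper. The three candidate models, their fixed prediction errors, and the reward defined as one minus the prediction error are all hypothetical.

```python
import math


def ucb1(reward_fn, n_arms, horizon):
    """Run UCB1 for `horizon` rounds and return how often each arm was pulled."""
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # pull every arm once to initialize its estimate
        else:
            # Empirical mean plus the usual UCB1 exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        counts[arm] += 1
        sums[arm] += reward_fn(arm)
    return counts


# Hypothetical per-step prediction errors of three candidate models;
# model 1 plays the role of the model closest to the ground truth.
prediction_error = [0.8, 0.1, 0.7]
counts = ucb1(lambda arm: 1.0 - prediction_error[arm], n_arms=3, horizon=2000)
```

Because the poorly predicting models are tried only a logarithmic number of times, almost all of the 2000 steps are spent on the model with the smallest prediction error.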

Download here

Sample Complexity of Stabilizing LTI Systems on a Single Trajectory under Stochastic Noise

Published in UAI, 2025

We study the problem of learning to stabilize unknown noisy linear time-invariant (LTI) systems on a single trajectory. It is well known in the literature that the learn-to-stabilize problem suffers from exponential blow-up, in which the state norm blows up exponentially in the state dimension. This blow-up is due to open-loop instability while exploring the n-dimensional state space. To address this issue, we develop a novel algorithm that decouples the unstable subspace of the LTI system from the stable subspace, so that the algorithm only explores and stabilizes the unstable subspace, the dimension of which can be much smaller than n. With a new singular-value-decomposition (SVD)-based analytical framework, we prove that the system is stabilized before the state norm exceeds a bound that is exponential only in the dimension of the unstable subspace, which is advantageous when the unstable subspace is small. Critically, this bound avoids the exponential blow-up in state dimension seen in previous works, and to the best of our knowledge, this is the first paper to avoid exponential blow-up in dimension for stabilizing LTI systems with noise.
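
The sketch below illustrates why acting on the unstable subspace alone can be enough. It simulates a noisy diagonal system with one unstable mode and five stable modes, with and without feedback on the unstable coordinate. Everything here is an assumption made for illustration: the eigenvalues, the noise level, the function name simulate, and above all the fact that the unstable coordinate is known in advance, which is exactly what a learning algorithm would have to estimate from data.

```python
import math
import random


def simulate(eigs, gains, steps, noise_std=0.1, seed=0):
    """Simulate x_{t+1} = (lam - g) * x_t + w_t per coordinate; return the final norm.

    eigs  : open-loop eigenvalues of a (hypothetical) diagonal system
    gains : feedback gain applied to each coordinate (u = -g * x)
    """
    rng = random.Random(seed)
    x = [1.0] * len(eigs)
    for _ in range(steps):
        x = [(lam - g) * xi + rng.gauss(0.0, noise_std)
             for lam, g, xi in zip(eigs, gains, x)]
    return math.sqrt(sum(xi * xi for xi in x))


eigs = [1.5] + [0.5] * 5                               # one unstable mode, five stable ones
open_loop = simulate(eigs, [0.0] * 6, 30)              # no control at all
closed_loop = simulate(eigs, [1.5] + [0.0] * 5, 30)    # feedback on the unstable mode only
```

Without control the state norm grows roughly like 1.5 to the power of the number of steps, whereas a single gain on the unstable coordinate keeps the whole six-dimensional state near the noise level, since the stable modes decay on their own.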

Download here

Polyhedra of small relative mixed volume

Published in Contributions to Algebra and Geometry, 2020

We classify all tuples of lattice polyhedra of relative mixed volume 1 and all minimal (by inclusion) tuples of polyhedra of relative mixed volume 2. We also prove a conjecture by Esterov, which states that all tuples with finite relative mixed volume are contained in one of finitely many ones that are minimal by inclusion.
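
For orientation, the sketch below computes the classical mixed volume of two lattice polygons in the plane from the identity MV(P, Q) = Area(P + Q) - Area(P) - Area(Q), where P + Q is the Minkowski sum. This is the classical notion, normalized so that MV(P, P) is twice the area of P, and not the relative mixed volume studied in the paper; the helper names are chosen here for illustration.

```python
def cross(o, a, b):
    """Cross product of the vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Vertices of the convex hull in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def twice_area(polygon):
    """Twice the area of a polygon (shoelace formula); an integer for lattice points."""
    shifted = polygon[1:] + polygon[:1]
    return abs(sum(x1 * y2 - x2 * y1
                   for (x1, y1), (x2, y2) in zip(polygon, shifted)))


def mixed_volume(p, q):
    """Classical mixed volume of two lattice polygons given as lists of (x, y) tuples."""
    hp, hq = convex_hull(p), convex_hull(q)
    # The Minkowski sum of convex polygons is the hull of all pairwise vertex sums.
    minkowski = convex_hull([(a[0] + b[0], a[1] + b[1]) for a in hp for b in hq])
    return (twice_area(minkowski) - twice_area(hp) - twice_area(hq)) // 2


triangle = [(0, 0), (1, 0), (0, 1)]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
mv_triangles = mixed_volume(triangle, triangle)
mv_squares = mixed_volume(square, square)
```

Two copies of the standard triangle give mixed volume 1 and two copies of the unit square give 2, so in this classical setting the pair of triangles is an example of mixed volume 1.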

Download here

Minimal flag triangulations of lower-dimensional manifolds

Published in Involve, a Journal of Mathematics, 2020

Download here

Ziyi Zhang
