An offering of the JRZ ISIA
The Reading Group has the character of an informal specialist seminar, with talks of 30 to 45 minutes followed by discussion. In a friendly atmosphere and with near-complete freedom of topic, participants present and discuss, for example, research results, scientific or technological overview talks, or the framing of a research question. The focus is on exchange and discussion. Please direct organizational questions, talk registrations, or talk proposals to Stefan Huber.
Dates 2026
June 24, 2026 | 15:15 | HS 154 | Stefan Huber (Head of Research Department IT)
Chebyshev Policies for Low-Dimensional Reinforcement Learning
Abstract: Reinforcement Learning (RL) is one of the powerhouses of machine learning for all sorts of sequential decision-making problems, from robot control to tuning large language models. At the same time, we face significant challenges in real-world applications (e.g., sample efficiency, interpretability) and significant gaps in theory (e.g., lack of optimal solutions to standard benchmarks). In this talk, we first solve the Mountain Car problem to optimality, closing a gap after 36 years, with two interesting insights: the optimal strategy is very simple, yet the best MLP-based agents perform far from optimality. This motivated us to reconsider the mathematical architecture of RL agents, formulate a couple of first principles, and devise a new policy space based on a multivariate generalization of Chebyshev policies. It turns out that our Chebyshev policies significantly improve upon every neural policy on every task we evaluated, while using two orders of magnitude fewer parameters and improving sample efficiency, interpretability, and more. We filed a patent with ABB/B&R based on this work concerning energy-optimized servo control. This work will be presented as an oral contribution at ICML 2026.
Room: HS 154, 15:15
If you want to join, please send an email to Stefan Huber.