Beyond Transformers: Neural Architectures Inspired by Dynamical Systems

Date
Feb 4, 2025, 12:00 pm – 1:30 pm
Location
Bendheim House 103

Event Description

Lunch is available beginning at 12:00 PM.

The speaker will begin promptly at 12:30 PM.

Abstract: Can we build neural architectures that go beyond Transformers by leveraging principles from dynamical systems? In this talk, I will introduce a novel approach to sequence modeling that draws inspiration from the emerging paradigm of online control to achieve efficient long-range memory, fast inference, and provable robustness.

At the core of this approach is a new method for learning linear dynamical systems through spectral filtering. This method eliminates the need for learned convolutional filters, remains invariant to system dimensionality, and offers strong theoretical guarantees—all while achieving state-of-the-art performance on long-range sequence tasks.
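To give a flavor of how such fixed (non-learned) filters can arise, the minimal NumPy sketch below follows the standard construction from the spectral filtering literature; the function names and parameter choices here are illustrative assumptions, not the speaker's implementation. The filters are the top eigenvectors of a fixed Hankel matrix that depends only on the sequence length, so nothing about them is learned and nothing depends on the hidden-state dimension of the underlying system.

```python
import numpy as np

def spectral_filters(seq_len: int, num_filters: int):
    """Top eigenvectors of the fixed Hankel matrix Z with
    Z[i, j] = 2 / ((i + j)^3 - (i + j)) for 1-indexed i, j.
    The filters are data-independent: they depend only on seq_len."""
    idx = np.arange(1, seq_len + 1)
    s = idx[:, None] + idx[None, :]           # i + j
    Z = 2.0 / (s**3 - s)
    eigvals, eigvecs = np.linalg.eigh(Z)      # eigenvalues in ascending order
    top = np.argsort(eigvals)[-num_filters:]
    return eigvals[top], eigvecs[:, top]      # shapes (k,) and (seq_len, k)

def spectral_features(u: np.ndarray, num_filters: int = 16):
    """Featurize a 1-D input sequence u: each feature column is the
    (eigenvalue**0.25)-scaled causal convolution of u with one fixed filter.
    A simple linear map on these features is then all that is learned."""
    T = len(u)
    vals, phis = spectral_filters(T, num_filters)
    feats = np.empty((T, num_filters))
    for k in range(num_filters):
        conv = np.convolve(u, phis[:, k])[:T]  # keep only the causal part
        feats[:, k] = (vals[k] ** 0.25) * conv
    return feats

# Example: spectral features for a noisy step input of length 256.
u = np.concatenate([np.zeros(128), np.ones(128)]) + 0.01 * np.random.randn(256)
X = spectral_features(u, num_filters=16)
print(X.shape)  # (256, 16)
```

Because the filters are fixed, learning reduces to fitting a linear read-out on top of these features, which in that line of work is the source of both the dimension-invariance and the convexity underlying the theoretical guarantees.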

I will present theoretical insights, empirical results on both synthetic and real-world benchmarks, and recent advancements in fast sequence generation and provable length generalization. The talk will be self-contained and accessible to researchers across STEM disciplines—no prior background in control theory or sequence prediction is required.

Contributions to and/or sponsorship of any event do not constitute departmental or institutional endorsement of the specific program, speakers, or views presented.

Sponsors
Center for Statistics and Machine Learning
AI for Accelerating Invention