Speaker Details

Lunch is available beginning at 12 PM.
The speaker will begin promptly at 12:30 PM.
Abstract: Generative AI has the potential to transform the efficiency of critical processes across broad domains, from healthcare to education to logistics. However, such transformations require holistic integration of AI systems within complex workflows characterized by a range of critical constraints. The extremely high upfront cost and complexity of running such AI systems on today's computing architectures is a key barrier to the progressive, iterative understanding and pursuit of this integration, and thus to broad adoption of AI in practical use cases. This talk begins by taking a fundamental approach to understanding the complexity of running state-of-the-art generative AI workloads, focusing on how their computing requirements differ from those of past workloads and how today's architectures are addressing them. It then outlines major emerging trends, at the fundamental technology level, for addressing critical components of computing architectures, including processor, memory, and network, in response to the needs of current and future generative AI workloads. In doing so, the talk dives into the promising approach of in-memory computing (IMC), examining how traditional hardware trade-offs can be altered and what this means for future systems. Finally, the talk looks at major trends in generative AI itself, notably the dramatic reduction in model size, projecting how these can reshape the challenges and opportunities for addressing computing constraints.
Contributions to and/or sponsorship of any event do not constitute departmental or institutional endorsement of the specific program, speakers, or views presented.