Keynote Speakers

Jehee Lee
Seoul National University, South Korea
Jehee Lee is a Professor in the Department of Computer Science and Engineering at Seoul National University. His research centers on understanding, simulating, planning, and synthesizing the motion of humans and animals. He is internationally recognized for his pioneering work in modeling and simulating the human musculoskeletal system. He has served in key roles for premier conferences, including Technical Papers Chair for SIGGRAPH Asia 2022 and Test-of-Time Award Chair for ACM SIGGRAPH 2023.
Title: Generative GaitNet and Beyond: Foundational Models for Human Motion Analysis and Simulation
Abstract: Understanding the relationship between human anatomy and motion is fundamental to effective gait analysis, realistic motion simulation, and the creation of human body digital twins. We will begin with Generative GaitNet (SIGGRAPH 2022), a foundational model for human gait that drives a comprehensive full-body musculoskeletal system comprising 304 Hill-type musculotendons. Generative GaitNet is a pre-trained, integrated system of artificial neural networks that operates in a 618-dimensional continuous space defined by anatomical factors (e.g., mass distribution, body proportions, bone deformities, and muscle deficits) and gait parameters (e.g., stride and cadence). Given specific anatomy and gait conditions, the model generates corresponding gait cycles via real-time physics-based simulation. Next, we will discuss Bidirectional GaitNet (SIGGRAPH 2023), which consists of forward and backward models. The forward model predicts the gait pattern of an individual based on their physical characteristics, while the backward model infers physical conditions from observed gait patterns. Finally, we will present MAGNET (Muscle Activation Generation Networks)—another foundational model (SIGGRAPH 2025)—designed to reconstruct full-body muscle activations across a wide range of human motions. We will demonstrate its ability to accurately predict muscle activations from motions captured in video footage. We will conclude by discussing how these foundational models collectively contribute to the development of human body digital twins, and explore their future potential in personalized rehabilitation, surgery planning, and human-centered simulation.

Gerard Pons-Moll
University of Tübingen, Germany
Gerard Pons-Moll is a Professor of Computer Science at the University of Tübingen, holding a chair endowed by the Carl Zeiss Foundation. He is also core faculty at the Tübingen AI Center, a senior researcher at the Max Planck Institute for Informatics (MPII) in Saarbrücken, Germany, and faculty at the IMPRS-IS (International Max Planck Research School for Intelligent Systems) in Tübingen. His research lies at the intersection of computer vision, computer graphics, and machine learning, with a special focus on analyzing people in videos and creating virtual human models by "looking" at real ones. His research has produced some of the most advanced statistical human body models of pose, shape, soft tissue, and clothing (currently used in a number of applications in industry and research), as well as algorithms to track and reconstruct 3D people models from images, video, depth, and IMUs.
His work has received several awards, including the prestigious Emmy Noether Grant (2018), Google Faculty Research Awards (2019, 2024), Facebook Reality Labs Faculty Awards (2018, 2024), and the German Pattern Recognition Award (2019), which is given annually by the German Pattern Recognition Society to one outstanding researcher in the fields of computer vision and machine learning. His work has received Best Paper Awards at BMVC’13, Eurographics’17, 3DV’18, 3DV’22, CVPR’20, and ECCV’22, and has been published at top venues and journals including CVPR, ICCV, SIGGRAPH, Eurographics, 3DV, IJCV, and PAMI. He regularly serves as an area chair for the major conferences in learning and vision and is an associate editor of PAMI.
Title: How to Train Large-Scale 3D Human and Object Foundation Models
Abstract: Understanding 3D humans interacting with the world has been a long-standing goal in AI and computer vision for decades. The lack of 3D data has been the major barrier to progress. This is changing with the growing number of 3D datasets featuring images, videos, and multi-view captures with 3D annotations, as well as with large-scale image foundation models. However, learning models from such sources is non-trivial. Some of the challenges are: 1) datasets are annotated with different 3D skeleton formats and outputs, and 2) image foundation models are 2D, and extracting 3D information from them is hard. I will present solutions to each of these two challenges. I will introduce a universal training procedure that can consume any skeleton format, a diffusion-based method tailored to lifting foundation models to 3D (for humans and also general objects), and a mechanism to probe the geometry and texture awareness of 3D foundation model features based on 3D Gaussian splatting reconstruction. I will also show a method to systematically create 3D human benchmarks on demand for evaluation (STAGE).

Xubo Yang
Shanghai Jiao Tong University, China
Xubo Yang is a Professor of Virtual/Augmented Reality and Computer Graphics at the School of Software, Shanghai Jiao Tong University, where he leads the Digital ART (Augmented Reality Tech) Laboratory. He received his Ph.D. (1998) in Computer Graphics from the State Key Lab of CAD & CG at Zhejiang University. From 1998 to 2001, he was a research scientist in the Virtual Environment Group of the Fraunhofer Institute for Media Communication in Germany. From 2001 to 2003, he worked as a research fellow at the Mixed Reality Lab of the National University of Singapore. From 2012 to 2013, he was a visiting professor in the Department of Computer Science at the University of North Carolina at Chapel Hill. Xubo’s research interests include next-generation media art computing technologies in the context of virtual reality, augmented reality, computer graphics, and novel interactive techniques. He has published many peer-reviewed papers in the fields of computer graphics, virtual reality, augmented reality, and mixed reality.
Title: To be announced
Abstract: To be announced.