Active Projects

Home-based Stroke Physical Rehabilitation in Augmented Reality

In collaboration with Stanford and UC San Francisco Medical Centers, I am leading projects that develop medical augmented reality (AR) apps to increase the accessibility of care for stroke patients, who often face long waitlists to be seen by physical and occupational therapists or have difficulty commuting to clinics. In particular, I develop personalized physical training algorithms in AR that adapt the curriculum and corrective feedback to each stroke patient. The research agenda spans developing (i) formal specifications to evaluate patients' motor functions, (ii) automatic analysis of these evaluations to provide intuitive visual and verbal feedback that corrects patients' movements, and (iii) algorithmic online adaptation of the curriculum of motor training tasks according to the evaluation results. Realistic variations of motor training tasks are formally modeled as probabilistic programs using SCENIC, where each SCENIC program represents a distribution over variations of the same motor task. Personalizing the curriculum therefore involves not only selecting which motor task (i.e., which probabilistic program) to train with, but also which variation to sample within that program. Our research entails carefully designed user studies to understand the strengths and limitations of our algorithms. We recently completed a pilot clinical study with stroke patients, taking the first step toward this research agenda.

  • Jose Lima, Yuri Cho, Julie Muccini, Edward Kim, Alan Gallegos, Alton Sturgis, James Hu, Cathy Zhang, Nick Perlich, Sanjit Seshia, Maarten Lansberg, “Modified Fugl-Meyer Assessment in Augmented Reality,” American Academy of Neurology (AAN), 2024 [To Appear]
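
To make the two-level curriculum choice concrete, below is a minimal Python sketch of online task selection and difficulty adjustment. The task names, difficulty scale, scoring range, and update rule are illustrative assumptions for this sketch, not the algorithm used in the clinical study; in the actual system each task is a SCENIC probabilistic program and a "variation" is a sample drawn from that program.

```python
import random

# Hypothetical motor tasks; in the real system each task is a SCENIC
# probabilistic program representing a distribution over task variations.
TASKS = {
    "reach_forward": {"difficulty": 0.3},
    "cup_to_shelf":  {"difficulty": 0.5},
    "wipe_table":    {"difficulty": 0.4},
}

def sample_variation(task_name, difficulty):
    """Stand-in for sampling one concrete variation from a SCENIC program.

    Here the 'variation' is just a target distance scaled by difficulty;
    the real programs encode far richer spatio-temporal structure.
    """
    lo, hi = 0.2, 0.6  # illustrative reach distance range in meters
    return {"task": task_name,
            "target_distance_m": lo + (hi - lo) * difficulty * random.random()}

def update_curriculum(tasks, task_name, score):
    """Adapt difficulty from an evaluation score in [0, 1].

    Assumption: raise difficulty after good performance, lower it otherwise.
    """
    d = tasks[task_name]["difficulty"]
    step = 0.05 if score >= 0.8 else -0.05
    tasks[task_name]["difficulty"] = min(1.0, max(0.1, d + step))

def next_exercise(tasks):
    """Pick the task whose current difficulty is closest to a 'productive struggle' zone."""
    target = 0.6  # illustrative sweet spot
    name = min(tasks, key=lambda t: abs(tasks[t]["difficulty"] - target))
    return sample_variation(name, tasks[name]["difficulty"])

if __name__ == "__main__":
    for _ in range(3):
        exercise = next_exercise(TASKS)
        print("prescribed:", exercise)
        simulated_score = random.random()  # placeholder for the AR motion evaluation
        update_curriculum(TASKS, exercise["task"], simulated_score)
```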

Motor Skill Training in Augmented Reality

In collaboration with Meta, I am leading a project that aims to create a platform for users to learn motor skills from experts. Platforms like YouTube, Coursera, and Khan Academy have democratized and personalized cognitive knowledge (e.g., math, economics, coding), but no such platform exists for motor skills. The objective of this research is to enable experts (e.g., soccer coaches, chefs) with no programming background to author AR tutorials via physical demonstrations and verbal explanations. We aim to develop an algorithm that learns how to solve a particular motor task from demonstrations and descriptions and symbolically encodes what it has learned as a probabilistic program. For example, a soccer coach wears an AR headset and demonstrates how to defend against opponent players, as shown in this demo video 1. To scalably teach motor skills to a large number of students, the synthesized probabilistic program controls the movements of a proxy avatar that tutors the skill in lieu of the expert (demo video 2). The research agenda spans developing (i) a symbolic learning algorithm that synthesizes a probabilistic program from physical demonstrations and verbal explanations, and (ii) a large language model (LLM) based multi-modal user interface that actively elicits the information needed from expert coaches to disambiguate among candidate programs produced by synthesis.
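
The sketch below illustrates, in plain Python, the flavor of this authoring pipeline: abstract a handful of coach demonstrations into parameter ranges (standing in for the synthesized probabilistic program), then sample avatar placements from those ranges during tutoring. The demonstration format, parameter names, and margins are assumptions made purely for illustration; the actual synthesis operates over much richer demonstrations and verbal explanations.

```python
import random

# Toy demonstrations: (defender_distance_m, angle_deg relative to the goal line).
# In the real project these would come from AR headset tracking of the coach.
DEMOS = [(1.8, 12.0), (2.1, -8.0), (1.9, 5.0), (2.3, -15.0)]

def synthesize_program(demos, margin=0.1):
    """Very rough stand-in for symbolic program synthesis: abstract the
    demonstrations into parameter ranges, the kind of structure a
    SCENIC-style probabilistic program would encode explicitly."""
    dists = [d for d, _ in demos]
    angles = [a for _, a in demos]
    return {
        "distance_m": (min(dists) - margin, max(dists) + margin),
        "angle_deg": (min(angles) - 5.0, max(angles) + 5.0),
    }

def sample_avatar_pose(program):
    """Sample one concrete avatar placement from the synthesized program,
    the way the proxy avatar would be driven during a tutoring session."""
    d = random.uniform(*program["distance_m"])
    a = random.uniform(*program["angle_deg"])
    return {"distance_m": round(d, 2), "angle_deg": round(a, 1)}

if __name__ == "__main__":
    program = synthesize_program(DEMOS)
    print("synthesized parameter ranges:", program)
    print("avatar pose for a student session:", sample_avatar_pose(program))
```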

Previous Projects

Programmatic Modeling and Generation of Real-time Strategic Soccer Environments for Reinforcement Learning

The capability of a reinforcement learning (RL) agent heavily depends on the diversity of the learning scenarios generated by the environment. Generating diverse, realistic scenarios is challenging for real-time strategy (RTS) environments. RTS environments are characterized by intelligent entities (non-RL agents) cooperating and competing with the RL agents over large state and action spaces and long time horizons, resulting in an infinite space of feasible, but not necessarily realistic, scenarios involving complex interactions among different RL and non-RL agents. Yet most existing simulators randomly generate environments from predefined settings/layouts, offering researchers limited flexibility and control over the environment dynamics to generate diverse, realistic scenarios on demand. To address this issue, we introduce, for the first time, the benefits of adopting an existing formal scenario specification language, SCENIC, to help researchers model and generate diverse scenarios in an RTS environment in a flexible, systematic, and programmatic manner.

  • Abdus Salam Azad*, Edward Kim*, Mark Wu, Kimin Lee, Ion Stoica, Pieter Abbeel, Alberto Sangiovanni-Vincentelli, Sanjit A. Seshia, “Programmatic Modeling and Generation of Real-time Strategic Soccer Environments for Reinforcement Learning,” Association for the Advancement of Artificial Intelligence (AAAI), 2022 [* indicates equal contributions]
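As a rough illustration of what programmatic scenario generation buys over fixed layouts, the Python sketch below samples a toy "counterattack" configuration under an explicit constraint (every scripted defender must start between the attacker and the goal). The field dimensions, roles, and constraint are illustrative assumptions; a SCENIC program states such constraints declaratively instead of via the hand-written rejection loop shown here.

```python
import random

FIELD_X, FIELD_Y = (0.0, 100.0), (0.0, 60.0)   # illustrative field bounds (meters)

def sample_point(x_rng, y_rng):
    return (random.uniform(*x_rng), random.uniform(*y_rng))

def sample_counterattack_scenario(n_defenders=2, max_tries=1000):
    """Programmatically sample a toy 'counterattack' layout: the RL-controlled
    attacker starts in its own half, and every scripted defender must stand
    between the attacker and the goal."""
    goal = (100.0, 30.0)
    for _ in range(max_tries):
        attacker = sample_point((10.0, 50.0), FIELD_Y)
        defenders = [sample_point((attacker[0], 95.0), FIELD_Y) for _ in range(n_defenders)]
        # Constraint: each defender is ahead of the attacker and within 20 m
        # of the goal's lateral position.
        if all(d[0] > attacker[0] and abs(d[1] - goal[1]) < 20.0 for d in defenders):
            return {"attacker": attacker, "defenders": defenders, "goal": goal}
    raise RuntimeError("could not satisfy the scenario constraints")

if __name__ == "__main__":
    print(sample_counterattack_scenario())
```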

Querying Labelled Time Series Sensor Data with Probabilistic Programs

Across academia and industry, terabytes of sensor data are being collected and labelled. While the size of the data matters, so do its contents. For example, if a self-driving car’s training dataset does not contain any unprotected left turn scenarios, this can pose a safety hazard. However, manually searching through terabytes of data for such scenarios is not scalable. This project aims to develop an algorithm for querying and retrieving time series sensor data (e.g., RGB, LiDAR, radar) that contains a scenario of interest. We formally model distributions of scenarios as probabilistic programs using SCENIC. Our algorithm queries the corresponding semantic labels to retrieve time series sensor data that matches the scenario, i.e., a SCENIC program. Previously, I developed a query algorithm that retrieves static (image) sensor data; I am now extending this prior work to time series data.

  • Edward Kim, Jay Shenoy, Sebastian Junges, Daniel Fremont, Alberto Sangiovanni-Vincentelli, Sanjit A. Seshia, “Querying Labelled Data with Scenario Programs for Sim-to-Real Validation,” International Conference on Cyber-Physical Systems (ICCPS), 2022 [Best Paper Nominee]
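
The following Python sketch shows the general shape of label-based retrieval over time series data: a per-frame predicate standing in for a scenario's semantics, and a search for time windows where it holds. The label schema, the predicate, and the minimum duration are simplified assumptions; the actual algorithm matches labels against a SCENIC program rather than a hand-written predicate.

```python
# Toy per-frame semantic labels; real logs carry far richer annotations.
FRAMES = [
    {"t": 0.0, "ego_action": "straight",  "oncoming_vehicle": False},
    {"t": 0.5, "ego_action": "straight",  "oncoming_vehicle": True},
    {"t": 1.0, "ego_action": "left_turn", "oncoming_vehicle": True},
    {"t": 1.5, "ego_action": "left_turn", "oncoming_vehicle": True},
    {"t": 2.0, "ego_action": "straight",  "oncoming_vehicle": False},
]

def matches_unprotected_left_turn(frame):
    """Frame-level predicate standing in for a scenario program's semantic constraints."""
    return frame["ego_action"] == "left_turn" and frame["oncoming_vehicle"]

def query_windows(frames, predicate, min_duration=0.5):
    """Return (start, end) time windows where the predicate holds long enough."""
    windows, start = [], None
    for frame in frames:
        if predicate(frame):
            start = frame["t"] if start is None else start
        elif start is not None:
            if frame["t"] - start >= min_duration:
                windows.append((start, frame["t"]))
            start = None
    if start is not None and frames[-1]["t"] - start >= min_duration:
        windows.append((start, frames[-1]["t"]))
    return windows

if __name__ == "__main__":
    # The matching windows point back to the raw sensor data to retrieve.
    print(query_windows(FRAMES, matches_unprotected_left_turn))
```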

Parallel and Multi-Objective Falsification with SCENIC and VerifAI

Falsification has emerged as an important tool for simulation-based verification of autonomous systems. In this paper, we present extensions to the SCENIC scenario specification language and VERIFAI toolkit that improve the scalability of sampling-based falsification methods by using parallelism and extend falsification to multi-objective specifications. We first present a parallelized framework that is interfaced with both the simulation and sampling capabilities of SCENIC and the falsification capabilities of VERIFAI, reducing the execution time bottleneck inherently present in simulation-based testing. We then present an extension of VERIFAI’s falsification algorithms to support multi-objective optimization during sampling, using the concept of rulebooks to specify a preference ordering over multiple metrics that can be used to guide the counterexample search process. Lastly, we evaluate the benefits of these extensions with a comprehensive set of benchmarks written in the SCENIC language.

  • Kesav Viswanadha, Edward Kim, Francis Indaheng, Daniel J. Fremont, Sanjit A. Seshia, “Parallel and Multi-Objective Falsification with Scenic and VerifAI,” International Conference on Runtime Verification (RV), 2021, pg. 265-276. [co-advised Kesav Viswanadha]
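
To illustrate the rulebook idea, here is a minimal Python sketch of a preference ordering over multiple metrics, simplified to a purely lexicographic rulebook; the metric names and the assumption that lower values are "worse" (i.e., more falsifying) are illustrative and do not reflect VerifAI's actual interfaces.

```python
# Illustrative rulebook: metrics listed in priority order, highest priority first.
RULEBOOK = ["collision_distance", "lane_violation_time", "comfort_cost"]

def worse_than(a, b, rulebook=RULEBOOK, tol=1e-6):
    """Return True if candidate `a` is a more interesting counterexample than `b`:
    lower metric values are worse, compared in rulebook priority order."""
    for metric in rulebook:
        if abs(a[metric] - b[metric]) > tol:
            return a[metric] < b[metric]
    return False

def pick_most_falsifying(candidates):
    """Keep the candidate that the rulebook ordering ranks as worst."""
    best = candidates[0]
    for c in candidates[1:]:
        if worse_than(c, best):
            best = c
    return best

if __name__ == "__main__":
    candidates = [
        {"collision_distance": 0.8, "lane_violation_time": 2.0, "comfort_cost": 1.0},
        {"collision_distance": 0.8, "lane_violation_time": 0.5, "comfort_cost": 3.0},
        {"collision_distance": 2.5, "lane_violation_time": 0.0, "comfort_cost": 0.2},
    ]
    print(pick_most_falsifying(candidates))
```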

A Programmatic and Semantic Approach to Explaining and Debugging Neural Networks

Even as deep neural networks have become very effective for tasks in vision and perception, it remains difficult to explain and debug their behavior. In this paper, we present a programmatic and semantic approach to explaining, understanding, and debugging the correct and incorrect behaviors of a neural network-based perception system. Our approach is semantic in that it employs a high-level representation of the distribution of environment scenarios that the detector is intended to work on. It is programmatic in that the scenario representation is a program in a domain-specific probabilistic programming language, which can be used to generate synthetic data to test a given perception module. Our framework assesses the performance of a perception module to identify correct and incorrect detections, extracts rules from those results that semantically characterize the correct and incorrect scenarios, and then specializes the probabilistic program with those rules in order to more precisely characterize the scenarios in which the perception module operates correctly or not.

  • Edward Kim, Divya Gopinath, Corina Pasareanu, and Sanjit A. Seshia, “A Programmatic and Semantic Approach to Explaining and Debugging Neural Network Based Object Detectors,” Computer Vision and Pattern Recognition (CVPR), 2020, pg. 11128-11137 [Oral Presentation]
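
A toy Python sketch of the rule-extraction step is given below: each tested scene is reduced to a couple of scalar scenario parameters plus a correct/incorrect label, and the sketch searches for a single-threshold rule that best separates the failures. The parameter names, the data, and the single-threshold rule form are illustrative assumptions; the framework in the paper extracts richer rules and uses them to specialize the probabilistic program.

```python
# Illustrative test results: (sun_angle_deg, distance_to_car_m, detector_correct).
RESULTS = [
    (10.0,  8.0, False),
    (15.0, 12.0, False),
    (45.0, 10.0, True),
    (60.0, 25.0, True),
    (70.0,  9.0, True),
    (12.0, 30.0, False),
]

FEATURES = {"sun_angle_deg": 0, "distance_to_car_m": 1}

def best_threshold_rule(results, features):
    """Search single-feature threshold rules of the form `feature < t` and return
    the one that best separates incorrect from correct detections."""
    best = None
    for name, idx in features.items():
        for t in sorted({r[idx] for r in results}):
            predicted_incorrect = [r[idx] < t for r in results]
            actually_incorrect = [not r[2] for r in results]
            accuracy = sum(p == a for p, a in zip(predicted_incorrect, actually_incorrect)) / len(results)
            if best is None or accuracy > best[0]:
                best = (accuracy, f"{name} < {t}")
    return best

if __name__ == "__main__":
    accuracy, rule = best_threshold_rule(RESULTS, FEATURES)
    print(f"candidate failure rule: {rule}  (agreement {accuracy:.0%})")
```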

Scenic: A Language for Scenario Specification and Data Generation

While cyber-physical systems (CPS) such as self-driving cars increasingly operate in more complex environments, it is challenging to model and generate such environments in simulation to extensively test them and preemptively identify system failures. These environments are characterized by stochastic physical interactions with spatial and temporal (in short, spatio-temporal) constraints. Such stochastic interactions are difficult to model with general-purpose probabilistic programming languages, which do not provide the means to model and sample from distributions with spatio-temporal constraints. Therefore, we propose a new domain-specific probabilistic programming language called SCENIC, which provides such support for modeling and generating distributions of physical scenarios. The language is simulator-agnostic and has been interfaced with simulators in various domains such as aviation, robotics, self-driving, reinforcement learning, and augmented reality. SCENIC is integrated into CARLA, a popular self-driving simulator developed by Intel, as an official scenario modeling language.

  • Daniel J. Fremont, Edward Kim, Tommaso Dreossi, Shromona Ghosh, Xiangyu Yue, Alberto Sangiovanni-Vincentelli, Sanjit A. Seshia, “SCENIC: A language for scenario specification and data generation,” A Special Issue of Machine Learning Journal on Robust ML, 2022, pg. 1-45

  • Eric Vin, Shun Kashiwa, Matthew Rhea, Daniel J. Fremont, Edward Kim, Tommaso Dreossi, Shromona Ghosh, Xiangyu Yue, Alberto Sangiovanni-Vincentelli, Sanjit A. Seshia, “3D Environment Modeling for Falsification and Beyond with Scenic 3.0,” International Conference on Computer-Aided Verification (CAV), 2023
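
To make concrete what sampling from "a distribution with spatio-temporal constraints" involves, the Python sketch below rejection-samples a toy pedestrian-crossing scene so that the pedestrian enters the crosswalk shortly before the ego vehicle arrives. The scenario, parameter ranges, and speed are illustrative assumptions; SCENIC lets one state such constraints declaratively (e.g., with require statements) instead of writing rejection loops by hand.

```python
import random

EGO_SPEED_MPS = 8.0  # illustrative constant ego speed

def sample_crossing_scene(max_tries=10_000):
    """Rejection-sample a scene satisfying a spatio-temporal constraint."""
    for _ in range(max_tries):
        crosswalk_distance_m = random.uniform(20.0, 60.0)  # crosswalk ahead of the ego
        pedestrian_start_s = random.uniform(0.0, 8.0)      # when the pedestrian steps off
        ego_arrival_s = crosswalk_distance_m / EGO_SPEED_MPS
        # Spatio-temporal constraint: the pedestrian enters the crosswalk
        # between 1 and 3 seconds before the ego arrives.
        if 1.0 <= ego_arrival_s - pedestrian_start_s <= 3.0:
            return {
                "crosswalk_distance_m": round(crosswalk_distance_m, 1),
                "pedestrian_start_s": round(pedestrian_start_s, 2),
                "ego_arrival_s": round(ego_arrival_s, 2),
            }
    raise RuntimeError("constraint too tight for rejection sampling")

if __name__ == "__main__":
    print(sample_crossing_scene())
```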