Rose E. Wang

I am a third-year PhD student at Stanford University, advised by Noah Goodman. I am grateful to be funded by the NSF Graduate Research Fellowship.

During my undergraduate studies at MIT, I worked with Professor Josh Tenenbaum and Professor Jonathan How, and at Google Brain (student researcher) and Google Brain Robotics (internship). In a prior lifetime, I was a passionate multilinguist (Chinese, HSK Level 6; French, DELF B2; Spanish, DELE B2) and graduated with honors in Germany (Abitur with European plurilingual excellence award).

[ Email  /  Github  /  Twitter  /  Google Scholar  /  Blog ]

In the ZONE: Measuring difficulty and progression in curriculum generation
Rose E. Wang, Jesse Mu, Dilip Arumugam, Natasha Jaques, Noah Goodman
NeurIPS 2022: Deep Reinforcement Learning Workshop.
[ Paper ]

A common strategy in curriculum generation for reinforcement learning is to train a teacher network to generate tasks that enable student learning. But what kinds of tasks enable this? One answer is tasks belonging to a student's zone of proximal development (ZPD), a concept from developmental psychology. These are tasks that are neither too easy nor too hard for the student. Although intuitive, ZPD is not well understood computationally. We propose ZONE, a novel computational framework that operationalizes ZPD. It formalizes ZPD through the language of Bayesian probability theory, revealing that tasks should be selected by difficulty (the student's probability of task success) and learning progression (the degree of change in the student's model parameters).
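
As a rough illustration of the two selection criteria, the sketch below filters candidate tasks by an estimated success probability and by how far the student's parameters moved after training on the task. The function names, the dictionary keys, the thresholds, and the use of an L2 norm for parameter change are all assumptions made for illustration; they are not taken from the paper.

```python
import math


def param_change(before, after):
    """L2 distance between two flat parameter vectors (an assumed progression measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(after, before)))


def select_zone_tasks(tasks, p_low=0.2, p_high=0.8, min_change=0.05):
    """Keep tasks that are neither too easy nor too hard and that move the student.

    Each task is a dict with hypothetical keys:
    'name', 'successes', 'attempts', 'params_before', 'params_after'.
    """
    selected = []
    for task in tasks:
        # Difficulty: empirical estimate of the student's success probability.
        p_success = task["successes"] / task["attempts"]
        # Learning progression: size of the change in the student's parameters.
        change = param_change(task["params_before"], task["params_after"])
        if p_low <= p_success <= p_high and change >= min_change:
            selected.append(task["name"])
    return selected


# Toy data, invented for this example.
tasks = [
    {"name": "too_easy", "successes": 10, "attempts": 10,
     "params_before": [0.0, 0.0], "params_after": [0.0, 0.01]},
    {"name": "in_zone", "successes": 5, "attempts": 10,
     "params_before": [0.0, 0.0], "params_after": [0.3, 0.4]},
    {"name": "too_hard", "successes": 0, "attempts": 10,
     "params_before": [0.0, 0.0], "params_after": [0.0, 0.0]},
]
print(select_zone_tasks(tasks))
```

Only the middle task passes both filters here: the first is solved every time, and the third is never solved and leaves the parameters unchanged.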

ELIGN: Expectation Alignment as a Multi-Agent Intrinsic Reward
Zixian Ma, Rose E. Wang, Li Fei-Fei, Michael Bernstein, Ranjay Krishna
36th Conference on Neural Information Processing Systems (NeurIPS 2022).
[ Paper / Code ]

Modern multi-agent reinforcement learning frameworks rely on centralized training and reward shaping to perform well. However, centralized training and dense rewards are not readily available in the real world. Current multi-agent algorithms struggle to learn in the alternative setup of decentralized training or sparse rewards. To address these issues, we propose ELIGN (expectation alignment), a self-supervised intrinsic reward inspired by the self-organization principle in zoology.

Know Thy Student: Interactive Learning with Gaussian Processes
Rose E. Wang, Mike Wu, Noah Goodman
ICLR 2022 Workshop on From Cells to Societies: Collective Learning across Scales.
[ Paper ]

Learning often involves interaction between multiple agents. Human teacher-student settings best illustrate how interactions result in efficient knowledge passing where the teacher constructs a curriculum based on their students' abilities. Prior work in machine teaching studies how the teacher should construct optimal teaching datasets assuming the teacher knows everything about the student. However, in the real world, the teacher doesn't have complete information and must probe before teaching. Our work proposes a simple probing algorithm which uses Gaussian processes for inferring student-related information, before constructing a teaching dataset. Our experiments highlight the importance of probing before teaching, demonstrate how students can learn much more efficiently with the help of an interactive teacher, and outline where probing combined with machine teaching would be more desirable than passive learning.

Language modeling via stochastic processes
Rose E. Wang, Esin Durmus, Noah Goodman, Tatsunori Hashimoto
International Conference on Learning Representations (ICLR) 2022.
Oral Presentation (1.6% oral acceptance rate)
[ Paper / Video / Code ]

Modern language models can generate high-quality short texts. However, they often meander or are incoherent when generating longer texts. These issues arise from the next-token-only language modeling objective. To address these issues, we introduce Time Control (TC), a language model that implicitly plans via a latent stochastic process. TC does this by learning a representation which maps the dynamics of how text changes in a document to the dynamics of a stochastic process of interest. Using this representation, the language model can generate text by first implicitly generating a document plan via a stochastic process, and then generating text that is consistent with this latent plan.
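
To make the idea of a latent plan concrete, the sketch below samples a one-dimensional trajectory pinned at a start and an end point, then leaves the decoding step out. Treating the "stochastic process of interest" as a Brownian bridge is an assumption of this example, as are the function name `sample_bridge`, the scalar latent, and the `sigma` parameter; the abstract itself does not specify these.

```python
import math
import random


def sample_bridge(z_start, z_end, num_steps, sigma=1.0, seed=None):
    """Sample a Brownian bridge from z_start (time 0) to z_end (time num_steps).

    Returns num_steps + 1 latent values, one per position in the plan.
    """
    rng = random.Random(seed)
    path = [z_start]
    for t in range(num_steps - 1):
        remaining = num_steps - t  # steps left until the pinned end point
        z = path[-1]
        # Conditional law of the next point given the current point and the end point.
        mean = z + (z_end - z) / remaining
        var = sigma ** 2 * (remaining - 1) / remaining
        path.append(rng.gauss(mean, math.sqrt(var)))
    path.append(z_end)  # the end point is fixed by construction
    return path


plan = sample_bridge(z_start=0.0, z_end=5.0, num_steps=10, seed=0)
print(len(plan), plan[0], plan[-1])
```

In a full system each latent value would condition the text generated at that position; that decoder is outside the scope of this sketch.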

Calibrate your listeners! Robust communication-based training for pragmatic speakers
Rose E. Wang, Julia White, Jesse Mu, Noah Goodman
Findings of EMNLP 2021.
[ Paper / Video / Code ]

To be good conversational partners, natural language processing (NLP) systems should be trained to produce contextually useful utterances. Prior work has investigated training NLP systems with communication-based objectives, where a neural listener stands in as a communication partner. However, these systems commonly suffer from semantic drift, where the learned language diverges radically from natural language. We propose a method that uses a population of neural listeners to regularize speaker training.

On the opportunities and risks of foundation models
Many authors..., Rose E. Wang, more authors,...
August 2021.

This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations).

Too many cooks: Bayesian inference for coordinating multi-agent collaboration
Rose E. Wang*, Sarah Wu*, James A. Evans, Joshua B. Tenenbaum, David C. Parkes, Max Kleiman-Weiner
Journal of the Cognitive Science Society, April 2021.
NeurIPS 2020 Cooperative AI workshop.
Won best paper award at NeurIPS 2020 Cooperative AI Workshop!
[ Paper / Video / Code ]

We develop Bayesian Delegation, a decentralized multi-agent learning mechanism that enables agents to rapidly infer the sub-tasks of others by inverse planning. We demonstrate that our model is a capable ad-hoc collaborator, scales with team size and makes inferences about intent similar to human observers.
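
The inference step can be pictured as a Bayesian update over candidate sub-tasks after watching one action. The sketch below is a generic inverse-planning update under a softmax (noisily rational) action model, not the paper's implementation; the sub-task names, the action values, and the `beta` rationality parameter are hypothetical.

```python
import math


def softmax_likelihood(action_values, action, beta=2.0):
    """P(action | sub-task) when the agent picks actions with softmax over their values."""
    weights = {a: math.exp(beta * v) for a, v in action_values.items()}
    return weights[action] / sum(weights.values())


def update_posterior(prior, q_values, observed_action, beta=2.0):
    """One Bayes step: posterior(sub-task) is proportional to prior * P(action | sub-task)."""
    unnormalized = {
        task: prior[task] * softmax_likelihood(q_values[task], observed_action, beta)
        for task in prior
    }
    total = sum(unnormalized.values())
    return {task: weight / total for task, weight in unnormalized.items()}


# Hypothetical sub-tasks and action values, invented for this example.
prior = {"chop_tomato": 0.5, "fetch_plate": 0.5}
q_values = {
    "chop_tomato": {"left": 1.0, "right": 0.0},
    "fetch_plate": {"left": 0.0, "right": 1.0},
}
posterior = update_posterior(prior, q_values, observed_action="left")
print(posterior)
```

Observing the other agent move left shifts belief toward the sub-task for which moving left is the better action. Repeating the update over successive observations sharpens the estimate.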

Model-based Reinforcement Learning for Multiagent Goal Alignment
Rose E. Wang, J. Chase Kew, Dennis Lee, Tsang-Wei Edward Lee, Tingnan Zhang, Brian Ichter, Jie Tan, Aleksandra Faust
Conference on Robot Learning (CoRL) 2020.
Mentioned in Google AI Year in Review, 2020.
[ Paper / Video / Project Page / Blog post ]

In this work, we present hierarchical predictive planning (HPP) for decentralized multiagent navigation tasks. Our approach is trained in simulation and works in unseen settings both in simulation and in the real world (zero-shot transfer)!

Too many cooks: Coordinating multi-agent collaboration through inverse planning
Rose E. Wang*, Sarah Wu*, James A. Evans, Joshua B. Tenenbaum, David C. Parkes, Max Kleiman-Weiner
Human-Like Machine Intelligence (book published with Oxford University Press)
Annual Meeting of the Cognitive Science Society (CogSci) 2020
International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS) 2020
Invited paper to OptLearnMAS Workshop at AAMAS 2020
Won best paper award for Computational Modeling for Higher Cognition at CogSci 2020!
[ Paper / Video / Code ]

We develop Bayesian Delegation, a decentralized multi-agent learning mechanism that enables agents to rapidly infer the sub-tasks of others by inverse planning.

R-MADDPG for Partially Observable Environments and Limited Communication
Rose E. Wang, Michael Everett, Jonathan P. How
International Conference on Machine Learning (ICML) 2019, Reinforcement Learning for Real Life Workshop
[ Paper / Code / Project Page ]

This paper introduces a deep recurrent multiagent actor-critic framework (R-MADDPG) for handling multiagent coordination under partially observable settings and limited communication.

DRIV3N: Race to Autonomy
Rose E. Wang, Austin Floyd, Marwa Abdulhai, Luxas Novak, David Klee, Sean Patrick Kelley
Robotics: Science and Systems I, 2017.
[ Video / Project Page ]

A whirlwind of an experience where my team and I developed fast, autonomous, ~maze-solving~ racecars equipped with no machine learning technology and a decorative safety controller.
