Gene Li, a fifth-year Ph.D. student at TTIC, conducts research in theoretical machine learning and decision-making. Advised by Professor Nati Srebro, Li focuses his Ph.D. research on reinforcement learning theory, an area of machine learning that studies how an AI agent can learn and adapt in dynamic environments.
“If you want to learn from your environment to maximize your rewards, you have to learn to adapt to the environment,” Li said. “How would you do so, and in the best way? The formal paradigm for this is called reinforcement learning, and it is also the focus of my Ph.D. research.”
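The trial-and-error loop Li describes can be sketched with the simplest reinforcement learning setup, a multi-armed bandit: an agent repeatedly chooses an action, observes a reward, and gradually shifts toward the most rewarding choice. This is a minimal illustration only; the reward probabilities, parameter values, and function names below are made up for the sketch, not taken from Li's work.

```python
import random

def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy learner for a Bernoulli multi-armed bandit.

    With probability epsilon the agent explores a random arm;
    otherwise it exploits the arm with the highest estimated reward.
    Each pull updates a running mean estimate for the chosen arm.
    """
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    counts = [0] * n_arms       # pulls per arm
    estimates = [0.0] * n_arms  # running mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                           # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = 1 if rng.random() < reward_probs[arm] else 0
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

# The agent does not know the true probabilities [0.2, 0.5, 0.8];
# it must discover the best arm by interacting with the environment.
estimates, counts = run_bandit([0.2, 0.5, 0.8])
best_arm = max(range(3), key=lambda a: estimates[a])
```

After enough interaction, the agent concentrates its pulls on the highest-reward arm, which is exactly the "learn to adapt to the environment" behavior Li describes, stripped down to one state and three actions.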
Examples of successes in reinforcement learning include AlphaGo, a computer program that plays the board game Go at a superhuman level, as well as personalized ad recommendations on the internet, according to Li. Reinforcement learning is also used in medicine to help prescribe new treatments to patients based on historical data.
One of the projects Li is most excited about and involved in is titled “When is Agnostic Reinforcement Learning Statistically Tractable?”
According to Li, theoretical investigations of reinforcement learning tend to make strong modeling assumptions, for example about how the system evolves over time or about the structure of the rewards. Assuming that such a model captures reality exactly is known in machine learning as the “realizable setting”; making fewer assumptions about reality is called the “agnostic setting.”
“The simplest way to describe it would be to use the example of image classification of dogs versus cats,” Li said. “Typically, people will train a neural network using image data of dogs and cats. They will then use the neural network to predict dogs versus cats for new test images. However, to assume there is some magical neural network that can perfectly predict this is to make an assumption about nature itself. Instead, we know that we can still get strong theoretical guarantees from a more ‘agnostic’ viewpoint in machine learning theory. The main goal of this work is to extend this theoretical framework to reinforcement learning.”
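Li's dogs-versus-cats analogy can be made concrete with a toy supervised learning sketch. Here the labels are noisy, so no classifier in the class is perfect (the “realizable” assumption fails), yet empirical risk minimization still competes with the best classifier in the class. This is the agnostic-style guarantee Li refers to; the hypothesis class (thresholds on a 1-D feature), noise level, and function names are illustrative choices, not details from the paper.

```python
import random

def agnostic_erm_demo(n_train=2000, n_test=2000, noise=0.15, seed=1):
    """Empirical risk minimization over a finite class of threshold
    classifiers h_t(x) = 1[x >= t].

    Labels follow a threshold at 0.5 but are flipped with probability
    `noise`, so no hypothesis predicts perfectly. The agnostic goal is
    to approach the error of the best threshold, not zero error.
    """
    rng = random.Random(seed)

    def sample(n):
        xs = [rng.random() for _ in range(n)]
        ys = [(x >= 0.5) != (rng.random() < noise) for x in xs]
        return list(zip(xs, ys))

    def risk(t, data):
        # fraction of examples misclassified by threshold t
        return sum((x >= t) != y for x, y in data) / len(data)

    thresholds = [i / 100 for i in range(101)]
    train, test = sample(n_train), sample(n_test)
    erm_t = min(thresholds, key=lambda t: risk(t, train))   # learned from data
    best_t = min(thresholds, key=lambda t: risk(t, test))   # best in class
    return risk(erm_t, test), risk(best_t, test)

erm_err, best_err = agnostic_erm_demo()
```

Both errors sit near the 15% noise floor rather than near zero, and the learned threshold's test error stays close to the best achievable in the class, which is the kind of guarantee the agnostic viewpoint delivers without assuming a perfect predictor exists.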
The paper introduces a new complexity measure for reinforcement learning, called the “spanning capacity,” and reveals a surprising separation in agnostic learnability between the generative-access and online-access models.
Li received his bachelor’s degree in electrical engineering at Princeton University. During his time as an undergraduate, AlphaGo was popular in the news, and Li became intrigued.
“At the beginning of undergrad, I didn’t do much machine learning, but I became really interested in the intersection of machine learning and decision-making,” Li said. “The parts of my program in electrical engineering that I enjoyed the most were designing control systems for robotics to accomplish goals like line-following or cruise control.”
Li grew up in the suburbs of Nashville, Tennessee, and has been enjoying graduate student life in Chicago and at TTIC.
“I enjoy exploring different neighborhoods and the lakefront,” Li said. “Biking or running here is really nice. I’ve also been involved with the Math Circles of Chicago program as a teaching assistant where we teach fifth and sixth graders, so it’s been nice to provide some kind of mentorship.”