
Leslie P. Kaelbling

From Wikiquote

Leslie Pack Kaelbling is an American roboticist and the Panasonic Professor of Computer Science and Engineering at the Massachusetts Institute of Technology. She is widely recognized for adapting partially observable Markov decision processes from operations research for application in artificial intelligence and robotics. Kaelbling received the IJCAI Computers and Thought Award in 1997 for applying reinforcement learning to embedded control systems and developing programming tools for robot navigation. In 2000, she was elected as a Fellow of the Association for the Advancement of Artificial Intelligence.

Quotes


"Doing for our robots what nature did for us" (2020)

"Doing for our robots what nature did for us" (August 20, 2020)
  • OK, so what's my research goal? I come from the machines end of this world, roughly. And what I really want to do is figure out how it is that we can make intelligent robots. And I do this mostly because I'm interested in intelligence more than I'm interested actually in robots. But I think that trying to make a physical agent who goes out and interacts in the world is a really good test bed for understanding what kinds of reasoning and perception and control we need in order to make it an intelligent system.
  • So the way I think about the problem-- this is definitely a computer scientist way to think about the problem-- is to think about the robot as a transducer, as some kind of a system that's connected up to the world. And it makes observations of the world. And it takes actions that change the state of the world. And presumably, there's some objective, right? We want to take actions that change the state of the world in some way that we think will be good.
  • The reason I want to start by backing all the way up to this like very basic control theory picture is that right now there's an enormous amount of argument about how one should make robots. Should they do planning and reasoning? Should they do reinforcement learning? How should we do it? So there's a huge kind of crisis almost in the field about what the best methods are. And what I want to start out this talk by doing is actually thinking about how we can answer that question in a way that's not political or religious, but technical.
  • So the way I want to think about this is the job of this program. So I'm going to make a robot. I'm going to put a program in the head of the robot. So let's say I'm not going to worry about hardware; I'm just going to think about the software. And so the program that I'm going to put in the head of my robot, it has to do this job that's written in the formula up here. And what this is just shorthand for saying is that it has to represent some kind of mapping from the observations and actions that it's had in the past. So (o, a)* means the whole history of observations and actions that it's ever had. Based on that, it has to pick the next action. So that's not really saying much of anything at all. That's just a description of basically every robot control program that's ever been written: you take your history of actions and observations and compute the next action.
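The "program as transducer" shape she describes-- a policy pi mapping the whole history (o, a)* to the next action-- can be sketched as a tiny interface. This is an illustrative sketch only; the class and the toy decision rule are my own, not anything from the talk.

```python
# A minimal sketch of the talk's "robot program as transducer" idea:
# a policy maps the entire history of observations and actions to the
# next action. All names and the decision rule are illustrative.

class Policy:
    """pi : (o, a)* -> a  -- choose the next action from the full history."""

    def __init__(self):
        self.history = []  # list of (observation, action) pairs seen so far

    def act(self, observation):
        # Every robot control program fits this shape; this trivial one
        # just demonstrates that the choice may depend on the history.
        action = self.choose(observation, self.history)
        self.history.append((observation, action))
        return action

    def choose(self, observation, history):
        # Placeholder rule: explore novel observations, stop on repeats.
        seen = {o for o, _ in history}
        return "stop" if observation in seen else "explore"
```

For example, `Policy().act("door")` returns `"explore"` the first time and `"stop"` on a repeat, because the history has grown in between.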
  • And so what we want to do is think about, first of all, what would be the best pi to put inside the robot. How can we think about that? And then we have to think about the problem of how it is that I-- in my case, me, as an engineer-- am going to find that pi that I should put in my robot.
  • So one way to think about the whole problem setup then is that I, as the robotics engineer, have to do for my robots the job that nature did for you. That is to say, I have to think of myself as a robot factory. I'm going to make these robots. And the robots are going to go out in the world. Maybe they're going to go and work in people's kitchens or something. And every kitchen is going to be different. So there's going to be a lot that I don't know about the world. But somehow, I have to figure out the best program to put in the head of all my robots, so that when they go out in the world to behave, they can do a good job. So that's the way I think about the problem that I face. And in order to think about what would be the best program, I kind of think about it this way.
  • So I imagine that there's some distribution over possible environments that the robot could find itself in when it actually goes out into the world, right? So maybe it's going to go to houses and the houses are all somewhat different. And once I put that robot in the house, maybe it's going to do some estimation or learning. It's going to adapt to the circumstances it's in. My job is to find a program that does a good job of adapting in all the environments that it might find itself in.
  • So imagine that you have some kind of probability distribution over the worlds that the robot could actually end up operating in. I want to find a program that's going to behave well, let's say get a lot of reward in expectation on average over all the environments that it could possibly find itself in. So that's, I would say, kind of a reasonable formal objective for a robot. And one thing that's good about this as an objective is that we don't have to argue about it, right? It doesn't say whether there should be learning in there or what kind of learning or should it be a genetic algorithm or should it have planning. In some sense, you could say, "I just want to make the program that's going to be the best that can be on average over these environments."
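The objective she sketches-- behave well in expectation over the distribution of environments-- can be written in symbols. This notation is mine, not from the talk:

```latex
% The best program pi* maximizes expected reward R over the
% distribution P(E) of environments E the robot might be deployed into:
\pi^* = \arg\max_{\pi} \; \mathbb{E}_{E \sim P(E)} \big[ R(\pi, E) \big]
```

The point she makes holds in this form: the objective is agnostic about whether pi* contains learning, planning, or anything else-- it only scores average performance.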
  • But the problem is now I've written down an objective function. I've said, "Oh, if you could tell me a distribution over possible worlds that you'd like this program to work well in, then I know, in a certain mathematical sense, what the best program is." But now my problem as the engineer, as the person who is in the robot factory-- which, again, is maybe analogous to the problem of nature-- is I have to figure out: how do I find this program that's going to be good in all these situations?
  • So there are a bunch of ways you can think about the problem. I mean, one would be to say, "Oh, I'm really lazy. I don't really want to think very much about working in the factory. It seems awfully hard. I will just make a robot that has roughly an empty head. It doesn't really know very much at all. And then it just has to interact in the world and learn everything by interacting." But of course, you don't really want a robot that comes to your kitchen and begins to learn about physics, right? That would break a lot of dishes.
  • Another strategy-- and this is like the classic engineering strategy-- is that, no, I'm like a serious engineer. And I'm going to sit here and think really, really hard. And I'm going to write a program. And it's going to be a great program. And I'm just going to put it straight in the robot's head. And it's going to go off, and it's going to be awesome and do everything it needs to do. And that strategy actually can work very well in certain kinds of problems. It's what lets the Boston Dynamics robots do parkour. But as we try to address bigger and more complicated problems, it becomes harder and harder for engineers to just straight up write the program.
  • We could just try to figure out how humans work because humans work pretty well in a variety of domains. And so one approach would be to say, "Well, we figure out how humans work. And then that's what we do. We make robots that work like that." So first of all, that's a hard biology problem. I think it's very important that people work on it. But it's also not a general engineering methodology because, for instance, I might want robots that work in certain kinds of circumstances or problem domains that are really different from the niche that humans are well tuned for. And so I might want to make a robot that isn't really human-like in its intelligence. And then it seems like what we're left with is that maybe we could just say, well, we'll somehow recapitulate evolution. Like, we just search around in the space of programs and try to find ones that work well and then eventually get ones that are great for our environment. But that seems slow and complicated.
  • So if I enumerate my options and they all don't look very good, I don't know what to do. So one thing to think about, though, is this last thing, the kind of evolution idea. So let's just pursue this a little bit more. So imagine that we want to try to find a program that works well in expectation over all environments. One way to think about that is that inside the factory, we kind of simulate a bunch of environments. We try a bunch of robot programs. And we try to find one that works well in all those environments. And that's like a really interesting strategy. We would have to think of a space of possible programs for the robot; some objective function, so we figure out, well, what are we trying to optimize; and a distribution over problems to test on.
  • In some sense, this is a thing that people have thought about for a long time, right? This would be like running some kind of evolutionary algorithm or some search or simulation inside the factory. And it's very attractive, but I think, generally speaking, hard to make work well. So the question is, what should I do, right? Maybe I could set up this whole evolutionary setup somehow. And then I could just snooze for a really long time while some very complicated program tries to figure out the best robot program to put in the head of the robot. But I don't know. I am too impatient for that.
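The "search inside the factory" idea-- sample environments from a distribution, score candidate programs on all of them, keep the best-- can be made concrete with a toy. Everything below (one-parameter "programs", a number-matching "environment") is an illustrative stand-in of mine, not the talk's method:

```python
import random

# Toy "factory search": a program is a single number, an environment is a
# target number, and reward is higher when the program is near the target.
# We score each candidate on the same sampled environments and keep the
# one with the best average reward.

def make_environment(rng):
    # An "environment" here is just a target value in [-1, 1].
    return rng.uniform(-1, 1)

def reward(program, env):
    # Negative distance: closer to the target means more reward.
    return -abs(program - env)

def factory_search(n_candidates=200, n_envs=50, seed=0):
    rng = random.Random(seed)
    envs = [make_environment(rng) for _ in range(n_envs)]
    candidates = [rng.uniform(-1, 1) for _ in range(n_candidates)]
    # Maximize average reward over the sampled environments.
    return max(candidates,
               key=lambda p: sum(reward(p, e) for e in envs) / n_envs)

best = factory_search()
```

The structure mirrors her description exactly: a space of programs (the candidates), an objective (average reward), and a distribution over problems to test on (the sampled environments)-- and also hints at why she is "too impatient": real program spaces are vastly larger than a single number.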
  • And so then the question is can I somehow take pieces and parts of all these ideas, some human programming, some robot learning in the wild, some kind of search or evolution offline, some inspiration from humans. Can I take all those things and put them together and see if I can find a way to engineer intelligent robots? So that's basically what I'm up to.
  • OK, let me say something about this. So one way to view the research agenda is to say that, first of all, I'd like to be inspired by what we know about humans. And in particular, I'm very interested in this core knowledge type stuff, because that tells me something about what evolution, in some sense, saw fit to engineer into natural intelligences. And if I understand that natural systems seem to be born with a bias or some built-in structure to think in terms of other agents, to understand that they move through 3D space, to think about objects as clumps of matter that cohere, that's a very helpful engineering bias for building a system.
  • I also know some physics invariances about the worlds that my robot's going to operate in. And maybe humans don't have these built in explicitly, but they almost surely have them built in implicitly. And I also have some other constraints as an engineer who's trying to make intelligent robots, which is that humans are the engineers, right?
  • So if humans have to engineer a very complicated system, then the engineering process has to have some modularity to it, because humans are really bad at understanding one big messy system. They're good at understanding pieces and parts that work together. So it may be that we have to take a modular design approach in our engineering efforts for intelligence, not because the intelligence needs to have that architecture, but because we, the human engineers, need those tools for actually building a system.
  • So all these constraints need to somehow come together into a way of building intelligent systems. Actually, I'll stop here for a minute, just because it's a convenient spot, and see if there are questions. I see some red Q&A button. So maybe someone can ask.
  • Yeah, actually, for years there has been. So a more typical formalization would be in terms of predictive models and planning or reasoning, rather than reinforcement learning. Also, it depends. Unfortunately, the phrase "reinforcement learning" grows and stretches, too. And sometimes, for many people and in many discourses, it's come to mean all of intelligent behavior, in which case I would say, well, sure, it's all reinforcement learning. But that's vacuous. Other formulations involve reasoning about objects and their relationships and thinking about the long-term consequences of taking actions in the world and so on. So there are really different ways of framing and formalizing the problem. And they give you very different computational profiles and different learning strategies. OK, good.
  • So I'll just tell you a story, because people usually like stories, and it's kind of the afternoon. And this is related to the question about reinforcement learning, probably, right? So how did I get into this whole thing? When I had just finished my undergraduate degree, which actually was in philosophy, weirdly enough, I went to work at a research institute while I was starting my PhD. And they had this robot, and nobody really knew very much about robotics. So it was my job, as the brand-new person, to try to get the robot to drive down the hallway.
  • And so what happened was I programmed the robot. And it would run into the wall. And I would bring it back. And I would fix the programming. And it would run into the wall again, hopefully for a slightly different reason. And over the course of a couple of weeks, I managed to write a program that would use these funny sonar sensors on the robot and make it drive down the hall without crashing into the walls. And so that was good. And I was happy, in a way, at the end of that, that I had gotten it to work. But I reflected on that a bit more. And what I decided was that I had learned how to navigate down the hallway using the sonar sensors. And it had taken a long time. And it was kind of a hassle. And really, the system should have been doing the learning, not me. So my view was that I should figure out a way to get out of the loop, to build systems that could learn on their own to do stuff. And then I could just wait for them to do that. And that would be better.
  • Then I sort of reinvented reinforcement learning, in a not very good way, really. But it was kind of entertaining. And this is a slide, by the way, for those young people in the audience: you might not know, but back in the day, we used to write with colored pens on pieces of clear plastic, and that's how we used to give talks. So I had this kind of pseudo reinforcement learning thing. And by 1990, I actually had this little robot called Spanky that did actual reinforcement learning during my actual defense. So it didn't learn anything too complicated. But it did do it in real time. So that was kind of fun.
  • So OK, I finished my PhD. And I thought, OK, I know something about robot learning now. But I really want to make robots that can do complicated things. And I couldn't figure out how to get basic reinforcement learning methods to really scale up to problems that I cared about. And so this is one last slide I'll show you, from some talk that I gave in 1995. And I kind of complained that the idea that you could take just a big bunch of what I now like to call neural goo, just a big bunch of generic neural network stuff, and train it to be an intelligent agent all by itself wasn't going to be feasible. And instead, we needed some kind of compositional structure. And that would give us more efficient learning and more robust behavior and so on.
  • So I'm still there, OK? I'm still trying to figure out how we can design an architecture that can learn efficiently. And for this research strategy, I work closely with a colleague, Tomás Lozano-Pérez. Our strategy has been the following: to try to think of some very generic representation and inference mechanisms, build those in, and then figure out how to learn the rest of the stuff. And we're all used to, I think, by now, the idea of some representations and inference mechanisms that we would want to build in.
  • For instance, everyone's used to the idea of convolution now in image space. But if you think about it: I've had people who work on convolutional neural networks tell me that they don't build any structure into their system, that it's just a neural network. But of course, as soon as you build convolutional structure into a neural network, you are taking a position on some regularities that are in the input signal and so on. And you're taking advantage of that so that you don't have to learn a whole fully connected network; you just learn some convolutional kernels.
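The point about convolution being a structural commitment can be made with simple parameter counting. The image size and kernel size below are arbitrary illustrative choices of mine:

```python
# The structural bet convolution makes: instead of learning a separate
# weight for every (input pixel, output pixel) pair, you learn one small
# kernel that is reused at every spatial position. Illustrative numbers.

h, w = 64, 64                          # a 64x64 input image
fully_connected = (h * w) * (h * w)    # one weight per input-output pair
kernel = 3 * 3                         # one shared 3x3 convolutional kernel
print(fully_connected, kernel)
```

That is roughly 16.8 million weights to learn versus 9: the convolutional structure encodes the assumption that the same local pattern matters everywhere in the image, which is exactly the "position on regularities in the input signal" she describes.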
  • So just as convolution gives us great leverage when you apply it to the right part of the problem, the intuition is that hopefully there's a few such mechanisms (hopefully not like 100 mechanisms, but maybe 10). And if we figure out how to use those mechanisms to bias learning and to structure behavior, then we could learn robust ways of behaving that are efficient and so on.
  • So one set of possible kind of general ideas includes convolution in space and also in time; maybe understanding the kinematics of the system, that it's connected together in joints and segments; a notion of planning to move through space; being able to do causal reasoning (if I were to do this, what would happen?); abstracting over individual objects; various kinds of state and temporal abstraction; and so on. So our view, though I don't want to commit to a particular list, is that there's a list of structural principles that are pretty generic and very broadly useful, and we should build them in.
  • OK, I'll keep going. I'll surely be able to offend some people soon. And I'll work harder at that. OK, so if we kind of accept this idea that we're going to build in some structure, then what? And here's the thing that my colleague and I have done recently. Well, maybe not super recently, but recent. In order to test out the idea that there's a set of mechanisms that would work well, what we did is we hand-built the rest of the system. So we hand-built some transition models, inference rules, ways of doing search control, and so on, and connected them up to these general mechanisms and made a system.
  • And just again, to kind of give you the motivation: I really want a robot. This isn't my kitchen, by the way, just in case you were worried. But imagine that you had to clean this kitchen or make breakfast in it or something. It would be very hard. And imagine programming a robot to do it. It's extremely hard. And so one thing that's useful to do is to think about what makes this problem hard. So one of the things that makes it hard is that there are lots of objects.
  • So the dimensionality of the space is kind of unthinkably high. It's also not exactly clear what constitutes an object here. If you were going to behave in this world, it would be a very long sequence of primitive actions that you would take in order to clean this kitchen. And also there's just a fundamental amount of uncertainty in this problem, right? So you don't know what's in the blue bowl or what will happen if you try to pull out a certain thing. You don't know when the people are coming home or what they want for dinner. All sorts of stuff you don't know.
  • And so any approach that works effectively in a domain like this is going to have to handle very large spaces, very long horizons, and really lots of uncertainty. So we have kind of a standard structural decomposition to this problem. We call this belief space hierarchical planning in the now. I'll decode what that means a little bit.
  • Fundamentally, the way we think about it is that we decompose the computation that's in the robot's head into two parts. The first part is in charge of taking the sequence, the history of actions and observations, and trying to synthesize them into some representation of a belief, or a probability distribution, about the way the world might be; and then another module takes that belief and decides how to behave.
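The two-module decomposition she describes can be sketched in miniature: a state estimator folds observations into a belief (a probability distribution over world states), and a policy maps the belief to an action. The Bayes-rule arithmetic below is the standard belief update; the two door states, the likelihood numbers, and the confidence threshold are made up for illustration:

```python
# Sketch of the belief-space decomposition: one module maintains a belief
# (a distribution over world states) from observations; a second module
# chooses actions based only on that belief. States, numbers, and the
# 0.9 threshold are illustrative, not from the talk.

def update_belief(belief, likelihoods):
    """Bayes update: belief[s] * P(observation | s), renormalized."""
    unnorm = {s: belief[s] * likelihoods[s] for s in belief}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}

def policy(belief):
    # Act on the most probable state once confident; otherwise keep sensing.
    state, prob = max(belief.items(), key=lambda kv: kv[1])
    return f"act:{state}" if prob > 0.9 else "sense"

belief = {"door_open": 0.5, "door_closed": 0.5}
# An observation that is 8x more likely if the door is open:
belief = update_belief(belief, {"door_open": 0.8, "door_closed": 0.1})
action = policy(belief)
```

After one such observation the belief is about 0.89 for "door_open", so this policy still chooses to sense again; a second consistent observation would push it over the threshold and trigger an action. That split, estimation separate from control, is the modularity the decomposition buys.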