-
Title:
08-08 Stochastic Environment Problem
-
Description:
Unit 8 8 Stochastic Environment Problem
-
[Norvig] Now let's move on to stochastic environments.
-
Let's consider a robot that has slippery wheels
-
so that sometimes when you make a movement--a left or a right action--
-
the wheels slip and you stay in the same location.
-
And sometimes they work and you arrive where you expected to go.
-
And let's assume that the suck action always works perfectly.
-
We get a belief state space that looks something like this.
-
Notice that actions will often result in a belief state
-
that's larger than it was before--that is, the action will increase uncertainty
-
because we don't know what the result of the action is going to be.
-
And so here for each of the individual world states belonging to a belief state,
-
we have multiple outcomes for the action, and that's what stochastic means.
-
And so we end up with a larger belief state here.
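-
This growth of the belief state under a stochastic action can be sketched in code. The following is a minimal model, not from the lecture itself: states are assumed to be `(location, dirt_left, dirt_right)` tuples with location `"L"` or `"R"`, and all function names are illustrative.

```python
def outcomes(state, action):
    """All world states that could result from taking `action` in `state`."""
    loc, dirt_left, dirt_right = state
    if action == "Suck":
        # Suck is assumed to always work perfectly: clean the current square.
        return {(loc, dirt_left and loc != "L", dirt_right and loc != "R")}
    if action in ("Left", "Right"):
        # Slippery wheels: the move may succeed, or the robot may stay put.
        target = "L" if action == "Left" else "R"
        return {(target, dirt_left, dirt_right), (loc, dirt_left, dirt_right)}
    raise ValueError(action)

def predict(belief, action):
    """Union of possible outcomes over every state in the belief state."""
    return {s2 for s in belief for s2 in outcomes(s, action)}
```

For example, starting from the single known state "in the left square, both squares dirty", a Right action yields a two-state belief: the robot either moved or slipped. That is the uncertainty increase described above.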
-
But in terms of the observation, the same thing holds as in the deterministic world.
-
The observation partitions the belief state into smaller belief states.
-
So in a stochastic partially observable environment,
-
the actions tend to increase uncertainty,
-
and the observations tend to bring that uncertainty back down.
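-
The observation step can be sketched the same way, again assuming the `(location, dirt_left, dirt_right)` state encoding; the percept model below (the agent senses its location and the dirt in its own square) is an assumption for illustration.

```python
def percept(state):
    """The agent senses its location and whether that square is dirty."""
    loc, dirt_left, dirt_right = state
    return (loc, dirt_left if loc == "L" else dirt_right)

def update(belief, observation):
    """Keep only the states consistent with the observation.
    Filtering can only shrink the belief state, so observations
    cut the uncertainty back down."""
    return {s for s in belief if percept(s) == observation}
```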
-
Now, how would we do planning in this type of environment?
-
I haven't told you yet, so you won't know the answer for sure,
-
but I want you to try to figure it out anyway, even if you might get the answer wrong.
-
Imagine I had the whole belief state, of which I've diagrammed just a little bit here,
-
and I wanted to know how to get from this belief state
-
to one in which all squares are clean.
-
So I'm going to give you some possible plans,
-
and I want you to tell me whether you think each of these plans will always work
-
or maybe sometimes work depending on how the stochasticity works out.
-
Here are the possible plans.
-
Remember I'm starting here, and I want to know how to get to a belief state
-
in which all the squares are clean.
-
One possibility is [Suck, Right, Suck], one is [Right, Suck, Left, Suck],
-
one is [Suck, Right, Right, Suck],
-
and the other is [Suck, Right, Suck, Right, Suck].
-
So some of these actions might take you out of this little belief state here,
-
but just use what you know from the previous definition of the state space
-
and the results of each of those actions
-
and the fact that the right and left actions are nondeterministic
-
and tell me which of these you think will always achieve the goal
-
or will maybe achieve the goal.
-
And then I want you to also answer for the fill-in-the-blank plan--
-
that is, is there some plan, some ideal plan, which always or maybe achieves the goal?
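-
One way to check such plans mechanically is to push the belief state through the action sequence and see whether every possible final state, or only some, satisfies the all-clean goal. Here is a sketch under the same assumed model as before; the start belief used in the example (known location, both squares dirty) is chosen for illustration and is simpler than the belief state in the quiz.

```python
def outcomes(state, action):
    """Possible results of one action in the slippery vacuum world."""
    loc, dirt_left, dirt_right = state
    if action == "Suck":  # assumed to always work perfectly
        return {(loc, dirt_left and loc != "L", dirt_right and loc != "R")}
    if action in ("Left", "Right"):  # wheels may slip: move or stay put
        target = "L" if action == "Left" else "R"
        return {(target, dirt_left, dirt_right), (loc, dirt_left, dirt_right)}
    raise ValueError(action)

def classify(belief, plan):
    """Return "always", "maybe", or "never" for an unconditional plan."""
    for action in plan:
        belief = {s2 for s in belief for s2 in outcomes(s, action)}
    clean = [not dl and not dr for _, dl, dr in belief]
    if all(clean):
        return "always"
    return "maybe" if any(clean) else "never"
```

For instance, from the start state "left square, both dirty", the plan [Suck, Right, Suck] classifies as "maybe": the Right action may slip, leaving the robot to suck the already-clean left square.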