0:00:00.000,0:00:02.000
And the answer is easily obtained if you just
0:00:02.000,0:00:05.000
subtract 3 for each step.
0:00:05.000,0:00:08.000
We get 88 and 85 over here.
0:00:08.000,0:00:11.000
We could also reach the same value going around here.
0:00:11.000,0:00:13.000
So, 85 would have been the right answer,
0:00:13.000,0:00:16.000
and this will be the value function after convergence.
0:00:16.000,0:00:24.000
It's beautiful to see that the value function is effectively
0:00:24.000,0:00:26.000
the distance to the positive absorbing state times 3
0:00:26.000,0:00:28.000
subtracted from 100.
0:00:28.000,0:00:32.000
So, we have 97, 94, 91, 88, 85 and so on.
0:00:32.000,0:00:34.000
This is a degenerate case
0:00:34.000,0:00:36.000
if we have a deterministic state transition function.
0:00:36.000,9:59:59.000
It gets more tricky to calculate for the stochastic case.
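
As a rough sketch of the deterministic calculation described in this segment, the snippet below runs value iteration on a made-up one-dimensional corridor and compares the result with "100 minus 3 times the distance". The corridor layout, the names GOAL_VALUE, STEP_COST, and value_iteration, and the absence of discounting are all assumptions for illustration; the grid in the lecture is two-dimensional and is not reproduced here.

```python
# Hypothetical setup: state 0 is the positive absorbing state worth 100,
# every move costs 3, transitions are deterministic, and there is no discount.
GOAL_VALUE = 100
STEP_COST = 3


def value_iteration(n_states, sweeps=100):
    """Back up values in a corridor until they stop changing."""
    values = [0] * n_states
    values[0] = GOAL_VALUE  # the absorbing state keeps its fixed value
    for _ in range(sweeps):
        for s in range(1, n_states):
            # Deterministic moves: step left, or right if a state exists there.
            neighbors = [values[s - 1]]
            if s + 1 < n_states:
                neighbors.append(values[s + 1])
            # Take the best neighbor and pay the step cost.
            values[s] = max(neighbors) - STEP_COST
    return values


values = value_iteration(6)

# Closed form from the lecture: 100 minus 3 times the distance to the goal.
closed_form = [GOAL_VALUE - STEP_COST * d for d in range(6)]

print(values)
print(closed_form)
```

In this deterministic sketch both lists agree. With stochastic transitions the backup would need an expectation over successor states, so the simple distance formula would no longer apply.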