[Script Info]
Title:
[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:00.00,0:00:02.00,Default,,0000,0000,0000,,Here is my solution.
Dialogue: 0,0:00:02.00,0:00:05.00,Default,,0000,0000,0000,,As I go through all different actions a, as before,
Dialogue: 0,0:00:05.00,0:00:10.00,Default,,0000,0000,0000,,I now create a new inner loop of going through different action outcomes.
Dialogue: 0,0:00:10.00,0:00:14.00,Default,,0000,0000,0000,,This list is (-1, 0, 1),
Dialogue: 0,0:00:14.00,0:00:17.00,Default,,0000,0000,0000,,and I set the actual outcome to the adjacent action in the action list.
Dialogue: 0,0:00:17.00,0:00:21.00,Default,,0000,0000,0000,,You might remember the action list is a list of different outcomes.
Dialogue: 0,0:00:21.00,0:00:27.00,Default,,0000,0000,0000,,By incrementing it by 1 or decrementing it by 1, I can pick a slightly different action in that list.
Dialogue: 0,0:00:27.00,0:00:30.00,Default,,0000,0000,0000,,Of course, I have to do the modulo 4 on the right side.
Dialogue: 0,0:00:30.00,0:00:36.00,Default,,0000,0000,0000,,Then the implementation is similar to before. I project the outcome into new coordinates--x2 and y2.
Dialogue: 0,0:00:36.00,0:00:39.00,Default,,0000,0000,0000,,Now I need to assign the probability to this outcome
Dialogue: 0,0:00:39.00,0:00:42.00,Default,,0000,0000,0000,,where if the modifier is 0, we take the success probability.
Dialogue: 0,0:00:42.00,0:00:49.00,Default,,0000,0000,0000,,If it's not 0, we take 1 minus that divided by 2, because there are 2 possible undesired outcomes.
Dialogue: 0,0:00:49.00,0:00:52.00,Default,,0000,0000,0000,,Then the test proceeds by checking whether this is a legal grid cell,
Dialogue: 0,0:00:52.00,0:00:55.00,Default,,0000,0000,0000,,it's inside the grid, and the grid value is 0.
Dialogue: 0,0:00:55.00,0:00:59.00,Default,,0000,0000,0000,,Then like before, I add the value of the grid cell
Dialogue: 0,0:00:59.00,0:01:03.00,Default,,0000,0000,0000,,but now multiplied by the probability of that specific action outcome.
Dialogue: 0,0:01:03.00,0:01:06.00,Default,,0000,0000,0000,,Otherwise, I do the same for the collision cost.
Dialogue: 0,0:01:06.00,0:01:12.00,Default,,0000,0000,0000,,Finally, I take my cumulative value of v2, which I initialized with the cost of motion.
Dialogue: 0,0:01:12.00,0:01:14.00,Default,,0000,0000,0000,,You can't see this right here, but it's filled up.
Dialogue: 0,0:01:14.00,0:01:17.00,Default,,0000,0000,0000,,I update my value function just like before.
Dialogue: 0,0:01:17.00,0:01:19.00,Default,,0000,0000,0000,,You can see the code over here.
Dialogue: 0,0:01:19.00,0:01:21.00,Default,,0000,0000,0000,,This is what you should have programmed.
Dialogue: 0,0:01:21.00,0:01:26.00,Default,,0000,0000,0000,,The key difference to our example in class is the inner loop over here
Dialogue: 0,0:01:26.00,0:01:29.00,Default,,0000,0000,0000,,where I go over different possible action outcomes,
Dialogue: 0,0:01:29.00,0:01:32.00,Default,,0000,0000,0000,,compute the actual action outcome,
Dialogue: 0,0:01:32.00,9:59:59.99,Default,,0000,0000,0000,,and then do the probabilistic addition of these outcomes rather than just studying one outcome.
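The on-screen code is not part of this transcript, so what follows is only a minimal sketch of the steps narrated above, not the lecture's actual listing. The function name `stochastic_value`, the parameter names, the `delta` ordering (up, left, down, right), and the initial value of 1000 are all assumptions made for illustration.

```python
# Sketch of value iteration with stochastic motion, following the narration.
# All names below are assumed; the lecture's own code is not in the transcript.

# Action list: up, left, down, right. Neighbours in this list are the
# "adjacent" actions a stochastic move may slip into.
delta = [(-1, 0), (0, -1), (1, 0), (0, 1)]
delta_name = ['^', '<', 'v', '>']


def stochastic_value(grid, goal, cost_step, collision_cost, success_prob):
    """Return (value, policy) for a grid where 0 is free and 1 is an obstacle."""
    # Each of the two undesired outcomes gets half of the remaining probability.
    failure_prob = (1.0 - success_prob) / 2.0
    rows, cols = len(grid), len(grid[0])
    value = [[1000.0] * cols for _ in range(rows)]  # assumed large initial value
    policy = [[' '] * cols for _ in range(rows)]

    change = True
    while change:
        change = False
        for x in range(rows):
            for y in range(cols):
                if (x, y) == goal:
                    if value[x][y] > 0:
                        value[x][y] = 0.0
                        policy[x][y] = '*'
                        change = True
                elif grid[x][y] == 0:
                    for a in range(len(delta)):
                        v2 = cost_step  # cumulative value starts at the motion cost
                        # Inner loop over action outcomes: slip one way, succeed, slip the other.
                        for i in (-1, 0, 1):
                            a2 = (a + i) % len(delta)  # modulo 4 wraps around the list
                            x2 = x + delta[a2][0]
                            y2 = y + delta[a2][1]
                            p2 = success_prob if i == 0 else failure_prob
                            if 0 <= x2 < rows and 0 <= y2 < cols and grid[x2][y2] == 0:
                                v2 += p2 * value[x2][y2]
                            else:
                                v2 += p2 * collision_cost
                        if v2 < value[x][y]:
                            value[x][y] = v2
                            policy[x][y] = delta_name[a]
                            change = True
    return value, policy
```

Calling `stochastic_value([[0, 0, 0]], (0, 2), 1.0, 100.0, 0.5)` on a one-row corridor shows the effect of the inner loop: the cell next to the goal is charged for the two sideways slips into the wall as well as for the step itself.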