**1.**

a)

From the definition of conditional probability:

P(A|B) = P(A ∩ B) / P(B)

and

P(B|A) = P(B ∩ A) / P(A)

but intersection is symmetric:

P(B ∩ A) = P(A ∩ B)

so equating the two expressions for the intersection gives

P(A|B)P(B) = P(B|A)P(A)

and rearranging:

**P(A|B) = P(A)P(B|A) / P(B), which is Bayes' theorem.**
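As a quick numeric sanity check of the identity (the probabilities below are invented for illustration):

```python
# Numeric sanity check of Bayes' theorem with made-up probabilities.
# P(A ∩ B) is chosen directly; everything else is derived from it.
p_a = 0.3          # P(A)
p_b = 0.4          # P(B)
p_a_and_b = 0.12   # P(A ∩ B)

p_a_given_b = p_a_and_b / p_b   # P(A|B) by definition
p_b_given_a = p_a_and_b / p_a   # P(B|A) by definition

# Bayes' theorem: P(A|B) = P(A) P(B|A) / P(B)
bayes = p_a * p_b_given_a / p_b
print(bayes, p_a_given_b)  # both are 0.3 (up to float rounding)
```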

_

b) **Kalman Filter**.

The sensor error is specified to be perfectly Gaussian (and unimodal). This makes a Kalman filter the ideal choice: its belief representation is itself a Gaussian, so it can represent the posterior exactly, and with enough sensor readings the estimate converges with no approximation error.

All three will work to some degree, but the table-based and particle filters introduce discretization and sampling errors respectively.
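A minimal 1-D sketch of the Kalman measurement-update step (the state is static; the numbers are invented), showing how each Gaussian reading shrinks the posterior variance:

```python
# Minimal 1-D Kalman-style measurement update for a static state.
# Each Gaussian sensor reading shrinks the posterior variance;
# no discretization or sampling error is introduced.
def kalman_update(mean, var, z, r):
    """Fuse reading z (sensor variance r) into the N(mean, var) belief."""
    k = var / (var + r)              # Kalman gain
    new_mean = mean + k * (z - mean)
    new_var = (1 - k) * var
    return new_mean, new_var

mean, var = 0.0, 1000.0              # vague prior (invented numbers)
for z in [4.1, 3.9, 4.0, 4.2]:       # noisy readings of a true value near 4
    mean, var = kalman_update(mean, var, z, r=0.5)

print(mean, var)  # mean near 4, variance far smaller than the prior's
```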

_

c) A probability distribution over the possible rock densities.

_

d) All densities are equally probable, i.e. `P(d=0) = 0.25, P(d=2) = 0.25, P(d=4) = 0.25, P(d=6) = 0.25`

_

e) The sensor returns 3, so we look at the 2 < x < 8 column of the sensor model and multiply our state representation by the probabilities in that column.

d=0 | d=2 | d=4 | d=6 | step |
---|---|---|---|---|
0.25 | 0.25 | 0.25 | 0.25 | previous state
0.23 | 0.22 | 0.15 | 0.07 | sensor-model values for 2 < x < 8
0.0575 | 0.055 | 0.0375 | 0.0175 | multiply; total = 0.1675
0.34 | 0.33 | 0.22 | 0.10 | normalize to get new state

The new state is ( P(d=0) = 0.34, P(d=2) = 0.33, P(d=4) = 0.22, P(d=6) = 0.10 )

_

f) We report the density with the highest posterior probability, `argmax_d P(d)`:

**d=0** with P(d=0) = 0.34

_

g) As in *e*, except the sensor returns -1, so we use the *x < 2* column.

d=0 | d=2 | d=4 | d=6 | step |
---|---|---|---|---|
0.34 | 0.33 | 0.22 | 0.10 | previous state
0.46 | 0.24 | 0.10 | 0.03 | sensor-model values for x < 2
0.1564 | 0.0792 | 0.022 | 0.003 | multiply; total = 0.2606
0.60 | 0.30 | 0.08 | 0.01 | normalize to get new state

The new state is ( P(d=0) = 0.60, P(d=2) = 0.30, P(d=4) = 0.08, P(d=6) = 0.01 )
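The updates in *e* and *g* are the same multiply-and-normalize step. A sketch, with the likelihood values copied from the tables above (the second update starts from the rounded posterior, matching the table):

```python
# Discrete Bayes update: multiply the prior by the sensor likelihood
# for each density hypothesis, then normalize. Likelihood values are
# read off the sensor-model table for the relevant reading column.
def bayes_update(prior, likelihood):
    posterior = [p * l for p, l in zip(prior, likelihood)]
    total = sum(posterior)                   # 0.1675 for the first update
    return [p / total for p in posterior]

prior = [0.25, 0.25, 0.25, 0.25]             # d = 0, 2, 4, 6
belief = bayes_update(prior, [0.23, 0.22, 0.15, 0.07])   # reading: 3
print([round(b, 2) for b in belief])         # [0.34, 0.33, 0.22, 0.1]

# Second update, starting from the rounded posterior as the table does.
belief = bayes_update([0.34, 0.33, 0.22, 0.10],
                      [0.46, 0.24, 0.10, 0.03])          # reading: -1
print([round(b, 2) for b in belief])         # [0.6, 0.3, 0.08, 0.01]
```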

_

h) Again `argmax_d P(d)`:

**d=0** with P(d=0) = 0.60

_

**2.**

a) *HAVE NO IDEA HERE. GUESSING. ANYONE HAVE ANY IDEA?*

— A fitness function a la genetic algorithms scoring a policy

— A reinforcement learning function rewarding individual behaviours

— A specific state that an agent must reach to complete and leave a simulation

_

b) *ALSO GUESSING*

— A set of actions that are predetermined and provided to an agent to enact

— A policy which specifies how an agent should behave in a stochastic or partially observable world

— A partial-order plan

_

**3.**

Nick's answer (from my notes from Will):

a)

**Reactive**

- Constant time feedback loops
- Executes specific actions from execution engine

b)

**Execution Engine**

- Ties planning to reactive layer
- "bounded time" - no loops
- Maps plan steps to reactive actions

c)

**Deliberative Layer**

- Performs planning (STRIPS)
- outputs abstract plan
- Operates in arbitrary time

Sam's answer:

a)

**Sensing layer** - receives information from the world and organizes it to be passed to the next layer.

_

b)

**Planning layer** - takes sensor readings from the sensing layer, interprets them, and maps them to actions.

_

c)

**Execution layer** - takes the actions specified by the planning layer and maps them to specific commands to be issued to the robot's actuators.

_

**4.**

a)

Note: in the following, nodes are written as State(heuristic); the frontier is ordered by smallest heuristic first.

{ A(6) } -> Pop A

{ B(4), D(4.5), C(5) } -> Pop B

{ F(2), E(3), D(4.5), C(5) } -> Pop F

{ H(0), E(3), D(4.5), C(5) } -> Pop H

Goal reached.

_

b)

Note: in the following, nodes are written as State(distance travelled so far, heuristic); the frontier is ordered by smallest (distance travelled + heuristic) first.

{ A(0, 6) } -> Pop A

{ C(1.2, 5), B(3, 4), D(3.7, 4.5) } -> Pop C

{ F(4.6, 2), B(3, 4), D(3.7, 4.5) } -> Pop F (f = 6.6, ahead of B at f = 7)

{ H(6.6, 0), B(3, 4), D(3.7, 4.5) } -> Pop H (f = 6.6)

Goal reached with path cost 6.6. Note that B is never expanded: f(F) = 4.6 + 2 = 6.6 is smaller than f(B) = 3 + 4 = 7, so F is popped first.
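The two traces differ only in the ordering function. A sketch of both searches using Python's `heapq` — the edge weights below are assumptions reconstructed to be consistent with the g-values shown above (the question does not pin them all down):

```python
import heapq

# Hypothetical graph: heuristic values are from the question, but the
# edge weights are assumptions consistent with the g-values in the trace.
edges = {
    'A': [('B', 3.0), ('C', 1.2), ('D', 3.7)],
    'B': [('E', 1.3), ('F', 1.8)],
    'C': [('F', 3.4)],
    'D': [], 'E': [],
    'F': [('H', 2.0)],
    'H': [],
}
h = {'A': 6, 'B': 4, 'C': 5, 'D': 4.5, 'E': 3, 'F': 2, 'H': 0}

def search(start, goal, weight_g):
    """Best-first search ordered by f = weight_g * g + h.
    weight_g = 0 gives greedy best-first; weight_g = 1 gives A*."""
    frontier = [(h[start], start, 0.0)]  # (f, node, g)
    best_g = {start: 0.0}
    popped = []                          # expansion order
    while frontier:
        f, node, g = heapq.heappop(frontier)
        popped.append(node)
        if node == goal:
            return popped, g
        for child, cost in edges[node]:
            g2 = g + cost
            if g2 < best_g.get(child, float('inf')):
                best_g[child] = g2
                heapq.heappush(frontier, (weight_g * g2 + h[child], child, g2))
    return popped, None

print(search('A', 'H', 0))  # greedy pops A, B, F, H (cost 6.8 here)
print(search('A', 'H', 1))  # A* pops A, C, F, H with cost 6.6
```

With these assumed weights, greedy best-first happens to find a slightly more expensive path than A*, illustrating that ordering by h alone sacrifices optimality.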

_

**5.**

a) A* is only guaranteed to find an optimal path through the graph when using an admissible (optimistic) heuristic, i.e. one that never overestimates the remaining cost to the goal. Dijkstra's algorithm is equivalent to A* with a heuristic of 0 for every node. If node transitions can have negative cost, the true remaining path cost can be negative, so the heuristic of 0 can be GREATER than the path cost. In that case the heuristic is not admissible and the A* optimality proof does not apply.

_

b)

Let `C` be the cost of the optimal solution, so along the optimal path

`f(goal) = g(goal) + h(goal) = C`

and for any node n on that path

`f(n) = g(n) + h(n) <= C`

Multiplying through by `a`:

`a * f(n) = a * g(n) + a * h(n) <= a * C`

The weighted evaluation function is

`f'(n) = g(n) + a * h(n)`

Since `a >= 1`, `g(n) >= 0`, and `h(n) >= 0`, we have `g(n) <= a * g(n)`, and therefore

`f'(n) = g(n) + a * h(n) <= a * g(n) + a * h(n) = a * f(n) <= a * C`

Therefore the upper bound on the cost of the path found is `a * C`.

The lower bound is the cost of the optimal path, `C`, as found by pure A* — no algorithm can do better.

Therefore the cost of the path found by weighted A* lies in the range `[C, a * C]`.

_

c)

Useful when you want an easily tweakable relaxation parameter. Setting `a = 1` makes the algorithm plain A*; increasing `a` beyond 1 inflates the heuristic, making the search greedier toward the goal. This can greatly reduce the number of nodes explored, at the cost of no longer guaranteeing an optimal path — only one within a factor of `a` of optimal.
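A toy illustration of the bound from *b*: weighted A* on an open grid (the grid, start, and goal are invented for this sketch). The cost of the path found with inflation factor `a` is guaranteed to lie in `[C, a*C]`:

```python
import heapq

# Weighted A* (f' = g + a*h) on an open size x size grid with unit moves
# and the Manhattan-distance heuristic. Illustrates the [C, a*C] bound.
def weighted_astar(size, start, goal, a=1.0):
    def h(p):
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    frontier = [(a * h(start), 0, start)]    # (f', g, node)
    best = {start: 0}
    while frontier:
        _, g, p = heapq.heappop(frontier)
        if p == goal:
            return g
        x, y = p
        for q in [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]:
            if 0 <= q[0] < size and 0 <= q[1] < size \
                    and g + 1 < best.get(q, 10**9):
                best[q] = g + 1
                heapq.heappush(frontier, (g + 1 + a * h(q), g + 1, q))
    return None

c_opt = weighted_astar(10, (0, 0), (9, 9), a=1.0)  # plain A*: optimal C = 18
c_inf = weighted_astar(10, (0, 0), (9, 9), a=3.0)  # guaranteed in [C, 3C]
print(c_opt, c_inf)
```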

_

d)

To prove that you have found the optimal path, you must examine every node whose evaluation value (g(n) + h(n)) is less than or equal to the actual optimal path cost from the start to the goal (where h(n) is admissible). By arbitrarily skipping some of these nodes, you cannot confirm that the optimal path has been found. For example, what if a skipped node lies on the optimal path? Unless an equal-length alternative path exists, the optimal path has been missed.

_

e)

To loosen the constraints on a heuristic. For example, you can relax an admissible heuristic in a graph search by allowing heuristic values that are slightly larger than the actual cost between two nodes.