代写Assignment 1 (Autumn Term 2024)代写留学生Python语言

2024-11-14 代写Assignment 1 (Autumn Term 2024)代写留学生Python语言

Assignment 1 (Autumn Term 2024)

Questions

1. Vacuum cleaner

Consider the Simple Reflex Agent in the Two-square Vacuum-cleaner World studied in  w01_vacuum_world_revised.ipynb in your seminar

(linked here, see code blocks 30 and 31 for the environment and agent definitions). We modify the environment so that after dirt has been cleaned in a square, new dirt will be generated in the same square after Y time steps, where Y ∼ Geom(p) for p = 1/15 (i.e. if the square is clean at     any time point, the probability of having dirt in the square at the next time point isp). Let RT denote the average reward in T time steps for this     Simple Reflex Agent.

i.  Write down the PEAS description of the task environment (using the reward function as in the IPython notebook). Is the environment fully observable? Is the environment deterministic?

ii.  Compute limT T −1ERT . You may derive it mathematically or write a piece of code to provide an approximate solution (to two decimal places) for this part. You do not need to submit any code used.

iii.  Is the simple reflex agent described in (ii) rational? Why or why not?

iv.  Explain why model-based agents can be more suitable for this particular environment.

2. Maze exploration

Consider the following maze in a 7-by-7 grid world. An agent is exploring the maze from

At each square, the agent will always try  forward ,  turn left and  turn right in this order. For example, after reaching square  X , the agent will try first to move East, and then try to move North (which it cannot move due to a wall), before trying to move South.

i.  If the agent explores using a depth-first graph search algorithm, write down the order in which it visits squares  A ,  B ,  C ,  D ,  E and  F before reaching the  Finish (not all squares may be visited).

ii.  If the agent explores using a breadth-first graph search algorithm, write down the order in which it visits squares  A ,  B ,  C ,  D ,  E and  F before reaching the  Finish (not all squares may be visited).

iii.  In the previous two parts, will your answers change if the agent used depth-first tree search and breadth-first tree search respectively? Why?

iv.  The agent now uses an A* search algorithm with the heuristic function

h(x) = (x1(−)f1(−−))2(−)(x2(−−)f2(−−),

where x = (x1, x2) is the coordinate value of the current square (with (0,0) for the top-left square) and f = (f1, f2) = (6, 3) is the coordinate value of the  Finish square. Is this heuristic function admissible? Why?

3. Sokoban

In this task, you are required to submit Python code that solves a variant of the Sokoban (Storekeeper) game using the A* search algorithm. You will modify the provided  sokoban.py file to solve the puzzle based on the game rules and provided input.

Game Rules

Game setup: The game is initialised with a 7x7 grid world where each square can either be an obstacle  * , a free square (blank space), a storekeeper  S , a box  B or a target location  # . You may assume that there is only one  S , one  B and one  # , and that the entire

perimeter of the grid world are lined with  * .

Movement: You control the storekeeper. In each step, your action is one of ‘N’ (move north), ‘S’ (move south), ‘E’ (move east), ‘W’ (move west). The storekeeper cannot walk into an obstacle ( * ). The storekeeper can push the box  B (by walking towards it) only if the square   behind the box (in the direction of the push) is free (not an obstacle).

Cost: The floor of the store is sloped so it is difficult to push the box North. Each push in the North has a cost of 0.2, and a push in South, East or West has a cost of 0.1. In addition, each non-pushing movement of the storekeeper incurs a cost of 0.01.

Example

The 7x7 character array in  input0.txt encodes the initial game configuration as shown in the leftmost panel below:

A sequence of moves  NNWNNESSSNNNEESSSSWW would have solved the puzzle (though not necessarily optimally) with a total cost of 0.2 × 1 + 0.1 × 4 + 0.01 × 15 = 0.75. Some intermediate game states are shown above.

Your Task

You are provided with three files on Moodle:

.   sokoban.py : main code file for solving the Sokoban puzzle using A* search.

search4e.py : a slight modification of code in  search4e.ipynb in your AIMA library that implements various search algorithms. input0.txt : an example input instance.

Please place all three files in the same working directory. Your task is to fill in the blanks in sokoban.py so that the code reads in an input in the same format as  input0.txt and outputs a character string encoding the sequence of actions for solving the Sokoban puzzle. You only need to submit  sokoban.py , renamed using your candidate number according to the submission instruction below. Please do not modify code outside designated blanks in  sokoban.py and do not change  search4e.py , otherwise your submitted code may not run properly.