Markov Decision Process Example
MDP example

State Value Function Changes in Policy Iterations

Import Library

In [1]:
import numpy as np

Grid World

In [2]:
BOARD_ROWS = 3  # grid world height
BOARD_COLS = 3  # grid world width
GAMMA = 1.0
POSSIBLE_ACTIONS = [0, 1, 2, 3]  # left, right, up, down
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # actions expressed as coordinate offsets
REWARDS = []

Environment

In [3]:
class Env:
    def __init__(self):
        self.heig..
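The `Env` class above is cut off, so here is a minimal, hypothetical sketch of how the coordinate-offset encoding in `ACTIONS` can drive movement on the 3×3 grid. The `step` function is an assumption for illustration, not the author's actual `Env` implementation; it simply adds the chosen offset to the state and clamps the result to the board.

```python
# Hypothetical sketch: applying coordinate-offset actions on the 3x3 grid.
# This is NOT the original Env class (which is truncated above); it only
# illustrates how ACTIONS entries act as (dx, dy) moves.
BOARD_ROWS = 3
BOARD_COLS = 3
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # left, right, up, down

def step(state, action_idx):
    """Move from state by the chosen action, clamped inside the grid."""
    dx, dy = ACTIONS[action_idx]
    x = min(max(state[0] + dx, 0), BOARD_COLS - 1)
    y = min(max(state[1] + dy, 0), BOARD_ROWS - 1)
    return (x, y)

print(step((1, 1), 0))  # move left from the center -> (0, 1)
print(step((0, 0), 0))  # moving left off the edge stays put -> (0, 0)
```

Clamping at the boundary (rather than, say, terminating the episode) is one common grid-world convention; the truncated class may handle edges differently.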
2024.09.10