marlax.envs.GridWorld_r0
- class marlax.envs.GridWorld_r0(grid, n_agents, target_rewards, together_reward, travel_reward)
Bases: GridWorld
Regime 0: Single-step reward at center regardless of target labels.
- __init__(grid, n_agents, target_rewards, together_reward, travel_reward)
Initialize regime-0 grid world (center-only reward).
Methods
__init__(grid, n_agents, target_rewards, ...): Initialize regime-0 grid world (center-only reward).
check_and_activate_rewards(): Check if any agent is at the center and no reward target is active.
check_mismatch(): Detect if agents split between two correct target zones.
check_wrong_reward_zones(): Check if any agent enters a non-target reward zone.
compute_rewards(rewards): Reward when any agent reaches the center cell.
get_possible_states(): Compute all possible next global states from current positions.
get_state(): Get the current global state representation.
move_agents(actions): Update agent positions based on provided actions.
reset(): Randomly reposition agents and clear active rewards.
step(actions): Execute one time step in the environment.
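
The center-only reward rule summarized for compute_rewards can be illustrated with a small standalone sketch. This is not marlax's implementation: the helper names (center_cell, center_only_rewards), the floor-division definition of the center, and the choice to give every agent the same reward are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Position = Tuple[int, int]


def center_cell(n_rows: int, n_cols: int) -> Position:
    """Return the center cell of an n_rows x n_cols grid.

    Assumption: the center is found by floor division; marlax may differ.
    """
    return (n_rows // 2, n_cols // 2)


def center_only_rewards(
    positions: Sequence[Position],
    n_rows: int,
    n_cols: int,
    center_reward: float = 1.0,
    travel_reward: float = 0.0,
) -> List[float]:
    """Hypothetical regime-0 rule: reward only when an agent is at the center.

    Assumption: if any agent occupies the center cell, every agent receives
    center_reward; otherwise every agent receives travel_reward.
    Target labels are ignored entirely, as the regime-0 description states.
    """
    center = center_cell(n_rows, n_cols)
    any_at_center = any(tuple(pos) == center for pos in positions)
    value = center_reward if any_at_center else travel_reward
    return [value for _ in positions]


# Two agents on a 5x5 grid; the second one stands on the center cell (2, 2).
rewards_hit = center_only_rewards([(0, 0), (2, 2)], n_rows=5, n_cols=5)

# Nobody is at the center, so each agent only gets the travel reward.
rewards_miss = center_only_rewards(
    [(0, 0), (1, 2)], n_rows=5, n_cols=5, travel_reward=-0.1
)
```

In the real class, the reward magnitudes would presumably come from the constructor arguments (target_rewards, together_reward, travel_reward); how they map onto the center reward is not specified here, so the sketch takes plain keyword arguments instead.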