marlax.envs.GridWorld_r0

class marlax.envs.GridWorld_r0(grid, n_agents, target_rewards, together_reward, travel_reward)

Regime 0: Single-step reward at center regardless of target labels.

__init__(grid, n_agents, target_rewards, together_reward, travel_reward)

Initialize regime-0 grid world (center-only reward).

Methods

`__init__`(grid, n_agents, target_rewards, ...)	Initialize regime-0 grid world (center-only reward).
`check_and_activate_rewards`()	Check if any agent is at the center and no reward target is active.
`check_mismatch`()	Detect if agents split between two correct target zones.
`check_wrong_reward_zones`()	Check if any agent enters a non-target reward zone.
`compute_rewards`(rewards)	Reward when any agent reaches the center cell.
`get_possible_states`()	Compute all possible next global states from current positions.
`get_state`()	Get the current global state representation.
`move_agents`(actions)	Update agent positions based on provided actions.
`reset`()	Randomly reposition agents and clear active rewards.
`step`(actions)	Execute one time step in the environment.

compute_rewards(rewards)

Reward when any agent reaches the center cell.
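The center-only rule can be illustrated with a small standalone sketch. It does not use marlax and is not the library's implementation: the helper names `center_cell` and `center_only_rewards`, the square grid, the `(row, col)` position tuples, and the choice to give every agent the same reward are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Position = Tuple[int, int]  # (row, col); an assumed position format


def center_cell(grid_size: int) -> Position:
    """Return the center of a square grid (hypothetical helper)."""
    return (grid_size // 2, grid_size // 2)


def center_only_rewards(
    positions: Sequence[Position],
    grid_size: int,
    center_reward: float = 1.0,
    travel_reward: float = 0.0,
) -> List[float]:
    """Return one reward per agent under a center-only rule.

    Assumption: if any agent stands on the center cell, every agent
    receives center_reward; otherwise every agent receives travel_reward.
    Target labels play no role, matching the regime-0 description.
    """
    center = center_cell(grid_size)
    if any(pos == center for pos in positions):
        return [center_reward] * len(positions)
    return [travel_reward] * len(positions)


if __name__ == "__main__":
    # One agent on the center of a 5x5 grid, one in a corner.
    print(center_only_rewards([(2, 2), (0, 0)], grid_size=5))
    # No agent on the center: only the per-step travel reward applies.
    print(center_only_rewards([(0, 0), (4, 4)], grid_size=5, travel_reward=-0.1))
```

Whether the real method rewards all agents or only the one on the center cell, and how it uses its `rewards` argument, is not stated on this page; check the linked source before relying on either behavior.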
