# Game Design Document: **Labyrinth Conquest**

---

## 1. Concept Paragraph

**Game Concept:**
*Labyrinth Conquest* is a **turn-based, deterministic grid-navigation strategy game** for two players competing to retrieve a relic hidden within a shifting labyrinth. Each player commands an **Explorer**, represented by a marker on a square grid of tiles. The labyrinth contains walls, traps, and hazards that limit movement but are fully known to both players. Players alternate turns choosing actions to **Move**, **Rotate Tiles**, or **Activate Gadgets** in order to reach the central **Relic Tile** first.

This design is **entirely original and unrelated to negotiation or trade-based gameplay**. The environment's challenge lies in spatial reasoning and path optimization.

---

## 2. Roles and Win Condition

**Roles:**
- **Player A** and **Player B** each control a distinct Explorer starting from opposite corners of the labyrinth.
- Both can observe the entire labyrinth state at all times.

**Win Condition:**
- The first player to move their Explorer onto the **Relic Tile** wins the game immediately (`winner = current_player`).
- If neither player reaches the relic within the turn limit (e.g., 40 turns per player), the winner is the player **closest (by Manhattan distance)** to the relic.
- If both are equidistant, the result is declared a **Draw**.

---

## 3. Turn Structure and Determinism

- Players alternate turns strictly: Player A → Player B → Player A → …
- Each turn consists of **one valid action**.
- Determinism is ensured by:
  - A grid layout and trap positions fixed by the RNG seed.
  - Any randomized initial layout generation using the provided `seed` for exact reproducibility.
- Maximum turn limit: **40 turns per player** (80 total).
- The game ends immediately if a terminal condition is met.

---

## 4. Action Grammar (Machine-Parseable)

### Action Types:

Players may issue exactly one of the following tokens per turn, enclosed in `\boxed{{}}` during play.

---

#### 1. **`[Move: <direction>]`**

- Moves the player’s Explorer one tile in a cardinal direction if no wall blocks the path.
- `<direction>` ∈ {`N`, `S`, `E`, `W`}

**Regex:** `^\[Move: (N|S|E|W)\]$`

**Example valid:** `[Move: N]`

**Example invalid:** `[Move: north]` → Invalid because a lowercase direction word is not allowed.

---

#### 2. **`[Rotate: <x>,<y>,<direction>]`**

- Rotates the tile at coordinates `(x,y)` one quarter-turn clockwise or counterclockwise.
- `<direction>` ∈ {`CW`, `CCW`}

**Regex:** `^\[Rotate: [0-9]+,[0-9]+,(CW|CCW)\]$`

**Example valid:** `[Rotate: 2,3,CW]`

**Example invalid:** `[Rotate: x2,3,CW]` → Invalid because each coordinate must be numeric.

---

#### 3. **`[Activate: <gadget>]`**

- Triggers one of the player's gadgets, such as disarming a trap or shifting a row.
- `<gadget>` ∈ {`Bridge`, `TrapDisarm`, `RowShift`}

**Regex:** `^\[Activate: (Bridge|TrapDisarm|RowShift)\]$`

**Example valid:** `[Activate: Bridge]`

**Example invalid:** `[Activate: Fly]` → Invalid gadget keyword.

---

### Validation Notes:

Only one token per turn is permitted. Spacing, capitalization, and punctuation must **exactly** match these predefined grammars.

---

## 5. Game State Schema

```json
{
  "grid_size": 5,
  "tiles": [
    ["startA", "wall", "trap", "floor", "floor"],
    ["floor", "floor", "wall", "trap", "floor"],
    ["floor", "wall", "relic", "floor", "floor"],
    ["floor", "trap", "floor", "wall", "floor"],
    ["floor", "floor", "floor", "floor", "startB"]
  ],
  "player_states": {
    "A": {
      "position": [0, 0],
      "gadgets": ["Bridge", "TrapDisarm"],
      "moves_taken": 5,
      "distance_to_relic": 4
    },
    "B": {
      "position": [4, 4],
      "gadgets": ["RowShift"],
      "moves_taken": 4,
      "distance_to_relic": 4
    }
  },
  "turn_number": 9,
  "current_player": "A",
  "seed": 42,
  "action_history": [
    "A: [Move: E]",
    "B: [Rotate: 3,3,CW]",
    "A: [Activate: Bridge]"
  ],
  "winner": null,
  "terminated": false,
  "invalid_reason": null,
  "observations": [
    "Game begins. Players start in opposite corners.",
    "A moved east.",
    "B rotated tile (3,3) clockwise."
  ]
}
```

---

## 6. Initialization Rules

- A seeded RNG (`seed` input at `reset`) controls:
  - Tile placement (`wall`, `trap`, `floor`, `relic`)
  - Starting gadget distributions.
- Starting layout:
  - `startA` at `(0,0)`, `startB` at `(grid_size-1, grid_size-1)`, `relic` at center.
- Each player begins with **2 random gadgets**.
- The first observation announces the initial labyrinth map and coordinates.
- No randomness is used during play, so the game is fully deterministic after reset.

---

## 7. Validation and Error Handling

**Illegal Actions Detected If:**
- The unboxed action string does not match any defined regex pattern → `Reason: "Invalid action format"`
- The target coordinate `(x,y)` is outside the grid → `Reason: "Tile out of bounds"`
- Attempted movement is blocked by a wall → `Reason: "Wall blocks path"`
- Gadget already used → `Reason: "Gadget unavailable"`
- Player issues multiple actions or malformed tokens → `Reason: "Multiple or malformed commands"`

When an illegal action is detected, the environment calls `set_invalid_move(player, reason)` and the opponent automatically wins unless `training_mode` allows a retry.

---

## 8. Terminal Conditions and Scoring

**Terminal Checks Each Turn:**
1. If a player’s new position contains `"relic"`, `winner = current_player`.
2. If `turn_number >= max_turns`, compute `distance_to_relic` for both players.
   - Shorter distance → winner.
   - Equal distance → `winner = null`, `draw = True`.
3. If an invalid move occurs, `winner = opponent`.

**Scoring:**
- `Winner`: +1 point
- `Loser`: 0 points
- `Draw`: both get 0.5 points

---

## 9. Player Prompt Specification

Each `_generate_player_prompt` presents the labyrinth, Explorer positions, remaining gadgets, turn count, and explicit action grammar.

**Prompt Outline:**

```
You are an Explorer navigating a shifting labyrinth.
Your goal is to reach the Relic Tile before your opponent by issuing one of the allowed commands.

Available actions (case-sensitive):
- [Move: N|S|E|W]: Move one tile in a direction if no wall blocks the way.
- [Rotate: x,y,CW|CCW]: Rotate the tile at coordinates (x,y).
- [Activate: Bridge|TrapDisarm|RowShift]: Use one of your gadgets (if available).

Current Turn: 9
You are Player A. Opponent is Player B.
Your position: (0,0)
Relic position: (2,2)
Available gadgets: Bridge, TrapDisarm

Respond with exactly one valid action token.
Put your final answer within \boxed{{}} at the end of your response.

Example valid response:
I will move north to progress toward the relic.
\boxed{{[Move: N]}}

Example invalid response:
\boxed{{Move north}}  ← Invalid format; must include brackets and colon.
```

---

## 10. API Mapping Plan

### `reset(seed=None)`
- Creates a deterministic labyrinth with walls, traps, relic, and player starts.
- Initializes `game_state` following the schema.
- Adds initial observations describing layout and objectives.
- Returns `obs` for both players.

### `step(player_id, action)`
- Extracts content using `_extract_answer_content`.
- Validates action format and feasibility.
- Updates positions, tile orientations, and available gadgets deterministically.
- Appends the action to `action_history` and `observations`.
- Checks terminal conditions; sets `terminated` and `winner` when satisfied.
- Returns updated observation and reward outcomes.

### `_generate_player_prompt(player_id)`
- Builds the full text prompt described above, tailored to the player’s view of the current state.
- Queries `game_state` for position, gadgets, current turn, and visible grid.
- Appends the example output section.

---

## 11. Copy-Check Against the Example

This design features a **completely unique environment**:

- **Theme:** Spatial navigation and puzzle solving (not negotiation or economy).
- **Terminology:** Explorers, relic, labyrinth, tiles, gadgets; none appear in the example.
- **Game mechanics:** Grid movement and tile transformation, unrelated to offers, deals, or trade.
- **State keys:** (`tiles`, `gadgets`, `relic`, `turn_number`, etc.) are original.
- **Prompt text** describes an exploration challenge, not an agreement or exchange.

Hence, *Labyrinth Conquest* satisfies the requirement to be a distinct, self-contained, deterministic, turn-based navigation environment.
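
As a rough illustration of the action grammar in Section 4 and the turn-limit tie-break in Section 8, the following Python sketch parses an unboxed action string and picks a winner by Manhattan distance. The helper names `parse_action`, `manhattan`, and `timeout_winner` are hypothetical and not part of the API listed in Section 10. The regexes are taken from Section 4, with capture groups added to the Rotate pattern so the coordinates can be read back; this is one possible approach under those assumptions, not a reference implementation.

```python
import re
from typing import Optional, Sequence, Tuple

# Patterns from Section 4. Capture groups around the Rotate coordinates
# are an addition for this sketch.
MOVE_RE = re.compile(r"^\[Move: (N|S|E|W)\]$")
ROTATE_RE = re.compile(r"^\[Rotate: ([0-9]+),([0-9]+),(CW|CCW)\]$")
ACTIVATE_RE = re.compile(r"^\[Activate: (Bridge|TrapDisarm|RowShift)\]$")


def parse_action(action: str) -> Optional[Tuple]:
    """Return a tuple describing the action, or None if the format is invalid.

    fullmatch is used so that a trailing newline is rejected as well
    (with plain match, `$` would accept one).
    """
    m = MOVE_RE.fullmatch(action)
    if m:
        return ("Move", m.group(1))
    m = ROTATE_RE.fullmatch(action)
    if m:
        return ("Rotate", int(m.group(1)), int(m.group(2)), m.group(3))
    m = ACTIVATE_RE.fullmatch(action)
    if m:
        return ("Activate", m.group(1))
    return None


def manhattan(a: Sequence[int], b: Sequence[int]) -> int:
    """Manhattan distance between two grid coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def timeout_winner(pos_a: Sequence[int], pos_b: Sequence[int],
                   relic: Sequence[int]) -> Optional[str]:
    """Tie-break at the turn limit: closer player wins, None means draw."""
    dist_a = manhattan(pos_a, relic)
    dist_b = manhattan(pos_b, relic)
    if dist_a < dist_b:
        return "A"
    if dist_b < dist_a:
        return "B"
    return None


if __name__ == "__main__":
    print(parse_action("[Rotate: 2,3,CW]"))
    print(parse_action("[Move: north]"))
    print(timeout_winner((0, 0), (4, 4), (2, 2)))
```

A `step` implementation could call something like `parse_action` first and report `"Invalid action format"` when it returns `None`, before moving on to the feasibility checks (bounds, walls, gadget availability) described in Section 7.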