# GAME DESIGN DOCUMENT: **"StarGrid Duel"**

---

### 1. Concept Paragraph

**StarGrid Duel** is a deterministic, turn-based strategy game inspired by the simplicity of grid conquest, but it is **not** tic-tac-toe. Two rival star-navigators take turns deploying *energy beacons* on a 3×3 stellar grid. Each aims to align three of their own beacons in a straight line of cosmic power (horizontal, vertical, or diagonal) before the opponent does; if the grid fills without such a line, the game ends in a balanced standoff. Players issue commands like `[Place: A2]` to deposit a beacon on a coordinate. The environment is purely deterministic: no randomness or negotiation mechanics are involved. The game's purpose is to measure spatial foresight and terminal pattern recognition, and it is unrelated to any negotiation or resource-trading example.

---

### 2. Roles and Win Condition

- **Roles**
  - **Player A ("Navigator Alpha")**: Uses energy color **Blue**.
  - **Player B ("Navigator Beta")**: Uses energy color **Crimson**.
- **Objective**
  Be the first navigator to align three of your beacons continuously (row, column, or diagonal) on the 3×3 StarGrid.
- **Win Rule**
  - A player **wins** immediately upon forming a line of three of their own beacons.
  - The game is a **draw** if all nine cells are filled without a three-in-a-line configuration.
  - Upon win or draw, the game enters a terminal state and no further actions are accepted.

---

### 3. Turn Structure and Determinism

- Players alternate turns, beginning with Player A at turn index `0`.
- Each turn is atomic: exactly one action is taken.
- A deterministic seed ensures that initialization and any potential random ordering (none is required here, but the seed is included for reproducibility) follow identical patterns.
- The turn counter increments after each valid action. Once nine valid turns have been processed or a win condition is met, the environment halts.

---

### 4. Action Grammar (Machine-Parsable)

Players specify grid placement commands targeting one unused cell.

**Allowed Actions**

```
[Place: <cell_id>]
```

**Cell IDs**
Valid values: `A1, A2, A3, B1, B2, B3, C1, C2, C3`
(Rows A–C, Columns 1–3)

**Formal Pattern (Regex)**
`^\[Place:\s*(A|B|C)(1|2|3)\]$`

**Examples**

- **Valid:** `[Place: B2]` → Places the player's beacon in the center cell.
- **Invalid Examples:**
  - `[place: B2]` → The action token must be capitalized exactly as `Place`.
  - `[Place: D1]` → `D1` is not in the allowed grid range.
  - `[Deploy: A1]` → Invalid action token.
  - `[Place: B2 extra]` → Extra text violates the strict grammar.

All player outputs will later be wrapped in `\boxed{{…}}`. The implementation will extract the internal `[Place: X#]` command and validate it against the above pattern.

---

### 5. Game State Schema

```json
{
  "turn_index": 4,
  "active_player": "A",
  "board": {
    "A1": "Blue",
    "A2": null,
    "A3": "Crimson",
    "B1": "Blue",
    "B2": "Crimson",
    "B3": null,
    "C1": null,
    "C2": null,
    "C3": null
  },
  "player_symbols": {
    "A": "Blue",
    "B": "Crimson"
  },
  "move_history": [
    {"player": "A", "action": "[Place: A1]"},
    {"player": "B", "action": "[Place: A3]"},
    {"player": "A", "action": "[Place: B1]"},
    {"player": "B", "action": "[Place: B2]"}
  ],
  "winner": null,
  "is_draw": false,
  "observations": {
    "A": "Text transcript of latest game state for Alpha",
    "B": "Text transcript of latest game state for Beta"
  },
  "seed": 42
}
```

In this snapshot four valid moves have been played, so `turn_index` is `4` and, because Player A moves on even indices, `active_player` is `"A"`.

---

### 6. Initialization Rules

- `reset(seed)` initializes an empty 3×3 board with all cells `null`.
- The turn index resets to `0` with `active_player = "A"`.
- The same seed always ensures that turn order, board labeling, and any deterministic tie logic behave identically.
- Both players receive an onboarding observation describing:
  - The empty StarGrid layout
  - Their color and symbol
  - Instructions and the legal action syntax

---

### 7. Validation and Error Handling

- Upon receiving a player move, extract the content inside `\boxed{{}}` using `_extract_answer_content`.
- Validate against the regex `^\[Place:\s*(A|B|C)(1|2|3)\]$`.
- Check that the specified cell is unoccupied.
- **Invalid Move Reasons**
  - `"MalformedAction"`: The command does not match the required pattern.
  - `"CellOutOfRange"`: The coordinate is not one of the StarGrid labels (for example `D1`). Because such input also fails the strict regex, this reason needs its own check to be distinguishable from `"MalformedAction"`.
  - `"CellOccupied"`: The target cell is already taken.
  - `"NotYourTurn"`: The player attempts to act out of sequence, for example during the opponent's turn or after the game has ended.

The environment calls `set_invalid_move(reason)` with a human-readable reason. Handling remains deterministic: the turn is forfeited or the game is handled as a draw, according to policy.

---

### 8. Terminal Conditions and Scoring

**Checks run each turn, immediately after a valid beacon is placed:**

1. **Victory Check** – If the current player's beacons form any of the eight winning line patterns, set `winner = active_player` and terminate the game.
2. **Draw Check** – If no empty cells remain and no winner exists, set `is_draw = true`.
3. **Scoring** –
   - Win: `+1` score for the winner, `0` for the loser.
   - Draw: `0.5` each as tie credit (for potential series mode).

**Tie-Break Procedure**
A single placement can complete more than one line at once (for example a row and a diagonal through the same cell), but all such lines belong to the active player, so the winner is never ambiguous. In that case the first alignment detected in the fixed check order is reported deterministically. Both players can never win on the same turn.

---

### 9. Player Prompt Specification

Each player receives a structured prompt reflecting the current board and legal moves.

**Prompt Outline**

> **Identity Blurb:**
> You are a star navigator placing energy beacons on a galactic grid. Each cell you claim radiates your color's energy. The goal is to align three of your beacons in a line before the opponent.
>
> **Current Board State:**
> - Display a 3×3 grid with coordinates and current occupancy.
>
> **Your Color:** Blue or Crimson
>
> **Turn Information:** Which player moves next (`Navigator Alpha` or `Navigator Beta`)
>
> **Allowed Actions:**
> Format: `[Place: <cell_id>]`, where `<cell_id>` ∈ {A1,…,C3} and the cell must be empty.
> You must wrap your selected action inside `\boxed{{}}` at the end of your message.
>
> **Response Format:**
> You may reason about your move, then output your final choice within `\boxed{{}}`.

**Few-Shot Examples**

```
Example valid response:
I will claim the center of the grid to control diagonals.
\boxed{{[Place: B2]}}

Example invalid response:
I think I'll move now.
\boxed{{[Move: B2]}}   ← "Move" not a valid token.
```

The function `_extract_answer_content(self, action: str) -> str` will remove the `\boxed{{}}` wrapper and yield `[Place: X#]` for validation.

---

### 10. API Mapping Plan

- **`reset(seed)`**
  - Sets the initial empty `board` and `turn_index=0`, and seeds the RNG for determinism.
  - Returns the initial observations for both players (Navigator Alpha and Navigator Beta).
- **`step(player_action)`**
  - Extracts the action token with `_extract_answer_content`.
  - Validates syntax and target cell availability.
  - Updates `board`, appends to `move_history`, and increments `turn_index`.
  - After the update, executes the terminal checks (victory or draw).
  - Produces new observations describing the updated board state.
- **`_generate_player_prompt(player_id)`**
  - Compiles a textual description of the board, current scores, and open cells.
  - Lists the permitted `[Place: <cell_id>]` choices.
  - Concludes with the directive: *Put your final answer within \boxed{{}} at the end of your response.*

All actions and resultant board states are deterministic given identical seeds and action sequences.

---

### 11. Copy-Check Against the Example

- The environment, terminology, and objective are **entirely original**.
- There is **no negotiation**, **no trading**, **no resource exchange**, and **no alignment with any bargaining mechanics** from the example environment.
- Entities ("Navigator Alpha/Beta," "energy beacons," "StarGrid") and game state keys (`board`, `player_symbols`, `move_history`, etc.) are unique to this design.
- The theme is cosmic grid conquest, **not** any prior example domain.

---

**End of Design Document – “StarGrid Duel”**
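
**Illustrative sketch: validation and victory check**

The validation steps in Section 7 and the victory check in Section 8 could be sketched in Python roughly as follows. This is a minimal illustration, not the environment's actual implementation: `validate_action`, `find_winning_line`, `SHAPE_RE`, and the check order in `WIN_LINES` are hypothetical names and choices introduced here. It also assumes that the command has already been extracted from its `\boxed{{}}` wrapper and that `CellOutOfRange` is detected with a looser shape pattern before falling back to `MalformedAction`.

```python
import re
from typing import Dict, Optional, Tuple

# Strict grammar from Section 4.
ACTION_RE = re.compile(r"^\[Place:\s*(A|B|C)(1|2|3)\]$")

# Hypothetical looser pattern (not in the design document): same shape, any
# letter/digit pair. Used only to tell CellOutOfRange from MalformedAction.
SHAPE_RE = re.compile(r"^\[Place:\s*([A-Z][0-9])\]$")

CELLS = [row + col for row in "ABC" for col in "123"]

# The eight winning lines in an assumed fixed order: rows, columns, diagonals.
WIN_LINES = [
    ("A1", "A2", "A3"), ("B1", "B2", "B3"), ("C1", "C2", "C3"),
    ("A1", "B1", "C1"), ("A2", "B2", "C2"), ("A3", "B3", "C3"),
    ("A1", "B2", "C3"), ("A3", "B2", "C1"),
]

Board = Dict[str, Optional[str]]


def validate_action(command: str, board: Board) -> Tuple[Optional[str], Optional[str]]:
    """Return (cell, None) for a legal placement, else (None, reason)."""
    # fullmatch also rejects a trailing newline, which `$` alone would tolerate.
    match = ACTION_RE.fullmatch(command)
    if match is None:
        if SHAPE_RE.fullmatch(command):
            return None, "CellOutOfRange"
        return None, "MalformedAction"
    cell = match.group(1) + match.group(2)
    if board[cell] is not None:
        return None, "CellOccupied"
    return cell, None


def find_winning_line(board: Board, color: str) -> Optional[Tuple[str, str, str]]:
    """Return the first line in WIN_LINES fully held by `color`, or None."""
    for line in WIN_LINES:
        if all(board[cell] == color for cell in line):
            return line
    return None


# Usage: the board from the Section 5 snapshot, with Navigator Alpha to move.
board: Board = {cell: None for cell in CELLS}
board.update({"A1": "Blue", "A3": "Crimson", "B1": "Blue", "B2": "Crimson"})

print(validate_action("[Place: D1]", board))  # (None, 'CellOutOfRange')
print(validate_action("[Place: B2]", board))  # (None, 'CellOccupied')

cell, reason = validate_action("[Place: C1]", board)
if reason is None:
    board[cell] = "Blue"
print(find_winning_line(board, "Blue"))  # ('A1', 'B1', 'C1')
```

Returning the first match from a fixed list is one simple way to make the tie-break in Section 8 deterministic; any other fixed order would serve equally well.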