Game Design Document: “Labyrinth Command”


1. Concept Paragraph

“Labyrinth Command” is a deterministic, turn-based two-player tactical maze exploration game. Two rival explorers are trapped inside a grid-shaped labyrinth and must reach the Central Beacon at the maze's heart before their opponent. Each turn, players issue one command from a fixed grammar of movement and interaction tokens (e.g., [Move:North], [Scan], [Wait]). The maze layout, beacon position, and obstacles are generated deterministically from a single seed, ensuring reproducibility. The game is not related to any economic, negotiation, or resource-trading example; its theme focuses purely on spatial logic and exploration within a confined environment.


2. Roles and Win Condition

Roles

  • Explorer A and Explorer B are rival adventurers in identical labyrinth conditions.
  • Both start at distinct, opposite corners of the maze.

Objectives

  • Reach the Central Beacon Cell (B) before the opponent.
  • A secondary scoring system tracks proximity to the Beacon at game end if neither player reaches it within the turn limit.

Win Rule

  1. A player wins immediately if they enter the Beacon cell first.
  2. If both reach simultaneously on the same turn: Draw.
  3. If turn limit expires with no beacon reached: player closer (Manhattan distance) to the Beacon wins.
  4. If both are equally distant: Draw.

3. Turn Structure and Determinism

  • The game proceeds in alternating turns, starting with Explorer A.
  • Each turn = one player action followed by environment update and opponent observation.
  • Turn limit: 20 turns per player (40 total).
  • Maze generation and beacon placement use a seed value set at reset, guaranteeing fully deterministic structure and outcomes for identical seeds.
  • All elements of randomness (e.g., obstacle positions) derive from this same seed.
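
One way to realize the single-seed guarantee is to draw every random choice from a private generator that is seeded once at reset. The sketch below is a minimal illustration under stated assumptions, not the generator this design prescribes: the name generate_blocked_cells and the n_blocked parameter are hypothetical, cells are treated as [row, col] pairs, and no check is made that the Beacon remains reachable.

```python
import random


def generate_blocked_cells(seed, width, height, n_blocked):
    """Return a sorted list of blocked [row, col] cells derived only from `seed`."""
    # A private Random instance keeps results independent of global RNG state.
    rng = random.Random(seed)
    # Start corners and the Beacon cell are never blocked.
    reserved = {(0, 0), (height - 1, width - 1), (height // 2, width // 2)}
    candidates = [
        (r, c)
        for r in range(height)
        for c in range(width)
        if (r, c) not in reserved
    ]
    chosen = rng.sample(candidates, n_blocked)
    return sorted([r, c] for r, c in chosen)
```

Calling the function twice with the same seed and dimensions yields the same list, which is the property the bullets above rely on.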

4. Action Grammar (Machine-Parseable)

Allowed Action Tokens (case-sensitive):

| Token Pattern | Meaning |
| --- | --- |
| [Move:Direction] | Move one cell in a cardinal direction (North, South, East, West) if not blocked. |
| [Scan] | Reveal contents of adjacent cells to update the player's visible map. |
| [Wait] | Skip the move, useful for strategic timing. |

Formal Patterns (Regex-style):

  1. ^\[Move:(North|South|East|West)\]$
  2. ^\[Scan\]$
  3. ^\[Wait\]$

Examples

| Action | Validity | Explanation |
| --- | --- | --- |
| [Move:North] | Valid | Matches move pattern |
| [Scan] | Valid | Matches scan pattern |
| [Wait] | Valid | Matches wait pattern |
| [Move:Northeast] | Invalid | Direction not allowed |
| [move:North] | Invalid | Case-sensitive mismatch |
| [Attack] | Invalid | Unsupported token |
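
The grammar above can be checked with a few lines of Python. This is a sketch only: parse_action and its tuple return shape are hypothetical names chosen for illustration, and fullmatch is used in place of the ^...$ anchors so that a trailing newline is not silently accepted.

```python
import re

# Case-sensitive move pattern; [Scan] and [Wait] are compared literally.
MOVE_RE = re.compile(r"\[Move:(North|South|East|West)\]")


def parse_action(token):
    """Return ("Move", direction), ("Scan", None), ("Wait", None), or None."""
    m = MOVE_RE.fullmatch(token)
    if m:
        return ("Move", m.group(1))
    if token == "[Scan]":
        return ("Scan", None)
    if token == "[Wait]":
        return ("Wait", None)
    return None  # anything else is an invalid token format
```

With this helper, the three invalid examples ([Move:Northeast], [move:North], [Attack]) all return None.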

5. Game State Schema

{
  "seed": 18457,
  "turn_index": 6,
  "max_turns": 40,
  "maze_width": 7,
  "maze_height": 7,
  "beacon_position": [3, 3],
  "cells_blocked": [[0,1],[2,2],[4,5]],
  "player_states": {
    "A": {
      "position": [0,0],
      "visible_map": [["?", "X", "?", "?"],["?", ".", ".", "?"],["?", "?", ".", "?"]],
      "visited_cells": [[0,0],[1,0]],
      "last_action": "[Move:South]"
    },
    "B": {
      "position": [6,6],
      "visible_map": [["?", ".", "?"],[".", ".", "?"],["?", "?", "?"]],
      "visited_cells": [[6,6]],
      "last_action": "[Scan]"
    }
  },
  "transcript": [
    {"player":"A", "action":"[Move:South]"},
    {"player":"B", "action":"[Scan]"}
  ],
  "winner": null,
  "terminated": false
}

6. Initialization Rules

  • Maze layout generated through seeded deterministic algorithm (seed provided or auto-generated).
  • Both players placed:
    • Explorer A → top-left corner [0,0]
    • Explorer B → bottom-right corner [width-1,height-1]
  • Beacon placed at center (width//2, height//2).
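  • visible_map is initialized with limited visibility: it covers only the 3×3 region around the player, and each cell in it is either marked or unknown ("?"), using the symbols shown in the section 5 schema.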
  • At reset, each player receives:
    • Maze dimensions
    • Starting coordinates
    • Number of turns and win condition summary
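
The limited-visibility rule can be sketched as a 3×3 local grid that starts mostly unknown and is filled in by [Scan]. Several details here are assumptions rather than part of the design: the helper names are hypothetical, positions are read as [row, col] with rows increasing southward (inferred from the section 5 example), only the player's own cell is known at reset, "X" marks a blocked cell and "." an open one, and a scan reveals the four orthogonal neighbours while leaving out-of-bounds cells as "?".

```python
def initial_visible_map():
    """3x3 grid centred on the player: own cell open, everything else unknown."""
    grid = [["?"] * 3 for _ in range(3)]
    grid[1][1] = "."
    return grid


def apply_scan(visible, position, blocked, width, height):
    """Reveal the four orthogonal neighbours of `position` in the 3x3 grid."""
    row, col = position
    blocked_set = {tuple(cell) for cell in blocked}
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        # Cells outside the maze stay "?" because there is nothing to reveal.
        if 0 <= r < height and 0 <= c < width:
            visible[1 + dr][1 + dc] = "X" if (r, c) in blocked_set else "."
    return visible
```

Under these assumptions, scanning from [6,6] in the 7×7 example maze reproduces the visible_map shown for Explorer B in the section 5 schema.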

7. Validation and Error Handling

Invalid Move Detection Rules

  • Action not matching one of the defined regex patterns → Invalid token format
  • Action would move explorer outside maze bounds → Move out of bounds
  • Action would move explorer into blocked cell → Cell blocked
  • Any attempt made after terminal state → Game already finished

System calls set_invalid_move(player, reason) upon detection.
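
The state-dependent checks can be sketched as a function that returns the matching reason string, which a caller could then pass to set_invalid_move. The name check_move, the DELTAS table, and the [row, col] reading of positions are assumptions made for illustration; token-format validation is assumed to have happened before this point.

```python
# Row/column offsets per direction, assuming row 0 is the northern edge.
DELTAS = {"North": (-1, 0), "South": (1, 0), "West": (0, -1), "East": (0, 1)}


def check_move(state, player, direction):
    """Return None if the move is legal, else one of the reason strings above."""
    if state["terminated"]:
        return "Game already finished"
    row, col = state["player_states"][player]["position"]
    dr, dc = DELTAS[direction]
    r, c = row + dr, col + dc
    if not (0 <= r < state["maze_height"] and 0 <= c < state["maze_width"]):
        return "Move out of bounds"
    if [r, c] in state["cells_blocked"]:
        return "Cell blocked"
    return None
```

The terminal-state check comes first so that no action of any kind is evaluated once the game has finished.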


8. Terminal Conditions and Scoring

Terminal Triggers

  1. Player enters the Beacon cell → Win for that player.
  2. Both reach Beacon simultaneously → Draw.
  3. Turn limit reached → Compare distance to Beacon.
    • Smaller Manhattan distance → Win.
    • Equal → Draw.

Scoring Computation

  • The winner scores 1 and the loser 0; in a draw, each player scores 0.5.
  • The outcome is stored in the winner key as "A", "B", or "Draw".
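
The terminal triggers and scoring can be sketched as two small functions over the section 5 state. The names resolve and scores are hypothetical, and the sketch assumes that turn_index counts completed actions, so the limit is reached once turn_index is at least max_turns.

```python
def manhattan(a, b):
    """Manhattan distance between two grid cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def resolve(state):
    """Return "A", "B", "Draw", or None if the game should continue."""
    beacon = state["beacon_position"]
    pos = {p: s["position"] for p, s in state["player_states"].items()}
    at_beacon = [p for p in ("A", "B") if pos[p] == beacon]
    if len(at_beacon) == 2:
        return "Draw"
    if at_beacon:
        return at_beacon[0]
    if state["turn_index"] >= state["max_turns"]:
        da, db = manhattan(pos["A"], beacon), manhattan(pos["B"], beacon)
        if da == db:
            return "Draw"
        return "A" if da < db else "B"
    return None


def scores(winner):
    """Map the value of the winner key to per-player scores."""
    if winner == "Draw":
        return {"A": 0.5, "B": 0.5}
    return {p: 1.0 if p == winner else 0.0 for p in ("A", "B")}
```

A non-None result from resolve would be written to winner, with terminated set to true.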

9. Player Prompt Specification

Prompt Content Outline

  • Game title and theme summary
  • Player's identity (Explorer A or B)
  • Current turn number and limits
  • Player's current position, visible map grid, and last known opponent action
  • List of allowable command formats
  • Reminder to place final command inside \boxed{{}}
  • Examples of valid vs invalid formatting

Prompt Example

You are Explorer A navigating the labyrinth. Your goal is to reach the Central Beacon before your rival. 
You can issue ONE command per turn using the following grammar:

[Move:North] | [Move:South] | [Move:East] | [Move:West] | [Scan] | [Wait]

Remember: 
- Moving into blocked walls or out of bounds is invalid.
- The beacon lies at the labyrinth's center.
- You must wrap your command inside \boxed{{}}.

Example valid response:
I want to go north to advance toward the beacon.
\boxed{{[Move:North]}}

Example invalid response:
Let's head northeast.        ← invalid direction keyword

Now it is your turn. Choose your next command carefully.
Put your final answer within \boxed{{}} at the end of your response.

Helper: _extract_answer_content(self, action: str) -> str
Extracts the content enclosed by \boxed{{...}} for validation and execution.
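
A possible shape for this helper is sketched below as a standalone function (no self, no leading underscore). Its behavior beyond the one-line description above is assumed: it takes the last boxed span in the response, tolerates one or more braces around the content, and returns an empty string when nothing is boxed.

```python
import re

# One or more braces on each side, so both {..} and {{..}} are accepted.
BOXED_RE = re.compile(r"\\boxed\{+([^{}]*)\}+")


def extract_answer_content(action):
    """Return the content of the last boxed span in `action`, or "" if none."""
    matches = BOXED_RE.findall(action)
    return matches[-1].strip() if matches else ""
```

Taking the last match lets a player mention a boxed command while reasoning and still have only the final one executed; the returned string can then be checked against the action grammar.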


10. API Mapping Plan

reset()

  • Generate deterministic maze grid based on seed.
  • Initialize all fields of game_state per schema.
  • Return initial observation for each player, including map visibility and rules summary.

step(player_action)

  • Use _extract_answer_content to unwrap the boxed token.
  • Validate with grammar and state constraints.
  • If invalid → call set_invalid_move.
  • If valid → mutate player position/visibility, append to transcript.
  • Perform terminal condition checks after each move; update winner and terminated appropriately.
  • Return resulting state observation and game status.

_generate_player_prompt(player_id)

  • Construct text prompt per section 9.
  • Include available moves, last opponent move, remaining turns, and map details.
  • Append "Put your final answer within \boxed{{}} at the end of your response."
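
A minimal sketch of the prompt builder follows, written as a free function over a section 5 style state dict. The field labels, the COMMANDS constant, and the choice to report remaining turns as the total across both players are illustrative assumptions. The closing reminder is emitted with single braces, which is what a Python format template with escaped braces would render.

```python
COMMANDS = "[Move:North] | [Move:South] | [Move:East] | [Move:West] | [Scan] | [Wait]"


def generate_player_prompt(state, player_id):
    """Build the text prompt for one explorer from the current game state."""
    me = state["player_states"][player_id]
    rival_id = "B" if player_id == "A" else "A"
    rival_last = state["player_states"][rival_id]["last_action"]
    remaining = state["max_turns"] - state["turn_index"]  # total, both players
    map_rows = "\n".join(" ".join(row) for row in me["visible_map"])
    lines = [
        f"You are Explorer {player_id} navigating the labyrinth.",
        f"Position: {me['position']}",
        f"Turns remaining: {remaining}",
        f"Opponent's last action: {rival_last}",
        "Visible map:",
        map_rows,
        f"Allowed commands: {COMMANDS}",
        "Put your final answer within \\boxed{} at the end of your response.",
    ]
    return "\n".join(lines)
```

Keeping the reminder as the last line matches the requirement that it be appended to the prompt.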

11. Copy-Check Against the Example

  • The Labyrinth Command game has an exploration and spatial logic theme, not negotiation, trade, or economy-related.
  • All entities—maze, beacon, blocked cells, and explorers—are original constructs.
  • Action tokens [Move:…], [Scan], [Wait], and state keys (beacon_position, cells_blocked, visible_map) are unique to this design.
  • No resource exchanges, offers, or bargaining are present.

End of Design Document “Labyrinth Command”