# Game Design Document: **Labyrinth Conquest**

---

## 1. Concept Paragraph

**Game Concept:**
*Labyrinth Conquest* is a **turn-based, deterministic grid-navigation strategy game** for two players competing to retrieve a relic hidden within a shifting labyrinth. Each player commands an **Explorer**, represented by a marker on a square grid of tiles. The labyrinth contains walls, traps, and hazards that limit movement but are fully known to both players. Players alternate turns choosing actions to **Move**, **Rotate Tiles**, or **Activate Gadgets** in order to reach the central **Relic Tile** first. This design is **entirely original and unrelated to negotiation or trade-based gameplay**. The environment's challenge lies in spatial reasoning and path optimization.

---

## 2. Roles and Win Condition

**Roles:**
- **Player A** and **Player B** each control a distinct Explorer starting from opposite corners of the labyrinth.
- Both can observe the entire labyrinth state at all times.

**Win Condition:**
- The first player to move their Explorer onto the **Relic Tile** wins the game immediately (`winner = current_player`).
- If neither player reaches the relic within the turn limit (40 turns per player; see Section 3), the winner is the player **closest (by Manhattan distance)** to the relic.
- If both are equidistant, the result is declared a **Draw**.

---

## 3. Turn Structure and Determinism

- Players alternate turns strictly: Player A → Player B → Player A → …
- Each turn consists of **one valid action**.
- Determinism is ensured by:
  - Fixed grid layout and trap positions controlled by RNG seed.
  - Any randomized initial layout generation uses the provided `seed` for exact reproducibility.
- Maximum turn limit: **40 turns per player** (80 total).
- Game ends immediately if a terminal condition is met.

---

## 4. Action Grammar (Machine-Parseable)

### Action Types:
Players may issue exactly one of the following tokens per turn, enclosed in `\boxed{{}}` during play.

---

#### 1. **[Move: <direction>]**
- Moves the player’s Explorer one tile in a cardinal direction if no wall blocks the path.
- `<direction>` ∈ {`N`, `S`, `E`, `W`}

**Regex:**
`^\[Move: (N|S|E|W)\]$`

**Example valid:** `[Move: N]`
**Example invalid:** `[Move: north]` → Invalid because the direction must be one of the single uppercase letters `N`, `S`, `E`, `W`.

---

#### 2. **[Rotate: <x>,<y>,<dir>]**
- Rotates the tile at coordinates `(x,y)` one quarter-turn clockwise or counterclockwise.
- `<dir>` ∈ {`CW`, `CCW`}

**Regex:**
`^\[Rotate: [0-9]+,[0-9]+,(CW|CCW)\]$`

**Example valid:** `[Rotate: 2,3,CW]`
**Example invalid:** `[Rotate: x2,3,CW]` → Invalid because each coordinate must consist of digits only.

---

#### 3. **[Activate: <gadget>]**
- Triggers one of the player's available gadgets, such as disarming a trap or shifting a row.
- `<gadget>` ∈ {`Bridge`, `TrapDisarm`, `RowShift`}

**Regex:**
`^\[Activate: (Bridge|TrapDisarm|RowShift)\]$`

**Example valid:** `[Activate: Bridge]`
**Example invalid:** `[Activate: Fly]` → Invalid because `Fly` is not a defined gadget keyword.

---

### Validation Notes:
Only one token per turn is permitted. Spacing, capitalization, and punctuation must **exactly** match these predefined grammars.

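
The three grammars above can be checked mechanically. The following is a minimal sketch, assuming Python; the `ACTION_PATTERNS` table and the `parse_action` helper are hypothetical names, not part of the design. The regexes are the ones listed above, except that capture groups are added around the Rotate coordinates so they can be read back. `fullmatch` is used so that a trailing newline is also rejected.

```python
import re
from typing import Optional, Tuple

# Patterns from the Action Grammar section. The capture groups around the
# Rotate coordinates are an addition for convenience (assumption).
ACTION_PATTERNS = {
    "Move": re.compile(r"^\[Move: (N|S|E|W)\]$"),
    "Rotate": re.compile(r"^\[Rotate: ([0-9]+),([0-9]+),(CW|CCW)\]$"),
    "Activate": re.compile(r"^\[Activate: (Bridge|TrapDisarm|RowShift)\]$"),
}


def parse_action(token: str) -> Optional[Tuple[str, Tuple[str, ...]]]:
    """Return (action_type, args) for a well-formed token, else None."""
    for name, pattern in ACTION_PATTERNS.items():
        # fullmatch requires the whole string to match, so extra whitespace
        # or a trailing newline makes the token invalid.
        match = pattern.fullmatch(token)
        if match:
            return name, match.groups()
    return None
```
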
---

## 5. Game State Schema

```json
{
  "grid_size": 5,
  "tiles": [
    ["startA", "wall", "trap", "floor", "floor"],
    ["floor", "floor", "wall", "trap", "floor"],
    ["floor", "wall", "relic", "floor", "floor"],
    ["floor", "trap", "floor", "wall", "floor"],
    ["floor", "floor", "floor", "floor", "startB"]
  ],
  "player_states": {
    "A": {
      "position": [0, 0],
      "gadgets": ["Bridge", "TrapDisarm"],
      "moves_taken": 5,
      "distance_to_relic": 4
    },
    "B": {
      "position": [4, 4],
      "gadgets": ["RowShift"],
      "moves_taken": 4,
      "distance_to_relic": 4
    }
  },
  "turn_number": 9,
  "current_player": "A",
  "seed": 42,
  "action_history": [
    "A: [Move: E]",
    "B: [Rotate: 3,3,CW]",
    "A: [Activate: Bridge]"
  ],
  "winner": null,
  "terminated": false,
  "invalid_reason": null,
  "observations": [
    "Game begins. Players start in opposite corners.",
    "A moved east.",
    "B rotated tile (3,3) clockwise."
  ]
}
```

---

## 6. Initialization Rules

- A seeded RNG (`seed` input at `reset`) controls:
  - Tile placement (`wall`, `trap`, `floor`, `relic`)
  - Starting gadget distributions.
- Starting layout:
  - `startA` at `(0,0)`, `startB` at `(grid_size-1, grid_size-1)`, `relic` at center.
- Each player begins with **2 random gadgets**.
- The first observation announces the initial labyrinth map and coordinates.
- No randomness is used during play, so the game is fully deterministic after reset.

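
One way to realize the seeded initialization above is sketched below. This is an illustration under assumptions, not the environment's actual generator: `generate_layout` is a hypothetical helper, the tile weights are invented for the example, and no check is made that the relic is reachable. The key point is that a private `random.Random(seed)` instance drives every random choice, so the same seed always reproduces the same layout and gadget hands.

```python
import random
from typing import Dict, List, Tuple

GADGETS = ["Bridge", "TrapDisarm", "RowShift"]


def generate_layout(seed: int, grid_size: int = 5) -> Tuple[List[List[str]], Dict[str, List[str]]]:
    """Build a reproducible tile grid and starting gadgets from a seed."""
    rng = random.Random(seed)  # private RNG; global random state is untouched
    # Tile weights are an assumption chosen only for illustration.
    tiles = [
        [rng.choices(["floor", "wall", "trap"], weights=[6, 2, 2])[0]
         for _ in range(grid_size)]
        for _ in range(grid_size)
    ]
    center = grid_size // 2
    tiles[0][0] = "startA"                           # startA at (0,0)
    tiles[grid_size - 1][grid_size - 1] = "startB"   # startB at the opposite corner
    tiles[center][center] = "relic"                  # relic at center
    # Each player begins with 2 random gadgets.
    gadgets = {player: rng.sample(GADGETS, 2) for player in ("A", "B")}
    return tiles, gadgets
```
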
---

## 7. Validation and Error Handling

**Illegal Actions Detected If:**
- The unboxed action string does not match any defined regex pattern → `Reason: "Invalid action format"`
- The target coordinate `(x,y)` is outside the grid → `Reason: "Tile out of bounds"`
- Attempted movement blocked by a wall → `Reason: "Wall blocks path"`
- Gadget already used → `Reason: "Gadget unavailable"`
- Player issues multiple actions or malformed tokens → `Reason: "Multiple or malformed commands"`

When an illegal action is detected, the environment calls `set_invalid_move(player, reason)` and the opponent wins automatically, unless `training_mode` allows a retry.

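
The feasibility checks above could be organized as a single function that returns the reason string, or `None` when the action is legal. The sketch below makes several assumptions that the design does not fix: positions are `[row, col]` with `N` decreasing the row index, a move is "blocked" when the destination tile is `"wall"`, and moving off the grid is reported as `"Tile out of bounds"`. The `invalid_reason` name and its signature are hypothetical.

```python
from typing import List, Optional, Sequence

# Assumed convention: positions are [row, col]; N decreases the row index.
DELTAS = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}


def invalid_reason(
    tiles: List[List[str]],
    position: Sequence[int],
    gadgets: Sequence[str],
    kind: str,
    args: Sequence[str],
) -> Optional[str]:
    """Return the rejection reason for an already-parsed action, or None."""
    size = len(tiles)

    def in_bounds(r: int, c: int) -> bool:
        return 0 <= r < size and 0 <= c < size

    if kind == "Move":
        dr, dc = DELTAS[args[0]]
        r, c = position[0] + dr, position[1] + dc
        if not in_bounds(r, c):
            return "Tile out of bounds"  # assumption: off-grid moves use this reason
        if tiles[r][c] == "wall":
            return "Wall blocks path"
    elif kind == "Rotate":
        if not in_bounds(int(args[0]), int(args[1])):
            return "Tile out of bounds"
    elif kind == "Activate":
        if args[0] not in gadgets:
            return "Gadget unavailable"
    else:
        return "Invalid action format"
    return None
```
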
---

## 8. Terminal Conditions and Scoring

**Terminal Checks Each Turn:**
1. If a player’s new position contains `"relic"`, `winner = current_player`.
2. If `turn_number >= max_turns`, compute `distance_to_relic` for both.
   - Shorter distance → winner.
   - Equal distance → `winner = null`, `draw = True`.
3. If an invalid move occurs, `winner = opponent`.

**Scoring:**
- `Winner`: +1 point
- `Loser`: 0 points
- `Draw`: both get 0.5 points

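
The turn-limit tiebreak and the scoring table can be expressed in a few lines. This is a sketch under assumptions: `manhattan`, `resolve_timeout`, and `scores` are hypothetical helper names, and a draw is represented simply as a `None` winner.

```python
from typing import Dict, Optional, Sequence


def manhattan(a: Sequence[int], b: Sequence[int]) -> int:
    """Manhattan distance between two grid coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def resolve_timeout(
    pos_a: Sequence[int], pos_b: Sequence[int], relic: Sequence[int]
) -> Optional[str]:
    """Winner at the turn limit: the closer player, or None for a draw."""
    dist_a, dist_b = manhattan(pos_a, relic), manhattan(pos_b, relic)
    if dist_a == dist_b:
        return None
    return "A" if dist_a < dist_b else "B"


def scores(winner: Optional[str]) -> Dict[str, float]:
    """Winner gets 1, loser 0; a draw gives both players 0.5."""
    if winner is None:
        return {"A": 0.5, "B": 0.5}
    return {player: (1.0 if player == winner else 0.0) for player in ("A", "B")}
```
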
---

## 9. Player Prompt Specification

Each `_generate_player_prompt` presents the labyrinth, Explorer positions, remaining gadgets, turn count, and explicit action grammar.

**Prompt Outline:**

```
You are an Explorer navigating a shifting labyrinth.
Your goal is to reach the Relic Tile before your opponent by issuing one of the allowed commands.

Available actions (case-sensitive):
- [Move: N|S|E|W] — Move one tile in a direction if no wall blocks the way.
- [Rotate: x,y,CW|CCW] — Rotate the tile at coordinates (x,y).
- [Activate: Bridge|TrapDisarm|RowShift] — Use one of your gadgets (if available).

Current Turn: 9
You are Player A. Opponent is Player B.
Your position: (0,0)
Relic position: (2,2)
Available gadgets: Bridge, TrapDisarm

Respond with exactly one valid action token.
Put your final answer within \boxed{{}} at the end of your response.

Example valid response:
I will move north to progress toward the relic.
\boxed{{[Move: N]}}

Example invalid response:
\boxed{{Move north}} ← Invalid format; must include brackets and colon.
```

---

## 10. API Mapping Plan

### `reset(seed=None)`
- Creates a deterministic labyrinth with walls, traps, relic, and player starts.
- Initializes `game_state` following schema.
- Adds initial observations describing layout and objectives.
- Returns `obs` for both players.

### `step(player_id, action)`
- Extracts content using `_extract_answer_content`.
- Validates action format and feasibility.
- Updates positions, tile orientations, and available gadgets deterministically.
- Appends the action to `action_history` and `observations`.
- Checks terminal conditions; sets `terminated` and `winner` when satisfied.
- Returns updated observation and reward outcomes.

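
The design names `_extract_answer_content` but does not specify its behavior. As a rough illustration of the extraction step, the hypothetical `extract_boxed` helper below pulls the content of the last `\boxed{...}` in a response. It assumes the doubled braces in the prompt outline are template escaping, so that players actually write single braces, and it returns the content unmodified because the grammar requires an exact match.

```python
import re
from typing import Optional

# Matches \boxed{...} where the content contains no braces, which holds for
# every token in the action grammar.
BOXED = re.compile(r"\\boxed\{([^{}]*)\}")


def extract_boxed(response: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in the response, or None."""
    matches = BOXED.findall(response)
    # The last occurrence is used, since the prompt asks for the answer
    # at the end of the response.
    return matches[-1] if matches else None
```
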
### `_generate_player_prompt(player_id)`
- Builds the full text prompt described above, tailored to the player’s view of current state.
- Queries `game_state` for position, gadgets, current turn, and visible grid.
- Appends example output section.

---

## 11. Copy-Check Against the Example

This design features a **completely unique environment**:
- **Theme:** Spatial navigation and puzzle solving (not negotiation or economy).
- **Terminology:** Explorers, relic, labyrinth, tiles, gadgets — none appear in the example.
- **Game mechanics:** Grid movement and tile transformation — unrelated to offers, deals, or trade.
- **State keys:** (`tiles`, `gadgets`, `relic`, `turn_number`, etc.) are original.
- **Prompt text** describes an exploration challenge, not an agreement or exchange.

Hence, *Labyrinth Conquest* satisfies the requirement to be a distinct, self-contained, deterministic, turn-based navigation environment.