# **Game Design Document: “Crystal Grid”**
---
## 1. Concept Paragraph
“**Crystal Grid**” is a deterministic, turn-based strategy duel in which two mystic architects compete to align charged crystals on a sacred 3×3 energy grid. Each player channels a unique elemental charge, Solar (Player 1) or Lunar (Player 2), and infuses crystals into empty nodes. The goal is to align three of one's crystals horizontally, vertically, or diagonally to create a stable energy conduit. The only available action is placing a crystal at a chosen grid coordinate. This design is entirely unrelated to any negotiation or trading game; it's a pure positional alignment challenge grounded in an original magical circuit theme.
---
## 2. Roles and Win Condition
- **Players**:
  - Player 1: **Solar Architect** (mark → `S`)
  - Player 2: **Lunar Architect** (mark → `L`)
- **Objective**: Align three of one's crystals (`S` or `L`) in a row, column, or diagonal.
- **Win Condition**:
  - A player **wins** if, after their turn, three of their marks form an unbroken line.
  - The opposing player **loses** immediately.
  - If all nine nodes are filled and no line is formed, the outcome is a **draw**.

Decision rules:
1. Check alignment after every placement.
2. If both align simultaneously (impossible under alternation), the current mover's alignment takes precedence.
3. A draw is declared if the maximum of 9 turns is reached without a win.
---
## 3. Turn Structure and Determinism
- Players alternate strictly (`Solar → Lunar → Solar → ...`).
- Solar moves first by default.
- The environment is **fully deterministic**.
- The seed affects only which player moves first, and only when the optional seed-based first-mover rule is enabled; otherwise the order is fixed. Passing an explicit seed keeps that choice reproducible.
- Turn limit: **9** turns maximum.
- All outcomes reproducible with identical initial seed and move sequence.
---
## 4. Action Grammar (Machine-Parseable)
**Allowed Action Tokens**:
Each player may perform:
```
[Place: <row>,<col>]
```
where:
- `<row>` and `<col>` ∈ `{1,2,3}`
- Positions are 1-indexed (top-left is 1,1; bottom-right is 3,3)
- No repetition; cannot place where a crystal already exists.
**Formal Pattern (regex)**:
```
^\[Place:\s*([1-3]),\s*([1-3])\]$
```
**Example valid action**:
```
[Place: 2,3]
```
Reason valid: Correct token keyword, valid 1-3 coordinates, comma-separated.
**Example invalid action #1**:
```
[Place: 0,3]
```
Reason invalid: Row out of allowed range (must be 1-3).
**Example invalid action #2**:
```
[Play: 2,3]
```
Reason invalid: Token keyword incorrect (`Play` not recognized).
In actual play, players will respond with `\boxed{{[Place: 2,3]}}` formatting, but the validation operates on the unboxed content.
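
The grammar above can be sketched in Python as follows. The regex is copied from this section; `unbox` is a hypothetical stand-in for whatever the environment uses to strip the `\boxed{{...}}` wrapper, and it assumes the last boxed span in a message is the intended answer.

```python
import re
from typing import Optional, Tuple

# Pattern copied verbatim from the "Formal Pattern" above.
ACTION_RE = re.compile(r"^\[Place:\s*([1-3]),\s*([1-3])\]$")

# Hypothetical helper pattern (an assumption, not part of the spec):
# accepts \boxed{...} as well as the doubled-brace form \boxed{{...}}.
BOXED_RE = re.compile(r"\\boxed\{+(.*?)\}+", re.DOTALL)


def unbox(message: str) -> Optional[str]:
    """Return the content of the last \\boxed span in the message, or None."""
    matches = BOXED_RE.findall(message)
    return matches[-1].strip() if matches else None


def parse_action(text: str) -> Optional[Tuple[int, int]]:
    """Return (row, col), both 1-indexed, or None if the text is not a valid token."""
    match = ACTION_RE.match(text.strip())
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_action("[Place: 2,3]"))  # (2, 3)
print(parse_action("[Place: 0,3]"))  # None (row out of range)
print(parse_action("[Play: 2,3]"))   # None (wrong keyword)
print(unbox(r"I take the edge. \boxed{{[Place: 2,3]}}"))  # [Place: 2,3]
```

Because the character class `[1-3]` is part of the pattern, an out-of-range coordinate fails at the syntax level rather than at a separate range check.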
---
## 5. Game State Schema
Example runtime structure:
```json
{
  "turn_count": 4,
  "current_player": "Solar",
  "grid": [
    ["S", "L", null],
    [null, "S", null],
    [null, null, "L"]
  ],
  "available_cells": [[1,3],[2,1],[2,3],[3,1],[3,2]],
  "winner": null,
  "is_terminal": false,
  "observations": {
    "Solar": "Previous move: [Place: 3,3] by Lunar.",
    "Lunar": "Your opponent placed [Place: 2,2]."
  },
  "history": [
    "Solar → [Place: 1,1]",
    "Lunar → [Place: 1,2]",
    "Solar → [Place: 2,2]",
    "Lunar → [Place: 3,3]"
  ],
  "seed": 42,
  "score": {
    "Solar": 0,
    "Lunar": 0
  }
}
```
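
In the schema, `available_cells` is redundant with `grid`, so one option is to derive it after every placement instead of maintaining it by hand. The sketch below assumes the grid is held as a 3×3 list of lists with `None` for empty nodes (what `json.loads` produces for `null`); the function name is illustrative.

```python
from typing import List, Optional

Grid = List[List[Optional[str]]]


def available_cells(grid: Grid) -> List[List[int]]:
    """Return the 1-indexed [row, col] pair of every empty node, row by row."""
    return [
        [r + 1, c + 1]
        for r in range(3)
        for c in range(3)
        if grid[r][c] is None
    ]


# Grid taken from the example state above.
grid = [
    ["S", "L", None],
    [None, "S", None],
    [None, None, "L"],
]
print(available_cells(grid))  # [[1, 3], [2, 1], [2, 3], [3, 1], [3, 2]]
```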
---
## 6. Initialization Rules
- The grid initializes to all `null` (empty).
- `current_player` set to `"Solar"` unless overridden by seed-based first mover rule.
- `winner` set to `null`, `turn_count = 0`.
- The `seed` controls only the optional first-mover selection; the grid itself always starts empty.
- Initial observations describe the empty grid and player symbols:
```
The Crystal Grid is empty. You are Solar Architect (symbol S).
Your charge begins first.
```
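
A minimal sketch of these initialization rules, assuming the state is a plain dictionary shaped like the schema in Section 5. The `seeded_first_mover` flag is a hypothetical name for the optional seed-based rule, and the second player's opening observation simply omits the "begins first" line, since its wording is not specified here.

```python
import random
from typing import Any, Dict, Optional

SYMBOLS = {"Solar": "S", "Lunar": "L"}


def initial_state(seed: Optional[int] = None,
                  seeded_first_mover: bool = False) -> Dict[str, Any]:
    """Build a fresh game state; Solar starts unless the seeded rule is enabled."""
    players = list(SYMBOLS)
    # A private Random instance makes the choice reproducible for a given seed.
    # With seed=None the choice is not reproducible, so pass an explicit seed.
    first = random.Random(seed).choice(players) if seeded_first_mover else "Solar"

    observations = {}
    for player in players:
        text = (f"The Crystal Grid is empty. "
                f"You are {player} Architect (symbol {SYMBOLS[player]}).")
        if player == first:
            text += "\nYour charge begins first."
        observations[player] = text

    return {
        "turn_count": 0,
        "current_player": first,
        "grid": [[None] * 3 for _ in range(3)],  # three independent rows
        "available_cells": [[r, c] for r in range(1, 4) for c in range(1, 4)],
        "winner": None,
        "is_terminal": False,
        "observations": observations,
        "history": [],
        "seed": seed,
        "score": {"Solar": 0, "Lunar": 0},
    }


state = initial_state(seed=42)
print(state["current_player"])        # Solar
print(state["observations"]["Solar"])
```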
---
## 7. Validation and Error Handling
**Invalid move triggers**:
1. **Pattern mismatch**: the action doesn't match the regex → “Action format not recognized.”
2. **Cell out of range**: row/col outside 1-3 → “Coordinates must be between 1 and 3.”
3. **Cell occupied**: attempted placement on a non-null grid cell → “That node already holds a crystal.”
4. **Not your turn**: action attempted by the wrong player → “It is not your turn.”

When one is detected, the environment returns `set_invalid_move(player_id, reason)`.
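
The four triggers can be sketched as a single function that returns the first applicable message, checked in the order listed. One caveat: the strict regex from Section 4 already rejects digits outside 1-3, so an input like `[Place: 0,3]` would be reported as a pattern mismatch and trigger 2 would never fire. This sketch therefore assumes a looser syntactic pattern (`LOOSE_RE`, a hypothetical name) followed by an explicit range check; that choice is an assumption, not something this document prescribes.

```python
import re
from typing import List, Optional

Grid = List[List[Optional[str]]]

# Assumption: accept any integers syntactically so the range check is reachable.
LOOSE_RE = re.compile(r"^\[Place:\s*(-?\d+),\s*(-?\d+)\]$")


def invalid_reason(grid: Grid, current_player: str,
                   player_id: str, action: str) -> Optional[str]:
    """Return the error message for an illegal move, or None if the move is legal."""
    match = LOOSE_RE.match(action.strip())
    if match is None:
        return "Action format not recognized."
    row, col = int(match.group(1)), int(match.group(2))
    if not (1 <= row <= 3 and 1 <= col <= 3):
        return "Coordinates must be between 1 and 3."
    if grid[row - 1][col - 1] is not None:
        return "That node already holds a crystal."
    if player_id != current_player:
        return "It is not your turn."
    return None


grid = [
    ["S", "L", None],
    [None, "S", None],
    [None, None, "L"],
]
print(invalid_reason(grid, "Solar", "Solar", "[Play: 2,3]"))   # Action format not recognized.
print(invalid_reason(grid, "Solar", "Solar", "[Place: 0,3]"))  # Coordinates must be between 1 and 3.
print(invalid_reason(grid, "Solar", "Solar", "[Place: 1,1]"))  # That node already holds a crystal.
print(invalid_reason(grid, "Solar", "Lunar", "[Place: 2,3]"))  # It is not your turn.
print(invalid_reason(grid, "Solar", "Solar", "[Place: 2,3]"))  # None
```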
---
## 8. Terminal Conditions and Scoring
**Check each turn**:
1. After placement, verify if that player occupies any of the 8 winning lines:
   - Rows: `(r,1)-(r,2)-(r,3)` for each row `r` in 1-3
   - Columns: `(1,c)-(2,c)-(3,c)` for each column `c` in 1-3
   - Diagonals: `(1,1)-(2,2)-(3,3)` and `(1,3)-(2,2)-(3,1)`
2. If true → `is_terminal = True`, `winner = current_player`.
3. If `turn_count == 9` and no winner → `is_terminal = True`, `winner = "draw"`.
**Scoring convention**:
- Win → +1 score for winner, 0 for loser.
- Draw → 0.5 each.
**Tie-break**:
- None required beyond draw detection.
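
A sketch of the terminal check and scoring convention, assuming 0-indexed grid access internally and the same `"draw"` sentinel used above. The names `check_outcome` and `score` are illustrative.

```python
from typing import Dict, List, Optional, Tuple

Grid = List[List[Optional[str]]]
MARKS = {"Solar": "S", "Lunar": "L"}

# The 8 winning lines as 0-indexed (row, col) triples: 3 rows, 3 columns, 2 diagonals.
WINNING_LINES: List[List[Tuple[int, int]]] = (
    [[(r, c) for c in range(3)] for r in range(3)]
    + [[(r, c) for r in range(3)] for c in range(3)]
    + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]
)


def check_outcome(grid: Grid, mover: str, turn_count: int) -> Optional[str]:
    """Return the mover's name on a win, "draw" on a full grid, else None."""
    mark = MARKS[mover]
    if any(all(grid[r][c] == mark for r, c in line) for line in WINNING_LINES):
        return mover
    if turn_count == 9:
        return "draw"
    return None


def score(outcome: str) -> Dict[str, float]:
    """Win: 1 for the winner and 0 for the loser. Draw: 0.5 each."""
    if outcome == "draw":
        return {player: 0.5 for player in MARKS}
    return {player: 1.0 if player == outcome else 0.0 for player in MARKS}


won = [["S", "L", "L"], [None, "S", None], [None, None, "S"]]
print(check_outcome(won, "Solar", 5))  # Solar
print(score("Solar"))                  # {'Solar': 1.0, 'Lunar': 0.0}
```

Checking for a line before checking the turn count matters: a winning ninth placement is then scored as a win rather than a draw.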
---
## 9. Player Prompt Specification
When `_generate_player_prompt` runs, it assembles structured text along these lines:
**Prompt Outline:**
- Identity blurb:
“You are a mystic architect competing on the Crystal Grid. Align three of your charged crystals before your opponent does.”
- Display current board in a human-readable format with coordinates.
- List allowed actions summary:
```
Allowed Action: [Place: row,col]
where row and col are integers in {1,2,3}.
```
- Explain win conditions and turn rules.
- Example response instructions, including final boxed action.
**Example snippets shown in prompt:**
```
Example valid response:
I will charge the central node for structural balance.
\boxed{{[Place: 2,2]}}
```
If an invalid attempt occurs:
```
Invalid example (do not use):
\boxed{{[Play: 2,2]}} <-- token must be [Place: ...]
```
**Response format reminder**:
“At the end of your message, put your final answer within \boxed{{}} using one allowed action.”
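
As a rough illustration of part of this outline, the sketch below renders the board with coordinates and assembles the identity blurb, action summary, and format reminder. The board layout (a column header and `.` for empty nodes) is an assumption, since the outline does not fix one, and `build_prompt` is a simplified stand-in for `_generate_player_prompt` that omits the win-condition text and few-shot examples.

```python
from typing import List, Optional

Grid = List[List[Optional[str]]]
SYMBOLS = {"Solar": "S", "Lunar": "L"}


def render_board(grid: Grid) -> str:
    """Render the grid with 1-indexed row and column labels; '.' marks an empty node."""
    lines = ["    1 2 3"]
    for r, row in enumerate(grid, start=1):
        cells = " ".join(cell if cell is not None else "." for cell in row)
        lines.append(f"{r}   {cells}")
    return "\n".join(lines)


def build_prompt(player: str, grid: Grid) -> str:
    """Assemble a reduced prompt: blurb, board, allowed action, and reminder."""
    parts = [
        "You are a mystic architect competing on the Crystal Grid. "
        "Align three of your charged crystals before your opponent does.",
        f"You are the {player} Architect (symbol {SYMBOLS[player]}).",
        "Current grid (rows down, columns across, '.' = empty node):",
        render_board(grid),
        "Allowed Action: [Place: row,col]",
        "where row and col are integers in {1,2,3}.",
        # Reminder text copied as written in this section.
        r"At the end of your message, put your final answer within "
        r"\boxed{{}} using one allowed action.",
    ]
    return "\n".join(parts)


grid = [
    ["S", "L", None],
    [None, "S", None],
    [None, None, "L"],
]
print(build_prompt("Solar", grid))
```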
---
## 10. API Mapping Plan
**reset()**
- Clears grid and logs, initializes player identifiers and seed.
- Returns initial `game_state`, first `observations`, and references to legal actions.
- No winner or score yet.
**step(player_action)**
- Extracts action via `_extract_answer_content()`.
- Validates syntax and legality.
- If valid: updates grid, increments turn count, checks for winner/draw.
- Updates `observations` to include a transcript of both players' actions.
- Returns updated `game_state`, new observation for next player, and completion flag.
**_generate_player_prompt(player_id)**
- Constructs full textual prompt described above with player perspective.
- Reads `game_state`: grid, turn count, player symbols, etc.
- Inserts few-shot examples of valid/invalid moves.
- Ends with explicit directive to produce `\boxed{{[Place: <row>,<col>]}}`.
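
To make the `reset()` / `step()` contract concrete, here is a compact, self-contained sketch. `CrystalGridSketch` is a hypothetical class, not the project's actual environment: it inlines answer extraction instead of calling `_extract_answer_content()`, reports invalid moves by returning the error text rather than calling `set_invalid_move`, and always lets Solar move first.

```python
import re
from typing import Any, Dict, Optional, Tuple

ACTION_RE = re.compile(r"^\[Place:\s*([1-3]),\s*([1-3])\]$")
BOXED_RE = re.compile(r"\\boxed\{+(.*?)\}+", re.DOTALL)  # assumed unboxing rule
MARKS = {"Solar": "S", "Lunar": "L"}
LINES = (
    [[(r, c) for c in range(3)] for r in range(3)]
    + [[(r, c) for r in range(3)] for c in range(3)]
    + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]
)


class CrystalGridSketch:
    """Illustrative stand-in for the environment described in this plan."""

    def __init__(self, seed: Optional[int] = None) -> None:
        self.seed = seed
        self.state: Dict[str, Any] = {}

    def reset(self) -> Tuple[Dict[str, Any], Dict[str, str]]:
        """Clear the grid and history; return the state and first observations."""
        self.state = {
            "turn_count": 0,
            "current_player": "Solar",
            "grid": [[None] * 3 for _ in range(3)],
            "winner": None,
            "is_terminal": False,
            "history": [],
            "seed": self.seed,
            "observations": {p: "The Crystal Grid is empty." for p in MARKS},
        }
        return self.state, self.state["observations"]

    def step(self, player_action: str) -> Tuple[Dict[str, Any], str, bool]:
        """Apply one boxed action; return (state, observation, completion flag)."""
        s = self.state
        if s["is_terminal"]:
            return s, "The game is already over.", True  # message is an assumption
        boxed = BOXED_RE.findall(player_action)
        match = ACTION_RE.match(boxed[-1].strip()) if boxed else None
        if match is None:
            return s, "Action format not recognized.", False
        row, col = int(match.group(1)), int(match.group(2))
        if s["grid"][row - 1][col - 1] is not None:
            return s, "That node already holds a crystal.", False

        mover = s["current_player"]
        mark = MARKS[mover]
        s["grid"][row - 1][col - 1] = mark
        s["turn_count"] += 1
        s["history"].append(f"{mover} → [Place: {row},{col}]")

        # Win check comes before the draw check so a winning ninth move counts.
        if any(all(s["grid"][r][c] == mark for r, c in line) for line in LINES):
            s["is_terminal"], s["winner"] = True, mover
        elif s["turn_count"] == 9:
            s["is_terminal"], s["winner"] = True, "draw"

        other = "Lunar" if mover == "Solar" else "Solar"
        s["current_player"] = other
        s["observations"][other] = f"Your opponent placed [Place: {row},{col}]."
        return s, s["observations"][other], s["is_terminal"]
```

Replaying the same move sequence against a fresh `reset()` yields the same final state, which is the reproducibility property Section 3 asks for.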
---
## 11. Copy-Check Against the Example
All entities, vocabulary, and schema in “**Crystal Grid**” are fully original:
- Theme: mystical crystal alignment, not trading, negotiation, or pricing.
- Roles: Solar and Lunar Architects, not merchants or diplomats.
- Objective: spatial pattern alignment, not bargaining success.
- Keys (`grid`, `available_cells`, `seed`, `history`, etc.) are unique to this design.
- Actions: `[Place: row,col]` tokens, entirely distinct from any negotiation-style `[Offer]` or `[Accept]`.
This confirms full conceptual independence from the previous example while preserving structure needed for deterministic, turn-based API implementation.