
# **Game Design Document: “Crystal Grid”**
---
## 1. Concept Paragraph
“**Crystal Grid**” is a deterministic, turn-based strategy duel in which two mystic architects compete to align charged crystals on a sacred 3×3 energy grid. Each player channels a unique elemental charge, Solar (Player 1) or Lunar (Player 2), and infuses crystals into empty nodes. The goal is to align three of one's crystals horizontally, vertically, or diagonally to create a stable energy conduit. The only available action is placing a crystal at a chosen grid coordinate. This design is entirely unrelated to any negotiation or trading game; it's a pure positional alignment challenge grounded in an original magical circuit theme.
---
## 2. Roles and Win Condition
- **Players**:
  - Player 1: **Solar Architect** (mark → `S`)
  - Player 2: **Lunar Architect** (mark → `L`)
- **Objective**: Align three of one's crystals (`S` or `L`) in a row, column, or diagonal.
- **Win Condition**:
  - A player **wins** if, after their turn, three of their marks form an unbroken line.
  - The opposing player **loses** immediately.
  - If all nine nodes are filled and no line is formed, the outcome is a **draw**.
Decision rules:
1. Check alignment after every placement.
2. If both players were ever aligned at once (impossible under strict alternation), the current mover's alignment takes precedence.
3. A draw is declared if the maximum of 9 turns is reached without a win.
---
## 3. Turn Structure and Determinism
- Players alternate strictly (`Solar → Lunar → Solar → ...`).
- The first mover (Solar) always begins.
- The environment is **fully deterministic**.
- Random seeding affects only which player goes first, and only when that option is enabled; otherwise the first mover is fixed. In either case an explicit seed keeps runs reproducible.
- Turn limit: **9** turns maximum.
- All outcomes reproducible with identical initial seed and move sequence.
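The optional seed-based first-mover rule could be sketched as follows. This is a minimal illustration, assuming a hypothetical `choose_first_player` helper and a hypothetical `randomize_first` flag, neither of which is named in this document; the default keeps Solar first.

```python
import random

def choose_first_player(seed: int, randomize_first: bool = False) -> str:
    """Pick the first mover (hypothetical helper, not part of the spec).

    With randomize_first=False, Solar always begins. Otherwise the choice
    comes from a private RNG seeded with `seed`, so the same seed always
    yields the same first mover.
    """
    if not randomize_first:
        return "Solar"
    rng = random.Random(seed)  # local RNG; global random state is untouched
    return rng.choice(["Solar", "Lunar"])
```

Using a local `random.Random` instance rather than the module-level functions keeps the result independent of any other randomness in the process.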
---
## 4. Action Grammar (Machine-Parseable)
**Allowed Action Tokens**:
Each player may perform:
```
[Place: <row>,<col>]
```
where:
- `<row>` and `<col>` ∈ `{1,2,3}`
- Positions are 1-indexed (top-left is 1,1; bottom-right is 3,3)
- No repetition; cannot place where a crystal already exists.
**Formal Pattern (regex)**:
```
^\[Place:\s*([1-3]),\s*([1-3])\]$
```
**Example valid action**:
```
[Place: 2,3]
```
Reason valid: Correct token keyword, valid 1–3 coordinates, comma-separated.
**Example invalid action #1**:
```
[Place: 0,3]
```
Reason invalid: Row out of allowed range (must be 1–3).
**Example invalid action #2**:
```
[Play: 2,3]
```
Reason invalid: Token keyword incorrect (`Play` is not recognized).
In actual play, players will respond with `\boxed{{[Place: 2,3]}}` formatting, but the validation operates on the unboxed content.
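The grammar above can be exercised with a short parser. This is a sketch only: `ACTION_RE` is the formal pattern from this section, while `BOXED_RE`, `unbox`, and `parse_action` are hypothetical names introduced for illustration and are not the project's `_extract_answer_content()`.

```python
import re
from typing import Optional, Tuple

# Formal pattern from this section, compiled once.
ACTION_RE = re.compile(r"^\[Place:\s*([1-3]),\s*([1-3])\]$")
# Assumed unboxing pattern: accepts \boxed{[...]} as well as \boxed{{[...]}}.
BOXED_RE = re.compile(r"\\boxed\{+\s*(\[[^\]]*\])\s*\}+")

def unbox(response: str) -> Optional[str]:
    """Return the bracketed content of the last boxed answer, if any."""
    matches = BOXED_RE.findall(response)
    return matches[-1] if matches else None

def parse_action(action: str) -> Optional[Tuple[int, int]]:
    """Return (row, col) as 1-indexed ints, or None on a pattern mismatch."""
    m = ACTION_RE.match(action.strip())
    return (int(m.group(1)), int(m.group(2))) if m else None
```

For example, `parse_action("[Place: 2,3]")` yields `(2, 3)`, while both invalid examples above yield `None`.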
---
## 5. Game State Schema
Example runtime structure:
```json
{
  "turn_count": 4,
  "current_player": "Solar",
  "grid": [
    ["S", "L", null],
    [null, "S", null],
    [null, null, "L"]
  ],
  "available_cells": [[1,3],[2,1],[2,3],[3,1],[3,2]],
  "winner": null,
  "is_terminal": false,
  "observations": {
    "Solar": "Previous move: [Place: 3,3] by Lunar.",
    "Lunar": "Your opponent placed [Place: 2,2]."
  },
  "history": [
    "Solar → [Place: 1,1]",
    "Lunar → [Place: 1,2]",
    "Solar → [Place: 2,2]",
    "Lunar → [Place: 3,3]"
  ],
  "seed": 42,
  "score": {
    "Solar": 0,
    "Lunar": 0
  }
}
```
---
## 6. Initialization Rules
- The grid initializes to all `null` (empty).
- `current_player` set to `"Solar"` unless overridden by seed-based first mover rule.
- `winner` set to `null`, `turn_count = 0`.
- The `seed` controls only the optional first-mover selection; it has no effect on grid generation.
- Initial observations describe the empty grid and player symbols:
```
The Crystal Grid is empty. You are Solar Architect (symbol S).
Your charge begins first.
```
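A minimal sketch of these initialization rules, using the keys from the state schema in Section 5. The function name `make_initial_state` is hypothetical, and the wording of the second player's observation ("Your opponent begins first.") is an assumption, since only the first mover's text is given here.

```python
from typing import Any, Dict

MARKS = {"Solar": "S", "Lunar": "L"}

def make_initial_state(seed: int = 42, first_player: str = "Solar") -> Dict[str, Any]:
    """Build the starting game state (hypothetical helper)."""
    observations = {}
    for name, mark in MARKS.items():
        text = f"The Crystal Grid is empty. You are {name} Architect (symbol {mark}).\n"
        # Second-player wording below is assumed, not taken from the spec.
        text += "Your charge begins first." if name == first_player else "Your opponent begins first."
        observations[name] = text
    return {
        "turn_count": 0,
        "current_player": first_player,
        "grid": [[None] * 3 for _ in range(3)],  # independent rows, all empty
        "available_cells": [[r, c] for r in range(1, 4) for c in range(1, 4)],
        "winner": None,
        "is_terminal": False,
        "observations": observations,
        "history": [],
        "seed": seed,
        "score": {"Solar": 0, "Lunar": 0},
    }
```

Building each row with its own list avoids the aliasing bug that `[[None] * 3] * 3` would introduce.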
---
## 7. Validation and Error Handling
**Invalid move triggers**:
1. **Pattern mismatch**: the action doesn't match the regex → “Action format not recognized.”
2. **Cell out of range**: row/col outside 1–3 → “Coordinates must be between 1 and 3.”
3. **Cell occupied**: attempted placement on a non-null grid cell → “That node already holds a crystal.”
4. **Not your turn**: action attempted by the wrong player → “It is not your turn.”
When one of these is detected, the environment returns `set_invalid_move(player_id, reason)`.
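The four triggers can be sketched as a single check that returns the reason string. Note that the strict regex from Section 4 already rejects out-of-range digits, so this sketch assumes a looser, hypothetical `LOOSE_RE` to tell "out of range" apart from "bad format". The function name `validate_move` and the order of the checks are likewise assumptions.

```python
import re
from typing import Optional

ACTION_RE = re.compile(r"^\[Place:\s*([1-3]),\s*([1-3])\]$")
# Assumed looser pattern: right keyword and shape, any integer coordinates.
LOOSE_RE = re.compile(r"^\[Place:\s*(-?\d+),\s*(-?\d+)\]$")

def validate_move(state: dict, player: str, action: str) -> Optional[str]:
    """Return the reason a move is invalid, or None if it is legal."""
    if player != state["current_player"]:
        return "It is not your turn."
    text = action.strip()
    strict = ACTION_RE.match(text)
    if strict is None:
        if LOOSE_RE.match(text):
            return "Coordinates must be between 1 and 3."
        return "Action format not recognized."
    row, col = int(strict.group(1)), int(strict.group(2))
    if state["grid"][row - 1][col - 1] is not None:  # grid is 0-indexed
        return "That node already holds a crystal."
    return None
```

A caller would pass any non-`None` result on as the `reason` argument of `set_invalid_move`.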
---
## 8. Terminal Conditions and Scoring
**Check each turn**:
1. After placement, verify if that player occupies any of the 8 winning lines:
   - Rows: `(r,1)-(r,2)-(r,3)` for each row `r` in 1–3
   - Columns: `(1,c)-(2,c)-(3,c)` for each column `c` in 1–3
   - Diagonals: `(1,1)-(2,2)-(3,3)` and `(1,3)-(2,2)-(3,1)`
2. If true → `is_terminal = True`, `winner = current_player`.
3. If `turn_count == 9` and no winner → `is_terminal = True`, `winner = "draw"`.
**Scoring convention**:
- Win → +1 score for winner, 0 for loser.
- Draw → 0.5 each.
**Tie-break**:
- None required beyond draw detection.
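The terminal check and scoring convention above might look like this in code. It is a sketch under assumptions: `WIN_LINES`, `has_line`, and `resolve` are illustrative names, and coordinates are 0-indexed internally.

```python
from typing import Dict, List, Optional, Tuple

Grid = List[List[Optional[str]]]

# The 8 winning lines as 0-indexed (row, col) triples.
WIN_LINES = (
    [[(r, c) for c in range(3)] for r in range(3)]      # 3 rows
    + [[(r, c) for r in range(3)] for c in range(3)]    # 3 columns
    + [[(i, i) for i in range(3)],                      # main diagonal
       [(i, 2 - i) for i in range(3)]]                  # anti-diagonal
)

def has_line(grid: Grid, mark: str) -> bool:
    """True if `mark` fills at least one winning line."""
    return any(all(grid[r][c] == mark for r, c in line) for line in WIN_LINES)

def resolve(grid: Grid, mover: str, mark: str,
            turn_count: int) -> Tuple[bool, Optional[str], Dict[str, float]]:
    """Return (is_terminal, winner, score) right after `mover` placed `mark`."""
    if has_line(grid, mark):
        score = {"Solar": 0.0, "Lunar": 0.0}
        score[mover] = 1.0
        return True, mover, score
    if turn_count == 9:
        return True, "draw", {"Solar": 0.5, "Lunar": 0.5}
    return False, None, {"Solar": 0.0, "Lunar": 0.0}
```

Only the mover's mark is checked, which matches the rule that alignment is verified after each placement and that the opponent cannot complete a line on the mover's turn.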
---
## 9. Player Prompt Specification
When `_generate_player_prompt` runs, it assembles structured text like the following:
**Prompt Outline:**
- Identity blurb:
“You are a mystic architect competing on the Crystal Grid. Align three of your charged crystals before your opponent does.”
- Display current board in a human-readable format with coordinates.
- List allowed actions summary:
```
Allowed Action: [Place: row,col]
where row and col are integers in {1,2,3}.
```
- Explain win conditions and turn rules.
- Example response instructions, including final boxed action.
**Example snippets shown in prompt:**
```
Example valid response:
I will charge the central node for structural balance.
\boxed{{[Place: 2,2]}}
```
If an invalid attempt occurs:
```
Invalid example (do not use):
\boxed{{[Play: 2,2]}} <-- token must be [Place: ...]
```
**Response format reminder**:
“At the end of your message, put your final answer within \boxed{{}} using one allowed action.”
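One way to assemble such a prompt is with a `str.format` template. The doubled braces in the snippets above look like `str.format` escapes, but that reading is an assumption. `PROMPT_TEMPLATE`, `render_board`, and `build_prompt` are hypothetical names, and the board layout shown is only one possible human-readable format.

```python
from typing import List, Optional

# Doubled braces are str.format escapes and render as single braces.
PROMPT_TEMPLATE = (
    "You are a mystic architect competing on the Crystal Grid. "
    "Align three of your charged crystals before your opponent does.\n"
    "You are the {role} Architect (symbol {mark}).\n\n"
    "{board}\n\n"
    "Allowed Action: [Place: row,col]\n"
    "where row and col are integers in {{1,2,3}}.\n\n"
    "At the end of your message, put your final answer within \\boxed{{}} "
    "using one allowed action."
)

def render_board(grid: List[List[Optional[str]]]) -> str:
    """Render the grid with 1-indexed row and column labels; '.' is empty."""
    header = "    1 2 3"
    rows = [
        f"{r + 1} | " + " ".join(cell or "." for cell in row)
        for r, row in enumerate(grid)
    ]
    return "\n".join([header] + rows)

def build_prompt(grid: List[List[Optional[str]]], role: str, mark: str) -> str:
    """Fill the template for one player's perspective."""
    return PROMPT_TEMPLATE.format(role=role, mark=mark, board=render_board(grid))
```

After formatting, the reminder line contains a literal `\boxed{}` and the action summary contains `{1,2,3}`.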
---
## 10. API Mapping Plan
**reset()**
- Clears grid and logs, initializes player identifiers and seed.
- Returns initial `game_state`, first `observations`, and references to legal actions.
- No winner or score yet.
**step(player_action)**
- Extracts action via `_extract_answer_content()`.
- Validates syntax and legality.
- If valid: updates grid, increments turn count, checks for winner/draw.
- Updates `observations` to include a transcript of both players' actions.
- Returns updated `game_state`, new observation for next player, and completion flag.
**_generate_player_prompt(player_id)**
- Constructs full textual prompt described above with player perspective.
- Reads `game_state`: grid, turn count, player symbols, etc.
- Inserts few-shot examples of valid/invalid moves.
- Ends with explicit directive to produce `\boxed{{[Place: <row>,<col>]}}`.
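The `reset()`/`step()` flow described above can be sketched as a small class. This is not the project's actual environment: `CrystalGridSketch` is a hypothetical name, `step` here takes already-unboxed action text, observations and scoring are omitted for brevity, and the caller is assumed to stop stepping once `done` is true.

```python
import re
from typing import Optional, Tuple

ACTION_RE = re.compile(r"^\[Place:\s*([1-3]),\s*([1-3])\]$")
MARKS = {"Solar": "S", "Lunar": "L"}

class CrystalGridSketch:
    """Hypothetical minimal environment illustrating reset() and step()."""

    def reset(self, seed: int = 42) -> dict:
        self.state = {
            "turn_count": 0, "current_player": "Solar",
            "grid": [[None] * 3 for _ in range(3)],
            "winner": None, "is_terminal": False,
            "history": [], "seed": seed,
        }
        return self.state

    def _wins(self, mark: str) -> bool:
        g = self.state["grid"]
        lines = [[g[r][c] for c in range(3)] for r in range(3)]    # rows
        lines += [[g[r][c] for r in range(3)] for c in range(3)]   # columns
        lines += [[g[i][i] for i in range(3)],                     # diagonals
                  [g[i][2 - i] for i in range(3)]]
        return any(all(cell == mark for cell in line) for line in lines)

    def step(self, action: str) -> Tuple[dict, Optional[str], bool]:
        """Apply one unboxed action; return (state, error_reason, done)."""
        s = self.state
        m = ACTION_RE.match(action.strip())
        if m is None:
            return s, "Action format not recognized.", s["is_terminal"]
        r, c = int(m.group(1)) - 1, int(m.group(2)) - 1
        if s["grid"][r][c] is not None:
            return s, "That node already holds a crystal.", s["is_terminal"]
        player = s["current_player"]
        s["grid"][r][c] = MARKS[player]
        s["turn_count"] += 1
        s["history"].append(f"{player} → [Place: {r + 1},{c + 1}]")
        if self._wins(MARKS[player]):
            s["is_terminal"], s["winner"] = True, player
        elif s["turn_count"] == 9:
            s["is_terminal"], s["winner"] = True, "draw"
        else:
            s["current_player"] = "Lunar" if player == "Solar" else "Solar"
        return s, None, s["is_terminal"]
```

An invalid action leaves the state untouched, so the same player simply moves again.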
---
## 11. Copy-Check Against the Example
All entities, vocabulary, and schema in “**Crystal Grid**” are fully original:
- Theme: mystical crystal alignment, not trading, negotiation, or pricing.
- Roles: Solar and Lunar Architects, not merchants or diplomats.
- Objective: spatial pattern alignment, not bargaining success.
- Keys (`grid`, `available_cells`, `seed`, `history`, etc.) are unique to this design.
- Actions: `[Place: row,col]` tokens, entirely distinct from any negotiation-style `[Offer]` or `[Accept]`.
This confirms full conceptual independence from the previous example while preserving structure needed for deterministic, turn-based API implementation.