# Turn-Based TextArena Design Document: **"GlyphGrid Duel"**

*(Design document for a deterministic, turn-based environment inspired by tic-tac-toe mechanics, but set in a completely original setting, terminology, and data schema.)*

---
## 1. Concept Paragraph

**Game Title:** *GlyphGrid Duel*

In the ancient halls of the Archivists, two rival Scribes compete to inscribe mystical glyphs into a sacred 3×3 grid called the “Runeboard.” The Scribes alternate turns, each etching their signature glyph, **Solar** (`S`) or **Lunar** (`L`), into an empty rune slot. The goal is to align three of one’s own glyphs across a row, column, or diagonal, representing mastery of the grid’s equilibrium energies. Although the core structure echoes tic-tac-toe, *GlyphGrid Duel* is **unrelated to any negotiation, trade, or dialogue-based environment**. It focuses solely on deterministic pattern control, tactical foresight, and spatial reasoning.

---
## 2. Roles and Win Condition

- **Players:** Two players:
  - **Scribe Solar**: uses the glyph `"S"`.
  - **Scribe Lunar**: uses the glyph `"L"`.

- **Objective:** Align three identical glyphs in a straight line across the 3×3 Runeboard.

- **Win Condition:**
  - A player wins immediately upon completing a line (horizontal, vertical, or diagonal) of three of their own glyphs.

- **Loss Condition:**
  - A player loses if the opponent completes a winning line first.

- **Draw Condition:**
  - All nine cells are filled and no completed line exists.
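
The win and draw conditions above can be sketched as a pair of small checks. This is a minimal illustration rather than the environment's actual code: the names `LINES`, `find_winning_glyph`, and `is_draw` are hypothetical, and the board is assumed to be a 3×3 nested list using `"_"` for empty cells, as in the state schema.

```python
from typing import List, Optional

EMPTY = "_"

# The eight winning lines on a 3x3 board, as 0-based (row, col) triples.
LINES = (
    [[(r, c) for c in range(3)] for r in range(3)]      # three rows
    + [[(r, c) for r in range(3)] for c in range(3)]    # three columns
    + [[(i, i) for i in range(3)],                      # main diagonal
       [(i, 2 - i) for i in range(3)]]                  # anti-diagonal
)


def find_winning_glyph(board: List[List[str]]) -> Optional[str]:
    """Return the glyph that fills a whole line, or None if no line is complete."""
    for line in LINES:
        glyphs = {board[r][c] for r, c in line}
        if len(glyphs) == 1 and EMPTY not in glyphs:
            return glyphs.pop()
    return None


def is_draw(board: List[List[str]]) -> bool:
    """A draw means the board is full and no line is complete."""
    full = all(cell != EMPTY for row in board for cell in row)
    return full and find_winning_glyph(board) is None
```
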
---

## 3. Turn Structure and Determinism

- Turns alternate between Scribe Solar (first) and Scribe Lunar (second).
- Each turn consists of one valid placement action onto an empty cell.
- **Turn Limit:** 9, since the grid contains 9 rune slots in total.
- **Determinism:**
  - No random factors after initialization.
  - Solar starts by default. If a randomized starting player is ever enabled, the choice is derived from the seed alone, so calling `reset(seed=x)` with the same seed and replaying the same actions reproduces identical outcomes.

---
## 4. Action Grammar (Machine-Parseable)

### Valid Actions

Each action specifies a cell position in row-column format, using **1-based indexing**.

**Format:**
```
[Etch: <row>, <column>]
```

- `<row>` and `<column>` are integers in `{1, 2, 3}`.
- The cell at `(row, column)` must be unoccupied.

**Regex pattern:**
```
^\[Etch:\s*([1-3]),\s*([1-3])\]$
```
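
As a rough sketch of how this pattern might be applied, the helper below compiles the regex and converts a matching action into 0-based board indices. The function name `parse_etch` and the decision to strip surrounding whitespace before matching are assumptions made for illustration; the document does not specify either.

```python
import re
from typing import Optional, Tuple

# The pattern exactly as given in the action grammar.
ETCH_PATTERN = re.compile(r"^\[Etch:\s*([1-3]),\s*([1-3])\]$")


def parse_etch(action: str) -> Optional[Tuple[int, int]]:
    """Return 0-based (row, col) for a well-formed action, else None."""
    match = ETCH_PATTERN.match(action.strip())  # stripping is an assumption
    if match is None:
        return None
    # The grammar is 1-based; list indices are 0-based.
    return int(match.group(1)) - 1, int(match.group(2)) - 1
```
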
### Examples

| Example Action | Valid? | Reason |
|----------------|--------|--------|
| `[Etch: 1, 3]` | ✅ | Valid coordinates. |
| `[Etch: 3, 1]` | ✅ | Valid coordinates. |
| `[Etch: 4, 2]` | ❌ | Row = 4 out of bounds. |
| `[Etch (2,2)]` | ❌ | Invalid token format (missing colon; parentheses are not part of the grammar). |
| `[Mark: 1, 1]` | ❌ | Invalid verb token; must use “Etch”. |

All player responses must be wrapped in `\boxed{{}}` during gameplay, e.g. `\boxed{{[Etch: 2, 1]}}`.

---
## 5. Game State Schema

Example `game_state` at runtime (illustrative values only):

```json
{
  "runeboard": [
    ["S", "L", "_"],
    ["_", "S", "_"],
    ["L", "_", "_"]
  ],
  "current_player": "Solar",
  "turn_count": 4,
  "winner": null,
  "is_terminal": false,
  "last_action": "[Etch: 3, 1]",
  "observations": {
    "Solar": [
      "Runeboard state after turn 4...",
      "Lunar etched at (3,1)"
    ],
    "Lunar": [
      "Runeboard state after turn 4...",
      "Lunar etched at (3,1)"
    ]
  },
  "player_symbols": {
    "Solar": "S",
    "Lunar": "L"
  },
  "seed": 42
}
```
Keys:
- `runeboard`: Nested list of strings (`"S"`, `"L"`, or `"_"` for empty).
- `current_player`: The player whose turn it is.
- `turn_count`: Number of turns completed.
- `winner`: `"Solar"`, `"Lunar"`, or `null` (no winner yet, or a draw).
- `is_terminal`: Boolean indicating game completion.
- `last_action`: Last validated `[Etch: r, c]`.
- `observations`: Per-player transcript and board updates.
- `player_symbols`: Maps each player to their glyph.
- `seed`: The seed passed to `reset`, stored so a run can be reproduced.
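
One way to make the schema concrete is a typed dictionary plus a small consistency check. The sketch below is illustrative only: `GameState` and `check_consistency` are hypothetical names, `last_action` and `seed` are assumed to be nullable, and the glyph-count rule assumes the default setup in which Solar moves first.

```python
from typing import Dict, List, Optional, TypedDict


class GameState(TypedDict):
    runeboard: List[List[str]]
    current_player: str
    turn_count: int
    winner: Optional[str]
    is_terminal: bool
    last_action: Optional[str]      # assumed to be None before the first move
    observations: Dict[str, List[str]]
    player_symbols: Dict[str, str]
    seed: Optional[int]


def check_consistency(state: GameState) -> List[str]:
    """Return a list of problems found; an empty list means the state is consistent."""
    problems: List[str] = []
    board = state["runeboard"]
    if len(board) != 3 or any(len(row) != 3 for row in board):
        return ["runeboard must be 3x3"]
    cells = [cell for row in board for cell in row]
    allowed = set(state["player_symbols"].values()) | {"_"}
    if any(cell not in allowed for cell in cells):
        problems.append("runeboard contains an unknown symbol")
    if sum(cell != "_" for cell in cells) != state["turn_count"]:
        problems.append("turn_count does not match the number of placed glyphs")
    solar = cells.count(state["player_symbols"]["Solar"])
    lunar = cells.count(state["player_symbols"]["Lunar"])
    # Assumption: Solar moves first, so Solar is level with Lunar or one ahead.
    if solar - lunar not in (0, 1):
        problems.append("glyph counts are impossible when Solar moves first")
    return problems
```
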
---

## 6. Initialization Rules

- On `reset(seed)`, the seed is recorded in `game_state["seed"]`.
- The starting player defaults to **Solar**. An optional rule toggle may change this, in which case the choice is derived from the seed.
- The board resets to all empty cells (`"_"`).
- `turn_count = 0`, `winner = null`, `is_terminal = false`.
- The initial observation describes the empty Runeboard:
  ```
  The Runeboard is empty. Each Scribe may etch a glyph using [Etch: row, col].
  ```
- All randomness (if ever added, e.g. a random first player) must derive solely from the seed.
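
The initialization rules can be sketched as a standalone function. This is a hedged illustration, not the environment's real `reset`: the `randomize_first_player` flag stands in for the optional rule toggle and is a hypothetical name, and `last_action` is assumed to start as `None`. A local `random.Random(seed)` instance keeps every random choice tied to the seed.

```python
import random
from typing import Any, Dict, Optional

PLAYERS = ("Solar", "Lunar")
INITIAL_OBSERVATION = (
    "The Runeboard is empty. Each Scribe may etch a glyph using [Etch: row, col]."
)


def reset(seed: Optional[int] = None,
          randomize_first_player: bool = False) -> Dict[str, Any]:
    """Build a fresh game_state; all randomness comes from the seeded local RNG."""
    rng = random.Random(seed)
    first = rng.choice(PLAYERS) if randomize_first_player else "Solar"
    return {
        # Build each row separately so the rows do not share one list object.
        "runeboard": [["_"] * 3 for _ in range(3)],
        "current_player": first,
        "turn_count": 0,
        "winner": None,
        "is_terminal": False,
        "last_action": None,  # assumption: no action recorded before the first move
        "observations": {player: [INITIAL_OBSERVATION] for player in PLAYERS},
        "player_symbols": {"Solar": "S", "Lunar": "L"},
        "seed": seed,
    }
```
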
---

## 7. Validation and Error Handling

After extracting the inner content via `_extract_answer_content`, the environment validates that:

1. The content matches the regex pattern `^\[Etch:\s*([1-3]),\s*([1-3])\]$`.
2. The target cell is empty (`"_"`).
3. The game is not terminal.
4. The acting player matches `current_player`.

**Invalid Move Reasons** (passed to `set_invalid_move`):
- `"Invalid format: must be [Etch: row, column] with row,col in 1–3."`
- `"Out of bounds: coordinates must be between 1 and 3."`
- `"Cell already occupied."`
- `"Game already ended."`
- `"Not your turn."`
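
A possible shape for these checks is sketched below. Two choices here are assumptions rather than something the document specifies: the order in which the checks run, and the use of a looser pattern (`LOOSE_ETCH`, a hypothetical name) that accepts any digits. The looser pattern is needed because the strict pattern rejects `[Etch: 4, 2]` outright, which would make the out-of-bounds reason indistinguishable from a format error.

```python
import re
from typing import List, Optional

# Assumption: accept any digits first so out-of-bounds can be reported separately.
LOOSE_ETCH = re.compile(r"^\[Etch:\s*([0-9]+),\s*([0-9]+)\]$")

REASON_FORMAT = "Invalid format: must be [Etch: row, column] with row,col in 1–3."
REASON_BOUNDS = "Out of bounds: coordinates must be between 1 and 3."
REASON_OCCUPIED = "Cell already occupied."
REASON_ENDED = "Game already ended."
REASON_TURN = "Not your turn."


def invalid_reason(board: List[List[str]], current_player: str, is_terminal: bool,
                   player: str, content: str) -> Optional[str]:
    """Return the reason a move is invalid, or None if the move is legal."""
    if is_terminal:
        return REASON_ENDED
    if player != current_player:
        return REASON_TURN
    match = LOOSE_ETCH.match(content.strip())
    if match is None:
        return REASON_FORMAT
    row, col = int(match.group(1)), int(match.group(2))
    if not (1 <= row <= 3 and 1 <= col <= 3):
        return REASON_BOUNDS
    if board[row - 1][col - 1] != "_":
        return REASON_OCCUPIED
    return None
```
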
---

## 8. Terminal Conditions and Scoring

At the end of each valid move:

1. **Win Check:**
   - If the current player’s glyphs fill any row, column, or diagonal,
     → `winner = current_player` (the player who just moved), `is_terminal = True`.

2. **Draw Check:**
   - If `turn_count == 9` and `winner == null`,
     → `is_terminal = True`, result = **Draw**.

3. **Scoring:**
   - **Win:** +1 point to the winner; 0 to the loser.
   - **Draw:** 0.5 to both.

4. **Tie Break:**
   - None; draws are final.
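
The scoring rule can be written as a small mapping from the terminal state to per-player rewards. This is a sketch under assumptions: `compute_rewards` is a hypothetical name, and returning `None` for a game still in progress is a choice made here, not something the document specifies.

```python
from typing import Dict, Optional, Sequence


def compute_rewards(winner: Optional[str], is_terminal: bool,
                    players: Sequence[str] = ("Solar", "Lunar")
                    ) -> Optional[Dict[str, float]]:
    """Map the end-of-game result to rewards: 1/0 for a win, 0.5 each for a draw."""
    if not is_terminal:
        return None  # assumption: no rewards are issued mid-game
    if winner is None:
        return {player: 0.5 for player in players}
    return {player: 1.0 if player == winner else 0.0 for player in players}
```
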
---

## 9. Player Prompt Specification

On each turn, `_generate_player_prompt(player_id)` provides:

1. **Identity Blurb:**
   ```
   You are a Scribe competing to master the Runeboard through glyph alignment.
   ```
2. **Rules Summary:**
   - Each player alternately etches one glyph per turn.
   - Wins occur when three identical glyphs align (row, column, or diagonal).
   - If all nine cells are filled without alignment, it’s a draw.
3. **Action Instructions:**
   - Choose one empty cell and etch your glyph.
   - Actions **must** follow the format `[Etch: row, column]`.
   - Place your final choice inside `\boxed{{}}`.

4. **Examples:**
   ```
   Example valid response:
   I will etch at the top right corner.
   \boxed{{[Etch: 1, 3]}}

   Example invalid response:
   \boxed{{[Mark: 1, 3]}}  # Reason: "Mark" is not a valid action.
   ```

5. **Information Provided Each Turn:**
   - Current Runeboard state.
   - Move history (summarized from observations).
   - Which coordinates are still empty.

---
## 10. API Mapping Plan

### **reset(seed)**

- Initializes `game_state` with all keys defined above.
- Creates an empty Runeboard and resets counters.
- Stores the seed for deterministic reproduction.
- Returns the initial observation for both players, describing the empty grid.

### **step(player_id, action)**

- Extracts `content = _extract_answer_content(action)`.
- Validates format and move legality.
- Updates `runeboard`, `turn_count`, `current_player`.
- Checks for a terminal condition (win or draw).
- Records the action in both players’ `observations` lists.
- Returns the updated `game_state`, per-player observations, reward signals, and termination status.

### **_generate_player_prompt(player_id)**

- Produces a textual prompt combining:
  - Role context (Solar/Lunar)
  - Current Runeboard depiction
  - Legal moves list in `[Etch: r, c]` format
  - Reminder of boxed answer format and examples
- Enforces the output rule:
  `"Put your final answer within \boxed{{}} at the end of your response."`
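
To illustrate the prompt assembly, the sketch below builds a prompt from the role, a board depiction, and the list of legal moves. All function names are hypothetical and the wording of the prompt is invented for the example. It also assumes that the doubled braces in `\boxed{{}}` are template escapes, so the text a player actually sees contains `\boxed{}`; the document does not confirm this.

```python
from typing import List

# Assumption: the rendered prompt shows single braces.
BOXED_RULE = "Put your final answer within \\boxed{} at the end of your response."


def legal_moves(board: List[List[str]]) -> List[str]:
    """List every empty cell as a 1-based action string, in row-major order."""
    return [
        f"[Etch: {r + 1}, {c + 1}]"
        for r in range(3)
        for c in range(3)
        if board[r][c] == "_"
    ]


def render_board(board: List[List[str]]) -> str:
    """Depict the Runeboard as three lines of space-separated symbols."""
    return "\n".join(" ".join(row) for row in board)


def generate_player_prompt(player: str, glyph: str, board: List[List[str]]) -> str:
    """Combine role context, board depiction, legal moves, and the output rule."""
    return "\n".join([
        f"You are Scribe {player}, etching the glyph '{glyph}'.",
        "Current Runeboard:",
        render_board(board),
        "Legal moves: " + ", ".join(legal_moves(board)),
        "Actions must follow the format [Etch: row, column].",
        BOXED_RULE,
    ])
```
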
---

## 11. Copy-Check Against the Example

This design:
- **Does not** reference or replicate the negotiation example’s mechanics, dialogue, or resources.
- Uses **completely distinct terminology**: Scribes, Glyphs, Runeboard, Etching.
- Involves **no negotiation, trade, or communication** mechanics.
- Defines an objective (line alignment) wholly original to this document.
- Uses `game_state` keys (`runeboard`, `current_player`, `player_symbols`, etc.) and prompts that are original to *GlyphGrid Duel*.

---

**End of Design Document for “GlyphGrid Duel.”**