Add environment documentation from Openverse builder

Openverse Builder
2001-01-01 00:00:00 +00:00
parent 34dd146b6e
commit 6ab12996e8

environment.md Normal file

@@ -0,0 +1,202 @@
# TextArena Game Design Document: **GlyphGrid Duel**
---
## 1. Concept Paragraph
**GlyphGrid Duel** is a deterministic, turn-based logic match for two players set in an abstract digital arena. The environment is themed around ancient digital runes rather than trade or negotiation, so it is completely distinct from any example scenario. Players alternately inscribe their unique glyphs (`X` or `O`) onto a 3×3 grid, seeking to align three of their symbols vertically, horizontally, or diagonally. Each inscription is expressed as a formatted action token that specifies a grid coordinate. The game's core token is `[Inscribe:x,y]`, where `x` is the row and `y` is the column, each in the range 1–3. The outcome depends solely on player choices; the only randomness is an initial seed that fixes who starts first.
---
## 2. Roles and Win Condition
- **Players**:
  - Player 1 = **Glyph X**
  - Player 2 = **Glyph O**
- **Objective**: Align three of your own glyphs in a straight line: across a row, down a column, or along a diagonal.
- **Winning Rule**:
  A player **wins** immediately upon placing a glyph that completes three identical symbols along any row, column, or diagonal.
- **Losing Rule**:
  A player loses as soon as the opponent completes an alignment.
- **Draw Rule**:
  If the grid is completely filled (9 moves) without any alignment, the game ends in a draw.
---
## 3. Turn Structure and Determinism
- **Turn Order**: Players alternate turns. The starting player is fixed by the reproducible seed (see Section 6).
- **One Action per Turn**: Each turn, the active player must choose one vacant coordinate and inscribe their glyph.
- **Turn Limit**: Maximum 9 turns (since there are 9 cells).
- **Determinism**: Game progression depends solely on player actions. The random seed is used only to determine the initial starting player; once seeded, outcomes are fully reproducible.
---
## 4. Action Grammar (Machine-Parseable)
Each action is enclosed in `<answer></answer>` tags. The content inside the tags must match the following format exactly:
### Allowed Token Pattern
```
[Inscribe:x,y]
```
- **x** = row, an integer from 1 to 3 inclusive
- **y** = column, an integer from 1 to 3 inclusive
### Regular Expression
```
^\[Inscribe:(1|2|3),(1|2|3)\]$
```
### Examples
| Type | Example | Explanation |
|------|----------|-------------|
| ✅ Valid | `[Inscribe:2,3]` | Inscribes the player's glyph at row 2, column 3. |
| ❌ Invalid | `[inscribe:2,3]` | Case-sensitive; “Inscribe” must be capitalized. |
| ❌ Invalid | `[Inscribe:4,1]` | Invalid coordinate; the grid only spans 1–3. |
| ❌ Invalid | `[Inscribe:2,3,1]` | Too many arguments. |
| ❌ Invalid | `[Place:2,2]` | Incorrect token keyword. |
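
The grammar above can be exercised with a small parser. This is a minimal sketch, not the environment's actual code: the helper names `extract_answer` and `parse_action` are hypothetical, and stripping whitespace around the tag content is an assumption the document does not state.

```python
import re
from typing import Optional, Tuple

# Pattern copied from the "Regular Expression" subsection.
ACTION_RE = re.compile(r"^\[Inscribe:(1|2|3),(1|2|3)\]$")
# Hypothetical helper pattern for pulling the payload out of the answer tags.
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def extract_answer(raw: str) -> Optional[str]:
    """Return the text inside the first <answer></answer> pair, or None."""
    match = ANSWER_RE.search(raw)
    # Assumption: surrounding whitespace inside the tags is ignored.
    return match.group(1).strip() if match else None


def parse_action(token: str) -> Optional[Tuple[int, int]]:
    """Return (row, col) as 1-based integers, or None if the token is malformed."""
    # fullmatch rejects a trailing newline, which `$` alone would tolerate.
    match = ACTION_RE.fullmatch(token)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Each invalid row in the table makes `parse_action` return `None`, while `[Inscribe:2,3]` yields the 1-based pair `(2, 3)`.
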
---
## 5. Game State Schema
Example runtime structure:
```json
{
  "turn_count": 4,
  "current_player": "Player 1",
  "seed": 42,
  "board": [
    ["X", "O", ""],
    ["", "X", ""],
    ["", "", "O"]
  ],
  "players": {
    "Player 1": {
      "symbol": "X",
      "moves_made": 2
    },
    "Player 2": {
      "symbol": "O",
      "moves_made": 2
    }
  },
  "winner": null,
  "is_terminal": false,
  "last_action": "[Inscribe:3,3]",
  "observation_log": [
    "Player 1 placed at (1,1)",
    "Player 2 placed at (1,2)",
    "Player 1 placed at (2,2)",
    "Player 2 placed at (3,3)"
  ]
}
```
---
## 6. Initialization Rules
- **Board Reset**: All grid cells empty (`""`).
- **Player Assignment**: Player 1 always receives `X`, Player 2 receives `O`.
- **Starting Player**: Determined by seed parity (`seed % 2 == 0` → Player 1 starts; otherwise Player 2).
- **Onboarding Observation**: The first system message announces who begins.
- **Seed**: Passed to `reset` for reproducibility. No other randomization beyond turn order.
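
The initialization rules above could be sketched as follows. The keys follow the schema in Section 5; setting `last_action` to `None` before any move and the exact wording of the first log entry are assumptions.

```python
from typing import Any, Dict


def reset(seed: int) -> Dict[str, Any]:
    """Build a fresh game state; the seed only decides who starts."""
    # Seed parity rule: even seed, Player 1 starts; odd seed, Player 2 starts.
    starter = "Player 1" if seed % 2 == 0 else "Player 2"
    return {
        "turn_count": 0,
        "current_player": starter,
        "seed": seed,
        "board": [["" for _ in range(3)] for _ in range(3)],
        "players": {
            "Player 1": {"symbol": "X", "moves_made": 0},
            "Player 2": {"symbol": "O", "moves_made": 0},
        },
        "winner": None,
        "is_terminal": False,
        "last_action": None,  # assumption: no action recorded before the first move
        "observation_log": [f"{starter} begins"],  # onboarding observation
    }
```

Calling `reset(42)` twice yields equal states, which is the reproducibility property the seed is meant to provide.
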
---
## 7. Validation and Error Handling
**Illegal Action Detection** involves three checks:
1. **Format Check** (regex violation) → `Reason: "Invalid action format. Must match [Inscribe:x,y]"`.
2. **Coordinate Availability Check** (target cell already occupied) → `Reason: "Cell already occupied"`.
3. **Turn Check** (wrong player acting) → `Reason: "Not your turn"`.
When a violation occurs, the environment will invoke:
```
set_invalid_move(player_id, reason)
```
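
The checks can be sketched as one function that returns the reason string to pass to `set_invalid_move`. The name `find_violation` is hypothetical, and applying the checks in the listed order is an assumption, since the document does not say which reason takes precedence when several apply.

```python
import re
from typing import List, Optional

ACTION_RE = re.compile(r"^\[Inscribe:(1|2|3),(1|2|3)\]$")


def find_violation(
    board: List[List[str]], current_player: str, player_id: str, token: str
) -> Optional[str]:
    """Return the reason for the first failed check, or None if the move is legal."""
    # 1. Format check; out-of-range coordinates are rejected by the regex itself.
    match = ACTION_RE.fullmatch(token)
    if match is None:
        return "Invalid action format. Must match [Inscribe:x,y]"
    # 2. Coordinate availability check (convert 1-based input to 0-based indices).
    row, col = int(match.group(1)) - 1, int(match.group(2)) - 1
    if board[row][col] != "":
        return "Cell already occupied"
    # 3. Turn check.
    if player_id != current_player:
        return "Not your turn"
    return None
```

A caller would invoke `set_invalid_move(player_id, reason)` whenever the returned reason is not `None`.
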
---
## 8. Terminal Conditions and Scoring
**Check after every move:**
1. **Winning Condition**: The active player's move completes 3 identical glyphs in one line → `winner = current_player`, `is_terminal = true`.
2. **Draw Condition**: If no empty cells and no winner → `is_terminal = true`, `winner = "Draw"`.
3. **Scoring**:
   - Winner: `+1`
   - Loser: `0`
   - Draw: `0.5` each (if required for aggregate scoring).
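
A minimal sketch of the terminal checks and scoring, assuming the board layout from Section 5. The function names are hypothetical, and the draw score of `0.5` is applied unconditionally here for simplicity.

```python
from typing import Dict, List, Optional

# All eight winning lines as lists of 0-based (row, col) pairs.
LINES = (
    [[(r, c) for c in range(3)] for r in range(3)]  # rows
    + [[(r, c) for r in range(3)] for c in range(3)]  # columns
    + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]  # diagonals
)


def find_winner_symbol(board: List[List[str]]) -> Optional[str]:
    """Return "X" or "O" if that glyph fills a complete line, else None."""
    for line in LINES:
        symbols = {board[r][c] for r, c in line}
        if len(symbols) == 1 and "" not in symbols:
            return symbols.pop()
    return None


def is_draw(board: List[List[str]]) -> bool:
    """A draw means the board is full and no line is complete."""
    full = all(cell != "" for row in board for cell in row)
    return full and find_winner_symbol(board) is None


def scores(winner: str) -> Dict[str, float]:
    """Map the `winner` field ("Player 1", "Player 2", or "Draw") to scores."""
    if winner == "Draw":
        return {"Player 1": 0.5, "Player 2": 0.5}
    return {p: (1.0 if p == winner else 0.0) for p in ("Player 1", "Player 2")}
```

Checking for a winner before checking for a full board matters: a ninth move that completes a line is a win, not a draw.
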
---
## 9. Player Prompt Specification
Each turn, players are prompted with the current grid, their symbol, and allowed action grammar.
**Prompt Outline:**
- **Identity Blurb:**
  “You are a glyph inscriber in the digital arena, competing to align three of your runes in a row before your opponent does.”
- **Board Display:** Current 3×3 grid with coordinates labeled (1–3).
- **Rules Summary:**
  - You must inscribe one cell per turn using the format `[Inscribe:x,y]`.
  - Choose only empty cells within the board.
  - The first to create a line of three identical symbols wins.
- **Formatting Rules:**
  - Your private reasoning must be inside `<think></think>`.
  - Your public action must be inside `<answer></answer>` and follow the grammar exactly.
**Examples:**
```
Example valid response:
<think>If I take center, I will block the opponent's row.</think>
<answer>[Inscribe:2,2]</answer>
Example invalid response:
<think>I want the corner.</think>
<answer>[Place:3,3]</answer> <-- Invalid keyword
```
---
## 10. API Mapping Plan
### `reset(seed: int) -> game_state`
- Initializes board and variables following section 6.
- Logs the first observation announcing the starting player (e.g., “Player 1 begins”).
- Returns initial `game_state`.
### `step(player_id: str, action: str) -> game_state`
- Parses answer content via `_extract_answer_content`.
- Validates the format and move legality.
- Updates board, `observation_log`, and `turn_count`.
- Checks terminal conditions (win/draw).
- Switches `current_player` if not terminal.
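
The `step` flow listed above might look like the sketch below. `set_invalid_move` and `_extract_answer_content` are named in this document, but their bodies here are hypothetical stand-ins; returning the unchanged state after an invalid move is also an assumption.

```python
import re
from typing import Any, Dict, List, Tuple

ACTION_RE = re.compile(r"^\[Inscribe:(1|2|3),(1|2|3)\]$")
INVALID_MOVES: List[Tuple[str, str]] = []  # stand-in sink for reported violations


def set_invalid_move(player_id: str, reason: str) -> None:
    """Hypothetical stub; the real hook belongs to the host environment."""
    INVALID_MOVES.append((player_id, reason))


def _extract_answer_content(action: str) -> str:
    """Hypothetical stand-in: text inside <answer></answer>, else the raw input."""
    m = re.search(r"<answer>(.*?)</answer>", action, re.DOTALL)
    return m.group(1).strip() if m else action.strip()


def _has_line(board: List[List[str]], symbol: str) -> bool:
    """True if `symbol` fills any row, column, or diagonal."""
    rows = [[board[r][c] for c in range(3)] for r in range(3)]
    cols = [[board[r][c] for r in range(3)] for c in range(3)]
    diags = [[board[i][i] for i in range(3)], [board[i][2 - i] for i in range(3)]]
    return any(all(cell == symbol for cell in line) for line in rows + cols + diags)


def step(game_state: Dict[str, Any], player_id: str, action: str) -> Dict[str, Any]:
    token = _extract_answer_content(action)
    match = ACTION_RE.fullmatch(token)
    if match is None:
        set_invalid_move(player_id, "Invalid action format. Must match [Inscribe:x,y]")
        return game_state
    row, col = int(match.group(1)) - 1, int(match.group(2)) - 1
    board = game_state["board"]
    if board[row][col] != "":
        set_invalid_move(player_id, "Cell already occupied")
        return game_state
    if player_id != game_state["current_player"]:
        set_invalid_move(player_id, "Not your turn")
        return game_state
    # Legal move: update the board and bookkeeping fields.
    symbol = game_state["players"][player_id]["symbol"]
    board[row][col] = symbol
    game_state["players"][player_id]["moves_made"] += 1
    game_state["turn_count"] += 1
    game_state["last_action"] = token
    game_state["observation_log"].append(f"{player_id} placed at ({row + 1},{col + 1})")
    # Terminal checks: win first, then draw; otherwise hand over the turn.
    if _has_line(board, symbol):
        game_state["winner"], game_state["is_terminal"] = player_id, True
    elif all(cell != "" for r in board for cell in r):
        game_state["winner"], game_state["is_terminal"] = "Draw", True
    else:
        game_state["current_player"] = "Player 2" if player_id == "Player 1" else "Player 1"
    return game_state
```

The sketch mutates `game_state` in place and returns it; a variant that copies the state first would make replay and debugging easier at a small cost.
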
### `_generate_player_prompt(player_id: str, game_state: dict) -> str`
- Renders a textual view of the grid.
- Includes symbol assigned, whose turn it is, available actions reminder, and formatting examples.
- Returns the complete prompt string for the active player.
---
## 11. Copy-Check Against the Example
This design:
- Uses **glyph inscription in a digital grid**, **not negotiation or resource-trade mechanics**.
- Introduces unique resource names (`glyphs`, `grid`, `inscriptions`) and customized keys (`board`, `symbol`, `moves_made`).
- Maintains a **distinct thematic narrative**, objective (aligning symbols), and deterministic progression rules.
- All terminology, `game_state` keys, and prompt text are **original to this design** and unrelated to the example.