Add environment documentation from Openverse builder
210
environment.md
Normal file
@@ -0,0 +1,210 @@
# TURN-BASED TEXTARENA DESIGN DOCUMENT

## Game: **“Runestone Clash”** (Original deterministic twist on the tic-tac-toe concept)


---

### 1. Concept Paragraph

**Concept:**
“Runestone Clash” is a **deterministic, two-player, turn-based tactical placement game** inspired by strategic grid contests but completely re-themed. Instead of “tic-tac-toe” or any familiar noughts-and-crosses motif, players are rival **Runemages** imprinting elemental sigils onto an ancient **3×3 Stone Circle**. Each mage tries to align their runes and channel energy across the circle before their opponent does. The two actions, **`[Imprint:x,y]`** and **`[Pass]`**, let a Runemage either inscribe a rune in a cell or strategically yield the turn. The first mage to align three of their runes in any straight line (horizontal, vertical, or diagonal) harnesses the Circle’s full power and wins.

This design is entirely unrelated to any negotiation or trade scenario; it is purely an abstract magical duel of placement and pattern formation.

---
### 2. Roles and Win Condition

**Roles:**

- **Player A (Runemage A):** Uses the sigil `"⚙"` (A’s rune).
- **Player B (Runemage B):** Uses the sigil `"✶"` (B’s rune).

**Objective:**
Be the first player to align three of your own runes in a line (row, column, or diagonal) within the 3×3 Stone Circle.

**Win, Loss, Draw Conditions:**

- **Win:** A player has three of their own runes in an unbroken row, column, or diagonal.
- **Loss:** The opponent achieves alignment first.
- **Draw:** The grid fills without any alignment, or the 9-turn limit is reached without a winner.


---

### 3. Turn Structure and Determinism

- Turns alternate strictly, beginning with the starting player chosen at reset: for example, Player A → Player B → Player A → etc.
- **Turn count:** Maximum of 9 turns total; a `[Pass]` consumes a turn.
- **Seed usage:** The environment accepts a seed for reproducible setup (e.g., choosing the starting player, or cosmetic details such as rune glow color). The same seed always yields the same starting order and the same cosmetic descriptors; gameplay itself is deterministic.
- No simultaneous turns; each action leads immediately to a state update before the next prompt.

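
The seeding rule above can be sketched as follows. This is a minimal illustration, not the environment's actual code: `pick_starting_player` and `turn_order` are hypothetical helper names, and the use of a private `random.Random(seed)` instance is an assumption about how reproducibility might be achieved. Calling `turn_order(seed)` twice with the same seed returns the same alternating sequence.

```python
import random

PLAYERS = ("PlayerA", "PlayerB")


def pick_starting_player(seed=None):
    """Pick the starting player from a private RNG (hypothetical helper).

    A local random.Random(seed) is used so the global random state is untouched
    and the same seed always produces the same choice.
    """
    rng = random.Random(seed)
    return rng.choice(PLAYERS)


def turn_order(seed=None, n_turns=9):
    """Return the strict alternation of players for up to n_turns turns."""
    first = pick_starting_player(seed)
    second = PLAYERS[1] if first == PLAYERS[0] else PLAYERS[0]
    return [first if i % 2 == 0 else second for i in range(n_turns)]
```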

---

### 4. Action Grammar (Machine-Parseable)

**Allowed actions:**

1. **`[Imprint:x,y]`**: The player claims an unoccupied cell at coordinates `(x,y)`, where `x` is the row, `y` is the column, and `x,y ∈ {1,2,3}`.
2. **`[Pass]`**: The player explicitly skips their turn (only allowed if at least one cell remains empty).

**Formal Patterns (Regex-style):**

- **Imprint Action:** `^\[Imprint:(1|2|3),(1|2|3)\]$`
- **Pass Action:** `^\[Pass\]$`

**Examples:**

- ✅ Valid: `[Imprint:2,3]` → Places rune at row 2, column 3
- ❌ Invalid: `[Imprint:4,5]` → Coordinates out of range
- ✅ Valid: `[Pass]` → Skips the turn
- ❌ Invalid: `[pass]` → Wrong capitalization; pattern mismatch

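
As a rough sketch of how the two patterns above could be applied in Python, the snippet below compiles them unchanged and returns a small tuple for each recognized action. The function name `parse_action` and the tuple shapes are illustrative assumptions. `fullmatch` is used so that a trailing newline does not slip past the `$` anchor.

```python
import re

# Patterns copied from the action grammar.
IMPRINT_RE = re.compile(r"^\[Imprint:(1|2|3),(1|2|3)\]$")
PASS_RE = re.compile(r"^\[Pass\]$")


def parse_action(text):
    """Return ("imprint", row, col), ("pass",), or None when nothing matches.

    Hypothetical helper: the return shapes are an assumption for illustration.
    """
    match = IMPRINT_RE.fullmatch(text)
    if match:
        return ("imprint", int(match.group(1)), int(match.group(2)))
    if PASS_RE.fullmatch(text):
        return ("pass",)
    return None
```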

---

### 5. Game State Schema

Example runtime state representation (four actions have been played, so it is turn 5 and Player A is to move):

```json
{
  "turn_number": 5,
  "active_player": "PlayerA",
  "rune_grid": [
    ["⚙", "✶", ""],
    ["", "⚙", ""],
    ["", "", ""]
  ],
  "players": {
    "PlayerA": {
      "symbol": "⚙",
      "imprints": 2,
      "skips": 0,
      "status": "active"
    },
    "PlayerB": {
      "symbol": "✶",
      "imprints": 1,
      "skips": 1,
      "status": "active"
    }
  },
  "winner": null,
  "draw": false,
  "transcript": [
    {"player": "PlayerA", "action": "[Imprint:1,1]"},
    {"player": "PlayerB", "action": "[Imprint:1,2]"},
    {"player": "PlayerA", "action": "[Imprint:2,2]"},
    {"player": "PlayerB", "action": "[Pass]"}
  ],
  "seed": 42
}
```

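
Because the schema stores the same facts in several places (grid, per-player counters, transcript), a small consistency check can catch drift between them. The sketch below is illustrative only: `find_inconsistencies` is a hypothetical helper, and it assumes that `turn_number` starts at 1 and grows by one per valid action, and that `imprints` and `skips` simply count the player's logged actions.

```python
def find_inconsistencies(state):
    """Return a list of mismatch descriptions; an empty list means the state is coherent."""
    problems = []
    cells = [cell for row in state["rune_grid"] for cell in row]
    for name, info in state["players"].items():
        actions = [t["action"] for t in state["transcript"] if t["player"] == name]
        logged_imprints = sum(a.startswith("[Imprint:") for a in actions)
        logged_passes = sum(a == "[Pass]" for a in actions)
        if cells.count(info["symbol"]) != info["imprints"]:
            problems.append(f"{name}: imprints counter does not match runes on the grid")
        if logged_imprints != info["imprints"]:
            problems.append(f"{name}: imprints counter does not match the transcript")
        if logged_passes != info["skips"]:
            problems.append(f"{name}: skips counter does not match the transcript")
    # Assumption: turn_number starts at 1 and increases by one per valid action.
    if state["turn_number"] != len(state["transcript"]) + 1:
        problems.append("turn_number does not match transcript length")
    return problems
```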

---

### 6. Initialization Rules

- **At reset:**
  - The 3×3 `rune_grid` is empty (`""` in all cells).
  - `turn_number = 1`.
  - Select the starting player at random (deterministic given the seed).
  - Initialize `transcript` as empty.
  - Assign rune symbols to each player (`⚙` to PlayerA, `✶` to PlayerB).
- **Onboarding observation:**
  Each player receives a description of the current empty grid, the symbols representing each player, and the list of valid actions.

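
The reset rules above might translate into a state builder along these lines. This is a sketch under stated assumptions: `make_initial_state` is a hypothetical name, the dictionary keys follow the Game State Schema, and the starting player is drawn from a private `random.Random(seed)` so the same seed gives the same starting order.

```python
import random


def make_initial_state(seed=None):
    """Build a fresh game state following the initialization rules (illustrative)."""
    rng = random.Random(seed)  # private RNG: same seed, same starting player
    return {
        "turn_number": 1,
        "active_player": rng.choice(["PlayerA", "PlayerB"]),
        # Each row is a separate list so writing one cell never aliases another row.
        "rune_grid": [["" for _ in range(3)] for _ in range(3)],
        "players": {
            "PlayerA": {"symbol": "⚙", "imprints": 0, "skips": 0, "status": "active"},
            "PlayerB": {"symbol": "✶", "imprints": 0, "skips": 0, "status": "active"},
        },
        "winner": None,
        "draw": False,
        "transcript": [],
        "seed": seed,
    }
```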

---

### 7. Validation and Error Handling

**Validation flow:**

1. Extract action content using `_extract_answer_content(action: str)` → content inside `\boxed{{}}`.
2. Check the content against the regex grammar.
3. Run additional semantic checks:
   - If `[Imprint:x,y]`: verify cell (x,y) is within bounds and unoccupied.
   - If `[Pass]`: verify at least one empty cell exists; passing on a full board is invalid.

**Invalid Move Reasons (examples):**

Each of these triggers `set_invalid_move(player_id, reason)`.

- `"Invalid syntax: does not match required action pattern."`
- `"Invalid coordinates: cell (3,4) is outside grid boundaries."`
- `"Cell already claimed by another rune."`
- `"Cannot pass: grid fully imprinted."`

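
One way the syntactic and semantic checks could be combined is sketched below. It operates on the already-extracted content string and returns either `None` or one of the reason strings listed above. Two points are assumptions rather than part of the design: the helper name `invalid_reason`, and the use of a looser digit pattern so that out-of-range coordinates get their own message (the strict grammar regex would report them as a syntax error instead).

```python
import re

# Assumption: a looser pattern than the grammar, so "(3,4)" can be reported
# as a coordinate error rather than a syntax error.
LOOSE_IMPRINT_RE = re.compile(r"\[Imprint:([0-9]+),([0-9]+)\]")


def invalid_reason(content, grid):
    """Return None for a legal action, otherwise a human-readable reason."""
    if content == "[Pass]":
        if all(cell != "" for row in grid for cell in row):
            return "Cannot pass: grid fully imprinted."
        return None
    match = LOOSE_IMPRINT_RE.fullmatch(content)
    if match is None:
        return "Invalid syntax: does not match required action pattern."
    x, y = int(match.group(1)), int(match.group(2))
    if not (1 <= x <= 3 and 1 <= y <= 3):
        return f"Invalid coordinates: cell ({x},{y}) is outside grid boundaries."
    if grid[x - 1][y - 1] != "":  # x is the row, y is the column (1-based)
        return "Cell already claimed by another rune."
    return None
```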

---

### 8. Terminal Conditions and Scoring

**Terminal Checks (after each move):**

1. **Win Check:** If the active player’s symbol forms an unbroken line of 3:
   - Set `winner = active_player`; the other player loses.
2. **Draw Check:** If there is no winner and either `turn_number > 9` or the grid has no empty cells:
   - Set `draw = true; winner = null`.

**Scoring:**

- Winner receives score = `1.0`; loser = `0.0`; draw = `0.5` each.
- **Tie-breaker:** Not applicable due to small deterministic play space.

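
The terminal checks and scoring above can be sketched as three small functions. The names `has_line`, `outcome`, and `rewards` are hypothetical, and `turn_number` is assumed to be the value after the move has been counted. The win check runs before the draw check, so a winning ninth placement is scored as a win.

```python
def has_line(grid, symbol):
    """True if symbol fills any row, column, or diagonal of the 3x3 grid."""
    lines = [[(r, c) for c in range(3)] for r in range(3)]      # rows
    lines += [[(r, c) for r in range(3)] for c in range(3)]     # columns
    lines += [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]  # diagonals
    return any(all(grid[r][c] == symbol for r, c in line) for line in lines)


def outcome(grid, mover, symbol, turn_number):
    """Return a result dict for a finished game, or None if play continues."""
    if has_line(grid, symbol):
        return {"winner": mover, "draw": False}
    grid_full = all(cell != "" for row in grid for cell in row)
    if turn_number > 9 or grid_full:
        return {"winner": None, "draw": True}
    return None


def rewards(result, players=("PlayerA", "PlayerB")):
    """Map a result dict to scores: 1.0 / 0.0 for a win, 0.5 each for a draw."""
    if result["draw"]:
        return {p: 0.5 for p in players}
    return {p: 1.0 if p == result["winner"] else 0.0 for p in players}
```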

---

### 9. Player Prompt Specification

**Prompt Theme:**

> “You are a Runemage, locked in a battle over a mystic 3×3 Stone Circle. On your turn, you may inscribe your rune by specifying its coordinates, or you may pass (skip) if advantageous. The first to align three symbols in a straight line channels the Circle’s power and wins.”

**Prompt must include:**

- Current `rune_grid` with coordinates displayed.
- Your symbol vs. opponent’s symbol mapping.
- Turn number and remaining open cells.
- Allowed actions: `[Imprint:x,y]` where `x,y ∈ {1,2,3}` and target cell empty, or `[Pass]`.
- Reminder:
  “Put your final answer within `\boxed{{}}` at the end of your response.”

**Few-shot Examples:**

```
Example valid response:
I will secure the center of the Stone Circle.
\boxed{{[Imprint:2,2]}}

Example invalid response (wrong format):
\boxed{{Imprint:2,2}}  <-- Missing square brackets and exact pattern.
```

or

```
Example valid response:
The board is tight; I will bide my time.
\boxed{{[Pass]}}
```

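
For the "grid with coordinates" and "remaining open cells" items in the prompt requirements, a plain-text rendering could look like the sketch below. The layout, the `.` placeholder for empty cells, and the helper names `render_grid` and `open_cells` are all illustrative assumptions; the design document does not prescribe a specific format.

```python
def render_grid(grid, empty="."):
    """Render the 3x3 grid as text with 1-based row (x) and column (y) labels."""
    lines = ["      y=1  y=2  y=3"]
    for x, row in enumerate(grid, start=1):
        cells = "    ".join(cell if cell else empty for cell in row)
        lines.append(f"x={x}    {cells}")
    return "\n".join(lines)


def open_cells(grid):
    """List the (x, y) coordinates, 1-based, of every empty cell."""
    return [
        (x, y)
        for x in range(1, 4)
        for y in range(1, 4)
        if grid[x - 1][y - 1] == ""
    ]
```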

---

### 10. API Mapping Plan

**`reset(seed=None)`**

- Initializes the `game_state` as defined in Initialization Rules.
- Applies deterministic seeding for reproducible starting order.
- Returns initial observation (description of empty grid, role, and available actions).

**`step(player_action)`**

- Extract action via `_extract_answer_content`.
- Validate action syntactically and semantically.
- If invalid → trigger `set_invalid_move`.
- If valid → update grid, increment turn, append transcript.
- Check for terminal conditions (win/draw).
- If terminal → return final observation with results.
- Else → swap active player, return updated state description.

**`_generate_player_prompt(active_player)`**

- Builds narrative prompt containing current board, role identity, action grammar reminders, and the response format rule.
- Appends examples showing correct `\boxed{{}}` usage.
- Includes last move from transcript for context.

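
The `step` flow above can be sketched as a single function over the state dictionary. This is a simplified illustration, not the environment's implementation: it takes the already-extracted content string rather than calling `_extract_answer_content`, it returns an error string instead of calling `set_invalid_move`, and it mutates the state in place. All of these are assumptions made to keep the example self-contained.

```python
import re

IMPRINT_RE = re.compile(r"\[Imprint:([1-3]),([1-3])\]")
OTHER = {"PlayerA": "PlayerB", "PlayerB": "PlayerA"}


def _has_line(grid, symbol):
    lines = [[grid[r][c] for c in range(3)] for r in range(3)]          # rows
    lines += [[grid[r][c] for r in range(3)] for c in range(3)]         # columns
    lines += [[grid[i][i] for i in range(3)], [grid[i][2 - i] for i in range(3)]]
    return any(all(v == symbol for v in line) for line in lines)


def step(state, content):
    """Apply one extracted action; return (state, error) with error None on success."""
    player = state["active_player"]
    info = state["players"][player]
    grid = state["rune_grid"]
    match = IMPRINT_RE.fullmatch(content)
    if match:
        r, c = int(match.group(1)) - 1, int(match.group(2)) - 1
        if grid[r][c] != "":
            return state, "Cell already claimed by another rune."
        grid[r][c] = info["symbol"]
        info["imprints"] += 1
    elif content == "[Pass]":
        if all(cell != "" for row in grid for cell in row):
            return state, "Cannot pass: grid fully imprinted."
        info["skips"] += 1
    else:
        return state, "Invalid syntax: does not match required action pattern."
    # Valid action: log it and advance the turn counter.
    state["transcript"].append({"player": player, "action": content})
    state["turn_number"] += 1
    # Terminal checks: win first, then draw; otherwise hand over the turn.
    if _has_line(grid, info["symbol"]):
        state["winner"] = player
    elif state["turn_number"] > 9 or all(cell != "" for row in grid for cell in row):
        state["draw"] = True
    else:
        state["active_player"] = OTHER[player]
    return state, None
```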

---

### 11. Copy-Check Against the Example

- **Verification:**
  - **Theme and Entities:** “Runemage,” “Runestone,” “Sigils,” and “Stone Circle” are entirely original concepts, not related to negotiation, trade, or social bargaining.
  - **Objective:** Spatial rune alignment contest; no resource exchange.
  - **Game State Keys:** `rune_grid`, `imprints`, `transcript`, etc. are unique to this design.
  - **Terminology:** No reuse of example domain (offers, agreements, coins, etc.).
  - **Prompt:** Describes mystical grid placement and aligning sigils; nothing about the example negotiation setup.

Hence, this is a **distinct deterministic turn-based environment**, wholly aligned with the tic-tac-toe-style request but fully reimagined with unique narrative, state schema, and parsing logic.