---
# **TURN-BASED GAME DESIGN DOCUMENT “STELLAR TRIAD”**
*(Original design inspired by abstract alignment games, but distinct in theme, terminology, and flavor. This is **not** a replication of tic-tac-toe, though it satisfies the same deterministic two-player turn-based design requirements.)*
---
## **1. Concept Paragraph**
**Game Title:** *Stellar Triad*
Two rival star architects compete to align energy beacons on a 3×3 orbital matrix orbiting a dying star. Each architect channels luminous "Nodes" of cosmic energy into the matrix to construct a stable triad of connected orbs — vertical, horizontal, or diagonal. The first to achieve a stable triad wins; if the matrix becomes full without stability, the core collapses and neither triumphs. Core actions are expressed through tokens like `[Channel:X-Y]` to place energy at targeted coordinates.
This design is deterministic and unrelated to any negotiation, trading, or diplomacy example; its domain is cosmic engineering strategy.
---
## **2. Roles and Win Condition**
- **Players:** Two — *Architect A* and *Architect B*.
- **Objective:** Achieve the first "Stellar Alignment" — three controlled Nodes in a contiguous row, column, or diagonal on the 3×3 matrix.
- **Win Condition:**
  - If a player forms an alignment, they immediately win.
- **Draw Condition:**
  - If all nine spaces are filled and no alignment exists, the result is a *Stellar Collapse (draw)*.
- **Loss Condition:**
  - The opponent achieves alignment first.
---
## **3. Turn Structure and Determinism**
- Play alternates between Architect A and Architect B.
- Each turn, the acting player chooses exactly one valid move token.
- The environment is deterministic — no dice rolls, no randomness.
- The `seed` is used only to set the initial player order, so that resets are reproducible.
- Turn limit: **9 turns maximum** (one for each matrix cell).
---
## **4. Action Grammar (Machine-Parseable)**
### **Allowed Actions**
- **Token:** `[Channel:X-Y]`
- **Definition:** Player attempts to channel energy into the cell at coordinates `X-Y` (X = 1 to 3 for columns, Y = 1 to 3 for rows).
- **Pattern (regex):** `^\[Channel:(?:[1-3])-(?:[1-3])\]$`
- **Example Valid Action:** `[Channel:2-3]` → place Node at column 2, row 3.
- **Example Invalid Actions:**
  - `[Channel:4-1]` → Invalid X coordinate.
  - `[Channel:2_3]` → Wrong separator.
  - `[Deploy:2-3]` → Invalid token keyword.
### **Notes:**
- Only unoccupied cells can be targeted. Attempting to channel into an occupied cell is invalid.
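The grammar above can be checked mechanically. The following is a minimal sketch, assuming Python; the function name `parse_channel` is hypothetical, and the only change to the documented pattern is the addition of capture groups so the coordinates can be read back.

```python
import re
from typing import Optional, Tuple

# Same pattern as the grammar above, with capturing groups for X and Y.
CHANNEL_RE = re.compile(r"^\[Channel:([1-3])-([1-3])\]$")


def parse_channel(token: str) -> Optional[Tuple[int, int]]:
    """Return (column, row) as 1-based ints, or None if the token is malformed.

    `parse_channel` is a hypothetical helper name, not part of the design.
    """
    # fullmatch rejects any trailing characters, including a trailing newline.
    match = CHANNEL_RE.fullmatch(token)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Occupancy is not checked here; that depends on the current `matrix_state` and belongs to validation.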
---
## **5. Game State Schema**
Example `game_state` at runtime:
```json
{
  "matrix_state": [
    ["A", "B", null],
    ["A", null, "B"],
    [null, "A", null]
  ],
  "player_symbols": {
    "ArchitectA": "A",
    "ArchitectB": "B"
  },
  "turn_count": 5,
  "active_player": "ArchitectB",
  "last_action": "[Channel:2-3]",
  "move_history": [
    "ArchitectA:[Channel:1-1]",
    "ArchitectB:[Channel:2-1]",
    "ArchitectA:[Channel:1-2]",
    "ArchitectB:[Channel:3-2]",
    "ArchitectA:[Channel:2-3]"
  ],
  "game_result": null,
  "winner": null,
  "draw": false,
  "seed": 1234
}
```
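The schema does not state how `X-Y` coordinates map onto the nested `matrix_state` list. The sketch below assumes `matrix_state[row][column]` with 0-based indices, so `[Channel:X-Y]` writes to `matrix_state[Y-1][X-1]`; that indexing, and the helper names `place` and `replay`, are assumptions for illustration only.

```python
from typing import Dict, List, Optional

Matrix = List[List[Optional[str]]]


def empty_matrix() -> Matrix:
    """A fresh 3x3 matrix with independent row lists."""
    return [[None] * 3 for _ in range(3)]


def place(matrix: Matrix, symbol: str, x: int, y: int) -> None:
    """Write `symbol` at column x, row y (both 1-based).

    Assumes the matrix is stored as matrix[row][column].
    """
    matrix[y - 1][x - 1] = symbol


def replay(history: List[str], symbols: Dict[str, str]) -> Matrix:
    """Rebuild a matrix from entries shaped like 'ArchitectA:[Channel:1-1]'."""
    matrix = empty_matrix()
    for entry in history:
        player, token = entry.split(":", 1)          # split on the first colon only
        x, y = token[len("[Channel:"):-1].split("-")  # "1-1" -> "1", "1"
        place(matrix, symbols[player], int(x), int(y))
    return matrix
```

Replaying `move_history` this way gives a quick consistency check between the history and the stored matrix.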
---
## **6. Initialization Rules**
- `seed` determines starting player by parity: even seed → Architect A, odd seed → Architect B.
- `matrix_state` initialized as empty (all `null` values).
- First observation includes:
  - Empty orbital matrix.
  - Assigned symbols.
  - Clear reminder of valid action grammar.
- No randomness beyond starting player selection.
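As an illustration of these rules, here is a minimal sketch of building the initial state from a seed. The function name `initial_state` is hypothetical, and the starting values for `turn_count` (0) and `last_action` (`None`) are assumptions, since the document only specifies the matrix and the starting player.

```python
def initial_state(seed: int) -> dict:
    """Build a fresh game_state; even seed -> Architect A, odd seed -> Architect B."""
    starting_player = "ArchitectA" if seed % 2 == 0 else "ArchitectB"
    return {
        # One list per row so that writing to a cell never aliases another row.
        "matrix_state": [[None] * 3 for _ in range(3)],
        "player_symbols": {"ArchitectA": "A", "ArchitectB": "B"},
        "turn_count": 0,          # assumed starting value
        "active_player": starting_player,
        "last_action": None,      # assumed starting value
        "move_history": [],
        "game_result": None,
        "winner": None,
        "draw": False,
        "seed": seed,
    }
```

Because the starting player depends only on seed parity, two resets with the same seed always produce the same first observation.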
---
## **7. Validation and Error Handling**
Every action extracted from `\boxed{{}}` content passes through:
- **Format validation:** must match regex.
- **Range validation:** coordinates must be integers from 1 to 3.
- **Occupancy validation:** target cell must be empty.
- **Turn enforcement:** only the active player may act.
**Invalid Action Reasons (to be passed to `set_invalid_move`):**
- `"Malformed token: does not match [Channel:X-Y] pattern"`
- `"Target cell occupied"`
- `"Coordinates out of range"`
- `"Not your turn"`
- `"Action missing or not boxed"`
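The checks above could be chained as in the sketch below. Two things here are assumptions rather than part of the design: the order in which the checks run, and the use of a looser shape-only pattern so that an out-of-range coordinate is reported as `"Coordinates out of range"` instead of being caught earlier as a malformed token. The function name `validate_action` is hypothetical, and the matrix is assumed to be indexed as `matrix_state[row][column]`.

```python
import re
from typing import Optional

# Shape-only pattern (assumption): any non-negative integers are accepted here
# so that the range check can produce its own error message.
TOKEN_SHAPE = re.compile(r"\[Channel:([0-9]+)-([0-9]+)\]")


def validate_action(state: dict, player: str, content: Optional[str]) -> Optional[str]:
    """Return an invalid-move reason string, or None if the action is legal."""
    if content is None or not content.strip():
        return "Action missing or not boxed"
    if player != state["active_player"]:
        return "Not your turn"
    match = TOKEN_SHAPE.fullmatch(content.strip())
    if match is None:
        return "Malformed token: does not match [Channel:X-Y] pattern"
    x, y = int(match.group(1)), int(match.group(2))
    if not (1 <= x <= 3 and 1 <= y <= 3):
        return "Coordinates out of range"
    # Assumes matrix_state[row][column] with 0-based indices.
    if state["matrix_state"][y - 1][x - 1] is not None:
        return "Target cell occupied"
    return None
```

A non-`None` return value is the string that would be handed to `set_invalid_move`.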
---
## **8. Terminal Conditions and Scoring**
### **Checks after each valid move:**
1. **Alignment Detection:**
   - If the active player's symbol forms a contiguous line (row/column/diagonal), they win.
   - `game_result = "ArchitectA_won"` or `"ArchitectB_won"`.
   - `winner` updated accordingly.
2. **Full Matrix Check:**
   - If 9 moves are completed without alignment → `draw = true`, `game_result = "Stellar_Collapse"`.
3. **Score Assignment:**
   - Win → 1.0
   - Loss → 0.0
   - Draw → 0.5 each.
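The three checks above can be sketched as follows. The helper names `has_alignment` and `resolve`, and the shape of the returned dictionary (including the `scores` key), are hypothetical; the result strings and score values are the ones listed above. Alignment is tested before the full-matrix check, so a ninth move that completes a line counts as a win rather than a collapse.

```python
from typing import Dict, List, Optional, Tuple

Matrix = List[List[Optional[str]]]

# All eight winning lines as (row, column) index triples.
LINES: List[List[Tuple[int, int]]] = (
    [[(r, c) for c in range(3)] for r in range(3)]                     # rows
    + [[(r, c) for r in range(3)] for c in range(3)]                   # columns
    + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]     # diagonals
)


def has_alignment(matrix: Matrix, symbol: str) -> bool:
    """True if `symbol` fills any row, column, or diagonal."""
    return any(all(matrix[r][c] == symbol for r, c in line) for line in LINES)


def resolve(matrix: Matrix, mover: str, symbols: Dict[str, str]) -> Optional[dict]:
    """Return terminal fields after `mover` has played, or None if play continues."""
    opponent = next(p for p in symbols if p != mover)
    if has_alignment(matrix, symbols[mover]):
        return {"game_result": f"{mover}_won", "winner": mover, "draw": False,
                "scores": {mover: 1.0, opponent: 0.0}}
    if all(cell is not None for row in matrix for cell in row):
        return {"game_result": "Stellar_Collapse", "winner": None, "draw": True,
                "scores": {mover: 0.5, opponent: 0.5}}
    return None
```

Only the mover's symbol needs checking, because a move can never complete the opponent's line.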
---
## **9. Player Prompt Specification**
### **Structure of `_generate_player_prompt`:**
**Prompt Outline:**
1. Introduction / Identity:
   - “You are a cosmic architect competing to stabilize a stellar grid. Each cell represents an orbital conduit for energy. Your objective: align three of your Nodes in a row before your rival.”
2. Rules Summary:
   - Turn-based, alternating.
   - Each cell can be used once.
   - Valid move format strictly uses `[Channel:X-Y]` (where 1 ≤ X,Y ≤ 3).
3. End condition: First alignment wins; full grid without alignment = collapse (draw).
4. Response format:
   - “Put your final answer within \\boxed{{}} at the end of your response.”
### **Examples:**
```
Example valid response:
I will project energy into the lower middle conduit.
\boxed{{[Channel:2-3]}}
Example invalid response:
I channel energy south-east.
\boxed{{Channel:SE}}
```
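One possible way to assemble the prompt outlined above is sketched here. The board rendering (one `Y=n:` line per row, `.` for an empty cell) and the exact sentence wording beyond the quoted text are assumptions. The response-format line is copied verbatim from the outline; if the real template is passed through `str.format`, the doubled braces would collapse to single ones.

```python
# Copied verbatim from the outline; a raw string keeps the backslash literal.
RESPONSE_FORMAT = r"Put your final answer within \boxed{{}} at the end of your response."


def generate_player_prompt(game_state: dict, player: str) -> str:
    """Build a text prompt for `player` from the current game_state."""
    symbol = game_state["player_symbols"][player]
    # Assumed rendering: rows top to bottom, columns X=1..3 left to right.
    board_lines = [
        f"Y={y}: " + " ".join(cell if cell else "." for cell in row)
        for y, row in enumerate(game_state["matrix_state"], start=1)
    ]
    parts = [
        "You are a cosmic architect competing to stabilize a stellar grid. "
        "Each cell represents an orbital conduit for energy. "
        "Your objective: align three of your Nodes in a row before your rival.",
        f"You are {player} and your Nodes are marked '{symbol}'.",
        "Current matrix (columns X=1..3, left to right):",
        *board_lines,
        "Rules: turns alternate; each cell can be used once; "
        "moves must use [Channel:X-Y] with 1 <= X,Y <= 3.",
        "First alignment wins; a full grid without alignment is a collapse (draw).",
        RESPONSE_FORMAT,
    ]
    return "\n".join(parts)
```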
### **Extraction Helper Reminder:**
- `_extract_answer_content(self, action: str) -> str` extracts the text between `\boxed{{` and `}}`.
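A regex-based sketch of such a helper is shown below as a plain function. It makes three assumptions that the document does not state: it accepts both the doubled-brace form written here (`\boxed{{...}}`) and a single-brace form (`\boxed{...}`), it uses the last boxed span in the response, and it returns `None` when nothing is boxed (the documented signature returns `str`).

```python
import re
from typing import Optional

# Try the doubled-brace form first, then fall back to single braces.
_BOXED = re.compile(r"\\boxed\{\{(.*?)\}\}|\\boxed\{(.*?)\}", re.DOTALL)


def extract_answer_content(action: str) -> Optional[str]:
    """Return the text inside the last boxed span, or None if there is none."""
    matches = list(_BOXED.finditer(action))
    if not matches:
        return None
    last = matches[-1]
    # Exactly one of the two groups is set, depending on which form matched.
    content = last.group(1) if last.group(1) is not None else last.group(2)
    return content.strip()
```

The returned string is what format validation would then test against the `[Channel:X-Y]` pattern.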
---
## **10. API Mapping Plan**
| Method | Description | Inputs | Writes | Outputs |
|---------|--------------|--------|---------|----------|
| **`reset(seed)`** | Initializes a fresh Stellar Triad match. Sets `matrix_state` empty, seeds RNG (for turn order if applicable), and returns first observation for the active player. | `seed` | `matrix_state`, `player_symbols`, `seed`, `turn_count`, `active_player` | Observation for first player. |
| **`step(player, action)`** | Parses boxed content using `_extract_answer_content`. Validates action format and legality, updates board, checks terminal state. | `player`, `action` | `matrix_state`, `move_history`, `last_action`, `turn_count`, `active_player`, `game_result`, `winner`, `draw` | Returns new observation, reward (if terminal), termination flags, or invalid move notice. |
| **`_generate_player_prompt(game_state, player)`** | Builds the textual prompt containing game state, visible matrix, valid action examples, and format rules. | `game_state`, `player` | N/A | Returns text prompt instructing player to act. |
Terminal signaling:
- When `winner` is set → `is_done = True`, appropriate reward assigned.
- When `draw == True` → `is_done = True`, rewards = 0.5 each.
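To show how `reset` and `step` could fit together, here is a compact sketch. It is not the framework's real base class: the class name `StellarTriadEnv`, the four-element return tuple of `step`, and the plain-text observation are all assumptions. For brevity, `step` takes the already-extracted token rather than the full boxed response.

```python
import re
from typing import Dict, Optional, Tuple

TOKEN = re.compile(r"\[Channel:([1-3])-([1-3])\]")
LINES = (
    [[(r, c) for c in range(3)] for r in range(3)]                  # rows
    + [[(r, c) for r in range(3)] for c in range(3)]                # columns
    + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]  # diagonals
)


class StellarTriadEnv:
    """Hypothetical minimal environment; names and return shapes are assumed."""

    PLAYERS = ("ArchitectA", "ArchitectB")

    def reset(self, seed: int) -> str:
        self.state = {
            "matrix_state": [[None] * 3 for _ in range(3)],
            "player_symbols": {"ArchitectA": "A", "ArchitectB": "B"},
            "turn_count": 0,
            "active_player": self.PLAYERS[seed % 2],  # even -> A, odd -> B
            "move_history": [],
            "game_result": None, "winner": None, "draw": False, "seed": seed,
        }
        return self._observe()

    def _observe(self) -> str:
        rows = [" ".join(c or "." for c in row) for row in self.state["matrix_state"]]
        return "\n".join(rows) + f"\nTo move: {self.state['active_player']}"

    def step(self, player: str, token: str) -> Tuple[str, Dict[str, float], bool, Optional[str]]:
        """Return (observation, rewards, is_done, invalid_reason)."""
        s = self.state
        if player != s["active_player"]:
            return self._observe(), {}, False, "Not your turn"
        match = TOKEN.fullmatch(token.strip())
        if match is None:
            return self._observe(), {}, False, "Malformed token: does not match [Channel:X-Y] pattern"
        x, y = int(match.group(1)), int(match.group(2))
        if s["matrix_state"][y - 1][x - 1] is not None:
            return self._observe(), {}, False, "Target cell occupied"
        symbol = s["player_symbols"][player]
        other = self.PLAYERS[1 - self.PLAYERS.index(player)]
        s["matrix_state"][y - 1][x - 1] = symbol
        s["turn_count"] += 1
        s["move_history"].append(f"{player}:{token.strip()}")
        if any(all(s["matrix_state"][r][c] == symbol for r, c in line) for line in LINES):
            s["winner"], s["game_result"] = player, f"{player}_won"
            return self._observe(), {player: 1.0, other: 0.0}, True, None
        if s["turn_count"] == 9:
            s["draw"], s["game_result"] = True, "Stellar_Collapse"
            return self._observe(), {player: 0.5, other: 0.5}, True, None
        s["active_player"] = other
        return self._observe(), {}, False, None
```

An invalid move leaves the state untouched and keeps the same player active, which is one reasonable reading of the "invalid move notice" output.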
---
## **11. Copy-Check Against Example**
- **Theme:** Cosmic engineering/energy alignment — *not* negotiation or trade.
- **Resources:** Matrix coordinates and “Nodes” — *no* currency or offers.
- **Objectives:** Alignment victory — *not* mutual acceptance or balance.
- **Game state keys:** (`matrix_state`, `player_symbols`, `seed`, etc.) are **unique and original** to this design.
- **Prompt content:** Describes cosmic architects and stellar energy, **not** any form of deal-making or bargaining.
---
**End of Game Design Document — “Stellar Triad”**
---