Add environment documentation from Openverse builder
197
environment.md
Normal file
@@ -0,0 +1,197 @@
---

# **TURN-BASED GAME DESIGN DOCUMENT – “STELLAR TRIAD”**

*(Original design inspired by abstract alignment games, but distinct in theme, terminology, and flavor. This is **not** a replication of tic-tac-toe, though it satisfies the same deterministic two-player turn-based design requirements.)*

---

## **1. Concept Paragraph**

**Game Title:** *Stellar Triad*

Two rival star architects compete to align energy beacons on a 3×3 orbital matrix around a dying star. Each architect channels luminous "Nodes" of cosmic energy into the matrix to construct a stable triad of connected orbs along a row, a column, or a diagonal. The first to achieve a stable triad wins; if the matrix fills without one, the core collapses and neither triumphs. Each move is expressed as a token of the form `[Channel:X-Y]`, which places energy at the targeted coordinates.

This design is deterministic and unrelated to any negotiation, trading, or diplomacy example; its domain is cosmic engineering strategy.

---

## **2. Roles and Win Condition**

- **Players:** Two: *Architect A* and *Architect B*.
- **Objective:** Achieve the first "Stellar Alignment": three controlled Nodes in a contiguous row, column, or diagonal on the 3×3 matrix.
- **Win Condition:**
  - If a player forms an alignment, they immediately win.
- **Draw Condition:**
  - If all nine spaces are filled and no alignment exists, the result is a *Stellar Collapse (draw)*.
- **Loss Condition:**
  - The opponent achieves alignment first.

---

## **3. Turn Structure and Determinism**

- Play alternates between Architect A and Architect B.
- Each turn, the acting player chooses exactly one valid move token.
- The environment is deterministic: no dice rolls, no randomness.
- Game seeding (`seed`) is used only to set the starting player (see Section 6), so resets with the same seed are reproducible.
- Turn limit: **9 turns maximum** (one for each matrix cell).

---

## **4. Action Grammar (Machine-Parseable)**

### **Allowed Actions**

- **Token:** `[Channel:X-Y]`
- **Definition:** The player attempts to channel energy into the cell at coordinates `X-Y` (X = 1–3 for columns, Y = 1–3 for rows).
- **Pattern (regex):** `^\[Channel:(?:[1-3])-(?:[1-3])\]$`
- **Example Valid Action:** `[Channel:2-3]` → place Node at column 2, row 3.
- **Example Invalid Actions:**
  - `[Channel:4-1]` → Invalid X coordinate.
  - `[Channel:2_3]` → Wrong separator.
  - `[Deploy:2-3]` → Invalid token keyword.

### **Notes:**
- Only unoccupied cells can be targeted. Attempting to channel into an occupied cell is invalid.

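
The grammar above can be sketched as a small parser. This is a minimal illustration, not the environment's actual code: the names `CHANNEL_RE`, `parse_channel`, and `to_indices` are hypothetical, and the pattern swaps the documented non-capturing groups for capturing ones so the coordinates can be read back out. `fullmatch` is used so that a trailing newline does not slip past the `$` anchor. The index mapping assumes `matrix_state` is stored as a list of rows with row 1 first.

```python
import re
from typing import Optional, Tuple

# Assumption: same grammar as the documented regex, but with capturing groups
# so that X and Y can be extracted, not just validated.
CHANNEL_RE = re.compile(r"\[Channel:([1-3])-([1-3])\]")


def parse_channel(token: str) -> Optional[Tuple[int, int]]:
    """Return 1-based (column, row), or None if the token is malformed."""
    match = CHANNEL_RE.fullmatch(token)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def to_indices(column: int, row: int) -> Tuple[int, int]:
    """Map 1-based (column, row) to 0-based (row_index, col_index)."""
    return row - 1, column - 1
```
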
---

## **5. Game State Schema**

Example `game_state` at runtime:

```json
{
  "matrix_state": [
    ["A", "B", null],
    ["A", null, "B"],
    [null, "A", null]
  ],
  "player_symbols": {
    "ArchitectA": "A",
    "ArchitectB": "B"
  },
  "turn_count": 5,
  "active_player": "ArchitectB",
  "last_action": "[Channel:2-3]",
  "move_history": [
    "ArchitectA:[Channel:1-1]",
    "ArchitectB:[Channel:2-1]",
    "ArchitectA:[Channel:1-2]",
    "ArchitectB:[Channel:3-2]",
    "ArchitectA:[Channel:2-3]"
  ],
  "game_result": null,
  "winner": null,
  "draw": false,
  "seed": 1234
}
```

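
Because `matrix_state` and `move_history` describe the same position, one can be rebuilt from the other as a consistency check. The sketch below is a hypothetical helper (`replay` is not named in the design) and assumes history entries have the form `Player:[Channel:X-Y]`, with X as the column and Y as the row, and that the matrix is a list of rows.

```python
from typing import Dict, List, Optional

Matrix = List[List[Optional[str]]]


def replay(move_history: List[str], symbols: Dict[str, str]) -> Matrix:
    """Rebuild a 3x3 list-of-rows matrix from 'Player:[Channel:X-Y]' entries."""
    matrix: Matrix = [[None] * 3 for _ in range(3)]
    for entry in move_history:
        player, token = entry.split(":", 1)
        # Strip the "[Channel:" prefix and the closing "]" to leave "X-Y".
        x_text, y_text = token[len("[Channel:"):-1].split("-")
        row, col = int(y_text) - 1, int(x_text) - 1
        if matrix[row][col] is not None:
            raise ValueError(f"cell {x_text}-{y_text} channelled twice")
        matrix[row][col] = symbols[player]
    return matrix
```
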
---

## **6. Initialization Rules**

- `seed` determines starting player by parity: even seed → Architect A, odd seed → Architect B.
- `matrix_state` initialized as empty (all `null` values).
- First observation includes:
  - Empty orbital matrix.
  - Assigned symbols.
  - Clear reminder of valid action grammar.
- No randomness is involved; the only seed-dependent choice is the starting player.

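
The initialization rules can be sketched as follows. This is an illustrative reading, not the environment's implementation: `starting_player`, `active_player_on_turn`, and `initial_state` are hypothetical names, and the initial values for keys the rules do not mention (`last_action`, `move_history`, and so on) are assumptions based on the example state in Section 5.

```python
from typing import Any, Dict

PLAYERS = ("ArchitectA", "ArchitectB")


def starting_player(seed: int) -> str:
    """Even seed -> ArchitectA, odd seed -> ArchitectB."""
    return PLAYERS[seed % 2]


def active_player_on_turn(seed: int, turn_count: int) -> str:
    """Player to act once `turn_count` valid moves have been made."""
    return PLAYERS[(seed + turn_count) % 2]


def initial_state(seed: int) -> Dict[str, Any]:
    """Build an empty game_state; values for unlisted keys are assumptions."""
    return {
        "matrix_state": [[None] * 3 for _ in range(3)],
        "player_symbols": {"ArchitectA": "A", "ArchitectB": "B"},
        "turn_count": 0,
        "active_player": starting_player(seed),
        "last_action": None,
        "move_history": [],
        "game_result": None,
        "winner": None,
        "draw": False,
        "seed": seed,
    }
```
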
---

## **7. Validation and Error Handling**

Every action extracted from `\boxed{{}}` content passes through:

- **Format validation:** must match the regex in Section 4.
- **Range validation:** coordinates must be integers 1–3.
- **Occupancy validation:** target cell must be empty.
- **Turn enforcement:** only the active player may act.

**Invalid Action Reasons (to be passed to `set_invalid_move`):**
- `"Malformed token: does not match [Channel:X-Y] pattern"`
- `"Target cell occupied"`
- `"Coordinates out of range"`
- `"Not your turn"`
- `"Action missing or not boxed"`

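
One possible ordering of these checks is sketched below. Note that the strict regex from Section 4 already rejects out-of-range digits, so a token like `[Channel:4-1]` would be reported as malformed before any range check runs. To keep `"Coordinates out of range"` reachable, this sketch assumes a looser format pattern (any digits) followed by an explicit range check; that choice, the check order, and the name `invalid_reason` are all assumptions rather than part of the design.

```python
import re
from typing import List, Optional

# Assumption: looser than the spec regex, so range errors get their own reason.
LOOSE_CHANNEL_RE = re.compile(r"\[Channel:([0-9]+)-([0-9]+)\]")


def invalid_reason(content: Optional[str], player: str, active_player: str,
                   matrix: List[List[Optional[str]]]) -> Optional[str]:
    """Return one of the documented reason strings, or None if the move is legal."""
    if content is None or not content.strip():
        return "Action missing or not boxed"
    if player != active_player:
        return "Not your turn"
    match = LOOSE_CHANNEL_RE.fullmatch(content.strip())
    if match is None:
        return "Malformed token: does not match [Channel:X-Y] pattern"
    column, row = int(match.group(1)), int(match.group(2))
    if not (1 <= column <= 3 and 1 <= row <= 3):
        return "Coordinates out of range"
    if matrix[row - 1][column - 1] is not None:
        return "Target cell occupied"
    return None
```
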
---

## **8. Terminal Conditions and Scoring**

### **Checks after each valid move:**
1. **Alignment Detection:**
   - If the active player's symbol forms a contiguous line (row/column/diagonal), they win.
   - `game_result = "ArchitectA_won"` or `"ArchitectB_won"`.
   - `winner` updated accordingly.
2. **Full Matrix Check:**
   - If 9 moves are completed without alignment → `draw = true`, `game_result = "Stellar_Collapse"`.
3. **Score Assignment:**
   - Win → 1.0
   - Loss → 0.0
   - Draw → 0.5 each.

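
The three checks can be sketched in a few lines. The names `LINES`, `has_alignment`, and `evaluate`, and the `rewards` key in the returned dictionary, are hypothetical; the result strings and scores follow the list above. Alignment is tested before the full-matrix check so that a winning ninth move is scored as a win rather than a collapse.

```python
from typing import Any, Dict, List, Optional

Matrix = List[List[Optional[str]]]

# All eight lines as (row, col) index triples: 3 rows, 3 columns, 2 diagonals.
LINES = (
    [[(r, c) for c in range(3)] for r in range(3)]
    + [[(r, c) for r in range(3)] for c in range(3)]
    + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]
)


def has_alignment(matrix: Matrix, symbol: str) -> bool:
    """True if `symbol` fills any complete row, column, or diagonal."""
    return any(all(matrix[r][c] == symbol for r, c in line) for line in LINES)


def evaluate(matrix: Matrix, mover: str,
             symbols: Dict[str, str]) -> Optional[Dict[str, Any]]:
    """Return terminal fields after `mover`'s move, or None if play continues."""
    if has_alignment(matrix, symbols[mover]):
        rewards = {p: (1.0 if p == mover else 0.0) for p in symbols}
        return {"game_result": f"{mover}_won", "winner": mover,
                "draw": False, "rewards": rewards}
    if all(cell is not None for row in matrix for cell in row):
        return {"game_result": "Stellar_Collapse", "winner": None,
                "draw": True, "rewards": {p: 0.5 for p in symbols}}
    return None
```
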
---

## **9. Player Prompt Specification**

### **Structure of `_generate_player_prompt`:**

**Prompt Outline:**
1. Introduction / Identity:
   - “You are a cosmic architect competing to stabilize a stellar grid. Each cell represents an orbital conduit for energy. Your objective: align three of your Nodes in a row, column, or diagonal before your rival does.”
2. Rules Summary:
   - Turn-based, alternating.
   - Each cell can be used once.
   - Moves must use the exact format `[Channel:X-Y]` (where 1 ≤ X, Y ≤ 3).
3. End condition: First alignment wins; full grid without alignment = collapse (draw).
4. Response format:
   - “Put your final answer within \\boxed{{}} at the end of your response.”

### **Examples:**
```
Example valid response:
I will project energy into the lower middle conduit.
\boxed{{[Channel:2-3]}}

Example invalid response:
I channel energy south-east.
\boxed{{Channel:SE}}
```

### **Extraction Helper Reminder:**
- `_extract_answer_content(self, action: str) -> str` extracts the text between `\boxed{{` and `}}`.

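
A standalone sketch of the extraction step follows. It is not the actual `_extract_answer_content` method: the function name `extract_boxed` is hypothetical, and it assumes the doubled braces in this document may be template escapes, so it accepts both `\boxed{...}` and `\boxed{{...}}`. It takes the last boxed span in the response and returns `None` when no balanced box is found, which would map to the `"Action missing or not boxed"` reason.

```python
from typing import Optional

MARKER = "\\boxed{"


def extract_boxed(text: str) -> Optional[str]:
    """Return the content of the last boxed span in `text`, or None if absent."""
    start = text.rfind(MARKER)
    if start == -1:
        return None
    begin = start + len(MARKER)
    depth = 1  # we are inside the opening brace of the marker
    for index in range(begin, len(text)):
        char = text[index]
        if char == "{":
            depth += 1
        elif char == "}":
            depth -= 1
            if depth == 0:
                content = text[begin:index].strip()
                # Assumption: tolerate the doubled-brace form by dropping
                # one extra layer of surrounding braces.
                if content.startswith("{") and content.endswith("}"):
                    content = content[1:-1].strip()
                return content
    return None  # braces never balanced
```
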
---

## **10. API Mapping Plan**

| Method | Description | Inputs | Writes | Outputs |
|---------|--------------|--------|---------|----------|
| **`reset(seed)`** | Initializes a fresh Stellar Triad match. Clears `matrix_state`, stores `seed`, sets the starting player from the seed's parity (Section 6), and returns the first observation for the active player. | `seed` | `matrix_state`, `player_symbols`, `seed`, `turn_count`, `active_player`, `last_action`, `move_history`, `game_result`, `winner`, `draw` | Observation for first player. |
| **`step(player, action)`** | Parses boxed content using `_extract_answer_content`. Validates action format and legality, updates board, checks terminal state. | `player`, `action` | `matrix_state`, `turn_count`, `last_action`, `move_history`, `active_player`, `game_result`, `winner`, `draw` | Returns new observation, reward (if terminal), termination flags, or invalid move notice. |
| **`_generate_player_prompt(game_state, player)`** | Builds the textual prompt containing game state, visible matrix, valid action examples, and format rules. | `game_state`, `player` | N/A | Returns text prompt instructing player to act. |

Terminal signaling:
- When `winner` is set → `is_done = True`, appropriate reward assigned.
- When `draw == True` → `is_done = True`, rewards = 0.5 each.

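
To show how the pieces of `step` might fit together, here is a compact sketch that operates on a plain `game_state` dictionary. It is an assumption-laden illustration, not the planned implementation: it takes an already-extracted token rather than raw boxed text, returns a simplified `(is_done, invalid_reason)` pair instead of a full observation, stores rewards under a hypothetical `rewards` key, and uses the strict Section 4 grammar, so out-of-range coordinates surface as malformed tokens.

```python
import re
from typing import Any, Dict, Optional, Tuple

TOKEN_RE = re.compile(r"\[Channel:([1-3])-([1-3])\]")
OTHER = {"ArchitectA": "ArchitectB", "ArchitectB": "ArchitectA"}
# Rows, columns, and diagonals as (row, col) index triples.
LINES = (
    [[(r, c) for c in range(3)] for r in range(3)]
    + [[(r, c) for r in range(3)] for c in range(3)]
    + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]
)


def step(state: Dict[str, Any], player: str,
         token: str) -> Tuple[bool, Optional[str]]:
    """Apply one move in place; return (is_done, invalid_reason)."""
    if player != state["active_player"]:
        return False, "Not your turn"
    match = TOKEN_RE.fullmatch(token)
    if match is None:
        return False, "Malformed token: does not match [Channel:X-Y] pattern"
    col, row = int(match.group(1)) - 1, int(match.group(2)) - 1
    matrix = state["matrix_state"]
    if matrix[row][col] is not None:
        return False, "Target cell occupied"

    # The move is legal: record it.
    symbol = state["player_symbols"][player]
    matrix[row][col] = symbol
    state["turn_count"] += 1
    state["last_action"] = token
    state["move_history"].append(f"{player}:{token}")

    # Alignment is checked before the full-matrix draw.
    if any(all(matrix[r][c] == symbol for r, c in line) for line in LINES):
        state["winner"] = player
        state["game_result"] = f"{player}_won"
        state["rewards"] = {player: 1.0, OTHER[player]: 0.0}
        return True, None
    if state["turn_count"] == 9:
        state["draw"] = True
        state["game_result"] = "Stellar_Collapse"
        state["rewards"] = {player: 0.5, OTHER[player]: 0.5}
        return True, None
    state["active_player"] = OTHER[player]
    return False, None
```
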
---

## **11. Copy-Check Against Example**

- **Theme:** Cosmic engineering/energy alignment, *not* negotiation or trade.
- **Resources:** Matrix coordinates and “Nodes”, *no* currency or offers.
- **Objectives:** Alignment victory, *not* mutual acceptance or balance.
- **Game state keys:** (`matrix_state`, `player_symbols`, `seed`, etc.) are **unique and original** to this design.
- **Prompt content:** Describes cosmic architects and stellar energy, **not** any form of deal-making or bargaining.

---

**End of Game Design Document: “Stellar Triad”**

---