Add environment documentation from Openverse builder
environment.md
@@ -0,0 +1,197 @@
---

# **TURN-BASED GAME DESIGN DOCUMENT – “STELLAR TRIAD”**

*(Original design inspired by abstract alignment games, but distinct in theme, terminology, and flavor. This is **not** a replication of tic-tac-toe, though it satisfies the same deterministic two-player turn-based design requirements.)*

---

## **1. Concept Paragraph**

**Game Title:** *Stellar Triad*

Two rival star architects compete to align energy beacons on a 3×3 orbital matrix circling a dying star. Each architect channels luminous "Nodes" of cosmic energy into the matrix to build a stable triad: three of their own Nodes in a vertical, horizontal, or diagonal line. The first to achieve a stable triad wins; if the matrix fills without one, the core collapses and neither triumphs. The single core action is expressed as a token of the form `[Channel:X-Y]`, which places a Node at the targeted coordinates.

This design is deterministic and unrelated to any negotiation, trading, or diplomacy example; its domain is cosmic engineering strategy.

---

## **2. Roles and Win Condition**

- **Players:** Two: *Architect A* and *Architect B*.
- **Objective:** Achieve the first "Stellar Alignment": three of your own Nodes filling a complete row, column, or diagonal of the 3×3 matrix.
- **Win Condition:**
  - A player who forms an alignment wins immediately.
- **Draw Condition:**
  - If all nine cells are filled and no alignment exists, the result is a *Stellar Collapse* (draw).
- **Loss Condition:**
  - The opponent achieves an alignment first.

---

## **3. Turn Structure and Determinism**

- Play alternates between Architect A and Architect B.
- Each turn, the acting player chooses exactly one valid move token.
- The environment is deterministic: no dice rolls, no randomness.
- `seed` is used only to determine which player moves first, so resets with the same seed are reproducible.
- Turn limit: **9 turns maximum** (one for each matrix cell).

---

## **4. Action Grammar (Machine-Parseable)**

### **Allowed Actions**

- **Token:** `[Channel:X-Y]`
- **Definition:** Player attempts to channel energy into cell at coordinates `X-Y` (X = 1–3 for columns, Y = 1–3 for rows).
- **Pattern (regex):** `^\[Channel:(?:[1-3])-(?:[1-3])\]$`
- **Example Valid Action:** `[Channel:2-3]` → place Node at column 2, row 3.
- **Example Invalid Actions:**
  - `[Channel:4-1]` → Invalid X coordinate.
  - `[Channel:2_3]` → Wrong separator.
  - `[Deploy:2-3]` → Invalid token keyword.

### **Notes:**

- Only unoccupied cells can be targeted. Attempting to channel into an occupied cell is invalid.

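The grammar above can be checked mechanically. The following is a minimal Python sketch, not the environment's actual parser: the helper name `parse_channel` is hypothetical, and the pattern uses capturing groups (the documented pattern uses non-capturing ones) so the coordinates can be read back out. `fullmatch` is used in place of the `^...$` anchors so that a trailing newline is also rejected.

```python
import re
from typing import Optional, Tuple

# Same grammar as the documented pattern, but with capturing groups
# (an assumption made here so X and Y can be extracted).
CHANNEL_PATTERN = re.compile(r"\[Channel:([1-3])-([1-3])\]")


def parse_channel(token: str) -> Optional[Tuple[int, int]]:
    """Return (x, y) as 1-based column/row, or None if the token is invalid."""
    match = CHANNEL_PATTERN.fullmatch(token)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    for candidate in ("[Channel:2-3]", "[Channel:4-1]", "[Channel:2_3]", "[Deploy:2-3]"):
        print(candidate, "->", parse_channel(candidate))
```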
---

## **5. Game State Schema**

Example `game_state` at runtime. In this example the rows of `matrix_state` run top to bottom, so `[Channel:X-Y]` corresponds to `matrix_state[Y-1][X-1]`:

```json
{
  "matrix_state": [
    ["A", "B", null],
    ["A", null, "B"],
    [null, "A", null]
  ],
  "player_symbols": {
    "ArchitectA": "A",
    "ArchitectB": "B"
  },
  "turn_count": 5,
  "active_player": "ArchitectB",
  "last_action": "[Channel:2-3]",
  "move_history": [
    "ArchitectA:[Channel:1-1]",
    "ArchitectB:[Channel:2-1]",
    "ArchitectA:[Channel:1-2]",
    "ArchitectB:[Channel:3-2]",
    "ArchitectA:[Channel:2-3]"
  ],
  "game_result": null,
  "winner": null,
  "draw": false,
  "seed": 1234
}
```

---

## **6. Initialization Rules**

- `seed` determines the starting player by parity: even seed → Architect A, odd seed → Architect B.
- `matrix_state` is initialized as empty (all `null` values).
- The first observation includes:
  - The empty orbital matrix.
  - The assigned symbols.
  - A clear reminder of the valid action grammar.
- The seed's only effect is the starting-player selection; nothing else varies between resets.

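The initialization rules can be sketched as a small Python function. This is an illustration under assumptions: the function name `initial_state` is hypothetical, the keys mirror the example `game_state` from Section 5, and the starting values chosen for `turn_count`, `last_action`, and `move_history` are guesses at sensible defaults that the rules do not spell out.

```python
from typing import Any, Dict


def initial_state(seed: int) -> Dict[str, Any]:
    """Build a fresh game state; the seed only decides who moves first."""
    # Parity rule from the document: even seed -> Architect A, odd -> Architect B.
    starting_player = "ArchitectA" if seed % 2 == 0 else "ArchitectB"
    return {
        # Build each row separately so the three rows are independent lists.
        "matrix_state": [[None] * 3 for _ in range(3)],
        "player_symbols": {"ArchitectA": "A", "ArchitectB": "B"},
        "turn_count": 0,          # assumed default
        "active_player": starting_player,
        "last_action": None,      # assumed default
        "move_history": [],       # assumed default
        "game_result": None,
        "winner": None,
        "draw": False,
        "seed": seed,
    }


if __name__ == "__main__":
    print(initial_state(1234)["active_player"])
    print(initial_state(7)["active_player"])
```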
---

## **7. Validation and Error Handling**

Every action extracted from `\boxed{{}}` content passes through:

- **Format validation:** must match the `[Channel:X-Y]` regex.
- **Range validation:** coordinates must be integers 1–3.
- **Occupancy validation:** target cell must be empty.
- **Turn enforcement:** only the active player may act.

Because the Section 4 pattern already restricts both coordinates to 1–3, a token such as `[Channel:4-1]` fails format validation first; `"Coordinates out of range"` is reachable only if the format check uses a looser digit pattern.

**Invalid Action Reasons (to be passed to `set_invalid_move`):**

- `"Malformed token: does not match [Channel:X-Y] pattern"`
- `"Target cell occupied"`
- `"Coordinates out of range"`
- `"Not your turn"`
- `"Action missing or not boxed"`

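One way the validation stages might be chained is sketched below. Several things here are assumptions rather than documented behavior: the function name `invalid_reason`, the order of the checks, the looser digit pattern (used so that out-of-range coordinates get their own reason instead of being reported as malformed), and the `matrix_state[Y-1][X-1]` mapping inferred from the example state. The returned strings are the reasons listed above; how they reach `set_invalid_move` is left out.

```python
import re
from typing import Any, Dict, Optional

# Looser than the documented pattern (an assumption): any digits are accepted
# here so the range check below can report its own reason.
LOOSE_PATTERN = re.compile(r"\[Channel:([0-9]+)-([0-9]+)\]")


def invalid_reason(state: Dict[str, Any], player: str, content: Optional[str]) -> Optional[str]:
    """Return the first applicable invalid-move reason, or None if the move is legal."""
    if not content or not content.strip():
        return "Action missing or not boxed"
    if player != state["active_player"]:
        return "Not your turn"
    match = LOOSE_PATTERN.fullmatch(content.strip())
    if match is None:
        return "Malformed token: does not match [Channel:X-Y] pattern"
    x, y = int(match.group(1)), int(match.group(2))
    if not (1 <= x <= 3 and 1 <= y <= 3):
        return "Coordinates out of range"
    # Assumed mapping: X is the column, Y is the row, both 1-based.
    if state["matrix_state"][y - 1][x - 1] is not None:
        return "Target cell occupied"
    return None


if __name__ == "__main__":
    example = {
        "active_player": "ArchitectB",
        "matrix_state": [["A", "B", None], ["A", None, "B"], [None, "A", None]],
    }
    for token in ("[Channel:2-2]", "[Channel:1-1]", "[Channel:4-1]", "[Channel:2_3]"):
        print(token, "->", invalid_reason(example, "ArchitectB", token))
```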
---

## **8. Terminal Conditions and Scoring**

### **Checks after each valid move:**

1. **Alignment Detection:**
   - If the active player's symbol forms a contiguous line (row/column/diagonal), they win.
   - `game_result = "ArchitectA_won"` or `"ArchitectB_won"`.
   - `winner` updated accordingly.
2. **Full Matrix Check:**
   - If 9 moves completed without alignment → `draw = true`, `game_result = "Stellar_Collapse"`.
3. **Score Assignment:**
   - Win → 1.0
   - Loss → 0.0
   - Draw → 0.5 each.

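The three checks can be sketched in Python as follows. The names `has_alignment` and `resolve`, and the shape of the returned dictionary, are hypothetical; only the result strings and score values come from the list above. The sketch tests the eight possible lines (three rows, three columns, two diagonals) for the player who just moved, and falls back to the full-matrix check only when no alignment is found.

```python
from typing import Any, Dict, List, Optional

Matrix = List[List[Optional[str]]]

# All eight winning lines as lists of (row, col) index pairs.
LINES = (
    [[(r, c) for c in range(3)] for r in range(3)]      # rows
    + [[(r, c) for r in range(3)] for c in range(3)]    # columns
    + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]  # diagonals
)


def has_alignment(matrix: Matrix, symbol: str) -> bool:
    """True if `symbol` fills any complete row, column, or diagonal."""
    return any(all(matrix[r][c] == symbol for r, c in line) for line in LINES)


def resolve(matrix: Matrix, mover: str, symbols: Dict[str, str]) -> Dict[str, Any]:
    """Evaluate the position right after `mover` has placed a Node."""
    opponent = next(p for p in symbols if p != mover)
    if has_alignment(matrix, symbols[mover]):
        return {"game_result": f"{mover}_won", "winner": mover, "draw": False,
                "rewards": {mover: 1.0, opponent: 0.0}}
    if all(cell is not None for row in matrix for cell in row):
        return {"game_result": "Stellar_Collapse", "winner": None, "draw": True,
                "rewards": {mover: 0.5, opponent: 0.5}}
    # Game continues: no result and no rewards yet.
    return {"game_result": None, "winner": None, "draw": False, "rewards": None}


if __name__ == "__main__":
    symbols = {"ArchitectA": "A", "ArchitectB": "B"}
    column_win = [["A", "B", None], ["A", "B", None], ["A", None, None]]
    print(resolve(column_win, "ArchitectA", symbols)["game_result"])
```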
---

## **9. Player Prompt Specification**

### **Structure of `_generate_player_prompt`:**

**Prompt Outline:**

1. Introduction / Identity:
   - “You are a cosmic architect competing to stabilize a stellar grid. Each cell represents an orbital conduit for energy. Your objective: align three of your Nodes in a row before your rival.”
2. Rules Summary:
   - Turn-based, alternating.
   - Each cell can be used once.
   - Valid move format strictly uses `[Channel:X-Y]` (where 1 ≤ X,Y ≤ 3).
3. End condition: First alignment wins; full grid without alignment = collapse (draw).
4. Response format:
   - “Put your final answer within \\boxed{{}} at the end of your response.”

### **Examples:**

```
Example valid response:
I will project energy into the lower middle conduit.
\boxed{{[Channel:2-3]}}

Example invalid response:
I channel energy south-east.
\boxed{{Channel:SE}}
```

### **Extraction Helper Reminder:**

- `_extract_answer_content(self, action: str) -> str` extracts the text between `\boxed{{` and `}}`.

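A standalone sketch of such an extraction helper is shown below. It is written as a free function named `extract_answer_content` rather than the method itself, and its behavior is assumed, not taken from the real implementation: it accepts either single or doubled braces around the content, uses the last boxed span in the response (since the prompt asks for the answer at the end), and returns an empty string when nothing is boxed.

```python
import re

# Matches \boxed{...} or \boxed{{...}}; the tolerance for both brace styles
# is an assumption, not documented behavior.
BOXED_PATTERN = re.compile(r"\\boxed\{{1,2}(.*?)\}{1,2}", re.DOTALL)


def extract_answer_content(action: str) -> str:
    """Return the text inside the last \\boxed span, or '' if there is none."""
    matches = BOXED_PATTERN.findall(action)
    if not matches:
        return ""
    return matches[-1].strip()


if __name__ == "__main__":
    reply = "I will project energy into the lower middle conduit.\n" + r"\boxed{{[Channel:2-3]}}"
    print(extract_answer_content(reply))
```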
---

## **10. API Mapping Plan**

| Method | Description | Inputs | Writes | Outputs |
|---------|--------------|--------|---------|----------|
| **`reset(seed)`** | Initializes a fresh Stellar Triad match. Sets `matrix_state` empty, seeds RNG (for turn order if applicable), and returns first observation for the active player. | `seed` | `matrix_state`, `player_symbols`, `seed`, `turn_count`, `active_player` | Observation for first player. |
| **`step(player, action)`** | Parses boxed content using `_extract_answer_content`. Validates action format and legality, updates board, checks terminal state. | `player`, `action` | `matrix_state`, `move_history`, `active_player`, `game_result`, `winner`, `draw` | Returns new observation, reward (if terminal), termination flags, or invalid move notice. |
| **`_generate_player_prompt(game_state, player)`** | Builds the textual prompt containing game state, visible matrix, valid action examples, and format rules. | `game_state`, `player` | N/A | Returns text prompt instructing player to act. |

Terminal signaling:

- When `winner` is set → `is_done = True`, appropriate reward assigned.
- When `draw == True` → `is_done = True`, rewards = 0.5 each.

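To show how `reset` and `step` might fit together, here is a compact Python sketch. It is illustrative only: the class name `StellarTriadSketch`, the plain-string observation, and the dictionary returned by `step` are all assumptions, and the real base class, observation format, and `set_invalid_move` plumbing are not reproduced. The state keys follow the "Writes" column of the table.

```python
import re
from typing import Any, Dict

TOKEN = re.compile(r"\[Channel:([1-3])-([1-3])\]")
BOXED = re.compile(r"\\boxed\{{1,2}(.*?)\}{1,2}", re.DOTALL)
PLAYERS = ("ArchitectA", "ArchitectB")


class StellarTriadSketch:
    """Hypothetical stand-in for the environment; return shapes are assumed."""

    def reset(self, seed: int) -> str:
        self.state: Dict[str, Any] = {
            "matrix_state": [[None] * 3 for _ in range(3)],
            "player_symbols": {"ArchitectA": "A", "ArchitectB": "B"},
            "seed": seed,
            "turn_count": 0,
            "active_player": PLAYERS[seed % 2],  # even -> A, odd -> B
            "move_history": [],
            "game_result": None,
            "winner": None,
            "draw": False,
        }
        return f"{self.state['active_player']} to move on an empty matrix."

    def step(self, player: str, action: str) -> Dict[str, Any]:
        s = self.state
        if player != s["active_player"]:
            return {"invalid": "Not your turn", "done": False}
        boxed = BOXED.findall(action)
        if not boxed:
            return {"invalid": "Action missing or not boxed", "done": False}
        token = TOKEN.fullmatch(boxed[-1].strip())
        if token is None:
            return {"invalid": "Malformed token: does not match [Channel:X-Y] pattern",
                    "done": False}
        x, y = int(token.group(1)), int(token.group(2))
        if s["matrix_state"][y - 1][x - 1] is not None:
            return {"invalid": "Target cell occupied", "done": False}
        symbol = s["player_symbols"][player]
        s["matrix_state"][y - 1][x - 1] = symbol
        s["turn_count"] += 1
        s["move_history"].append(f"{player}:{token.group(0)}")
        rewards = None
        if self._aligned(symbol):
            s["winner"], s["game_result"] = player, f"{player}_won"
            rewards = {p: 1.0 if p == player else 0.0 for p in PLAYERS}
        elif s["turn_count"] == 9:
            s["draw"], s["game_result"] = True, "Stellar_Collapse"
            rewards = {p: 0.5 for p in PLAYERS}
        else:
            s["active_player"] = PLAYERS[1 - PLAYERS.index(player)]
        return {"invalid": None, "done": rewards is not None, "rewards": rewards}

    def _aligned(self, symbol: str) -> bool:
        m = self.state["matrix_state"]
        lines = [[m[r][c] for c in range(3)] for r in range(3)]
        lines += [[m[r][c] for r in range(3)] for c in range(3)]
        lines += [[m[i][i] for i in range(3)], [m[i][2 - i] for i in range(3)]]
        return any(all(cell == symbol for cell in line) for line in lines)


if __name__ == "__main__":
    env = StellarTriadSketch()
    print(env.reset(1234))
    print(env.step("ArchitectA", r"\boxed{{[Channel:1-1]}}"))
```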
---

## **11. Copy-Check Against Example**

- **Theme:** Cosmic engineering/energy alignment, *not* negotiation or trade.
- **Resources:** Matrix coordinates and “Nodes”; *no* currency or offers.
- **Objectives:** Alignment victory, *not* mutual acceptance or balance.
- **Game state keys:** (`matrix_state`, `player_symbols`, `seed`, etc.) are **unique and original** to this design.
- **Prompt content:** Describes cosmic architects and stellar energy, **not** any form of deal-making or bargaining.

---

**End of Game Design Document – “Stellar Triad”**

---