Add environment documentation from Openverse builder
environment.md
@@ -0,0 +1,193 @@
# **Game Design Document: “Orbital Align” (Deterministic Turn-Based Strategy Inspired by Tic-Tac-Toe)**

---

## 1. Concept Paragraph

**Setting & Theme:**
In *Orbital Align*, two rival star captains compete to align their fleets of orbital satellites across a 3×3 planetary grid suspended around a dying star. Unlike classic tic-tac-toe, this version reimagines the board as orbital nodes where each satellite placement represents a strategic claim of spatial control. The goal is to align three satellites in a row (horizontally, vertically, or diagonally) before the opponent does.

**Core action tokens:**
`[Deploy:x,y]` (place a satellite at the given coordinates) and `[Scan]` (forfeit placement to reveal the current grid state).

This design is *completely unrelated* to any previous negotiation or resource trading example. It uses a new setting, terminology, and objectives.

---

## 2. Roles and Win Condition

**Roles:**
- **Player A (Commander Solis)** and **Player B (Commander Nyx)** each command a distinct orbital fleet.
- Each player’s satellite is marked distinctly (`S` for Solis, `N` for Nyx).

**Win Condition:**
- A player wins if they align **three of their satellites** consecutively in any row, column, or diagonal.
- If all nine grid cells are filled without a winning alignment, the result is a **draw**.

**Loss Condition:**
- A player loses if the opponent achieves an alignment before them.
- A player also loses immediately if they perform an **invalid action** that cannot be corrected within the same turn.

---

## 3. Turn Structure and Determinism

- The game progresses in **alternating turns**, starting with Commander Solis (Player A).
- **Each turn**: The current player chooses one action (`Deploy` or `Scan`).
- Maximum **turn limit**: 9 (the grid has 9 total cells).
- The environment uses a reproducible **random seed**. The game itself has no stochastic actions, but seeding ensures deterministic ordering if future extensions add random elements.

---

## 4. Action Grammar (Machine-Parseable)

**Permitted Action Tokens**

| Action | Meaning | Formal Regex | Example Valid | Example Invalid | Reason Invalid |
|:--|:--|:--|:--|:--|:--|
| `[Deploy:x,y]` | Place a satellite at coordinates (x,y) where x,y ∈ {1,2,3} | `^\[Deploy:(?:[1-3]),(?:[1-3])\]$` | `[Deploy:2,3]` | `[Deploy:4,1]` | 4 outside valid range |
| `[Scan]` | View the current orbital grid instead of placing | `^\[Scan\]$` | `[Scan]` | `[ScanGrid]` | Incorrect token name |

**Rules:**
- Coordinates (x,y) correspond to the grid: (1,1) = top-left, (3,3) = bottom-right. Here x is the row and y is the column, so `[Deploy:1,3]` targets the top-right node.
- No double occupation is allowed: a `Deploy` on an occupied node is invalid.
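
The grammar table above can be exercised with a few lines of Python. This is a minimal sketch, not the environment's actual parser: the two patterns are copied from the table, while the function name `parse_action` and its return shape are assumptions made for illustration. Because the table's `Deploy` regex uses non-capturing groups, the sketch reads the two digits by position after the match succeeds.

```python
import re
from typing import Optional, Tuple

# Patterns copied verbatim from the "Formal Regex" column.
DEPLOY_RE = re.compile(r"^\[Deploy:(?:[1-3]),(?:[1-3])\]$")
SCAN_RE = re.compile(r"^\[Scan\]$")

# Hypothetical return type: (action name, coordinates or None).
ParsedAction = Tuple[str, Optional[Tuple[int, int]]]


def parse_action(token: str) -> Optional[ParsedAction]:
    """Return ("Deploy", (x, y)), ("Scan", None), or None for an invalid token."""
    if SCAN_RE.match(token):
        return ("Scan", None)
    if DEPLOY_RE.match(token):
        # "[Deploy:" is 8 characters long, so x sits at index 8 and y at index 10.
        return ("Deploy", (int(token[8]), int(token[10])))
    return None


if __name__ == "__main__":
    for example in ("[Deploy:2,3]", "[Deploy:4,1]", "[Scan]", "[ScanGrid]"):
        print(example, "->", parse_action(example))
```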

---

## 5. Game State Schema

Example serialized game state (after four moves, with Commander Solis to act):

```json
{
  "turn_count": 4,
  "current_player": "Commander Solis",
  "board": [
    ["S", "N", " "],
    [" ", "S", " "],
    ["N", " ", " "]
  ],
  "players": {
    "Commander Solis": {
      "symbol": "S",
      "actions_taken": ["[Deploy:1,1]", "[Deploy:2,2]"]
    },
    "Commander Nyx": {
      "symbol": "N",
      "actions_taken": ["[Deploy:1,2]", "[Deploy:3,1]"]
    }
  },
  "winner": null,
  "is_terminal": false,
  "last_action": "[Deploy:3,1]",
  "observation_log": [
    "Commander Solis deployed to 1,1",
    "Commander Nyx deployed to 1,2",
    "Commander Solis deployed to 2,2",
    "Commander Nyx deployed to 3,1"
  ],
  "seed": 42
}
```

---

## 6. Initialization Rules

- **Board**: Empty 3×3 grid represented as a list of lists containing `" "`.
- **Starting player**: Commander Solis always starts.
- **Seeding**: Random seed (e.g., `seed=42`) stored in `game_state` for deterministic replay.
- **Onboarding observations**:
  Upon `reset`, each player receives:
  - The empty grid state.
  - Instructions on how to deploy satellites and when the game concludes.
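
The initialization rules above could translate into a state builder along these lines. This is a sketch under stated assumptions: the key names come from the schema in Section 5, but the function name `make_initial_state`, the starting `turn_count` of 0, and the `None` value for `last_action` are illustrative choices that the document does not specify. Each board row is built separately so that writing to one row never aliases another.

```python
from typing import Any, Dict

# Player names and symbols as given in Section 2.
PLAYERS = {"Commander Solis": "S", "Commander Nyx": "N"}


def make_initial_state(seed: int = 42) -> Dict[str, Any]:
    """Build an empty game state (hypothetical helper, keys from Section 5)."""
    return {
        "turn_count": 0,  # assumption: counts completed moves
        "current_player": "Commander Solis",  # Solis always starts
        # A fresh list per row avoids three references to the same row.
        "board": [[" "] * 3 for _ in range(3)],
        "players": {
            name: {"symbol": symbol, "actions_taken": []}
            for name, symbol in PLAYERS.items()
        },
        "winner": None,
        "is_terminal": False,
        "last_action": None,  # assumption: no action before the first move
        "observation_log": [],
        "seed": seed,  # stored for deterministic replay
    }


if __name__ == "__main__":
    print(make_initial_state(seed=42)["board"])
```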

---

## 7. Validation and Error Handling

**Validation checks in order:**
1. Verify that the extracted content matches one of the valid action patterns.
2. For `[Deploy:x,y]`, ensure:
   - x, y within range 1–3.
   - Target cell is empty.
3. For `[Scan]`, ensure no other content is appended.
4. If the regex or move legality fails, call `set_invalid_move(player, reason)` with one of:
   - `"Malformed action syntax"`
   - `"Coordinates out of range"`
   - `"Target cell occupied"`
   - `"Unrecognized action token"`

Action extraction must strip the wrapping `\boxed{{...}}`, leaving only the internal content for validation.
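
The ordered checks above might be sketched as follows. The four reason strings are taken from the list; everything else is an assumption for illustration, including the helper names `extract_action` and `invalid_reason`, the rule for deciding which reason applies to which failure, and the choice to accept both `\boxed{...}` and the doubled-brace spelling `\boxed{{...}}`. A looser `Deploy` pattern is used here (instead of the strict grammar regex) only so that out-of-range coordinates can be reported separately from malformed syntax.

```python
import re
from typing import List, Optional

# Assumption: tolerate one or two braces around the boxed content.
BOXED_RE = re.compile(r"\\boxed\{\{?(.*?)\}?\}")
# Hypothetical helper patterns, looser than the grammar table on purpose.
TOKEN_NAME_RE = re.compile(r"^\[([A-Za-z]+)(?::.*)?\]$")
LOOSE_DEPLOY_RE = re.compile(r"^\[Deploy:(-?\d+),(-?\d+)\]$")


def extract_action(response: str) -> Optional[str]:
    """Return the content of the last \\boxed wrapper, or None if absent."""
    matches = BOXED_RE.findall(response)
    return matches[-1].strip() if matches else None


def invalid_reason(action: Optional[str], board: List[List[str]]) -> Optional[str]:
    """Return one of the four reason strings, or None when the action is legal."""
    if action is None:
        return "Malformed action syntax"
    if action == "[Scan]":
        return None
    name = TOKEN_NAME_RE.match(action)
    if name is None:
        return "Malformed action syntax"
    if name.group(1) not in ("Deploy", "Scan"):
        return "Unrecognized action token"
    deploy = LOOSE_DEPLOY_RE.match(action)
    if deploy is None:
        return "Malformed action syntax"  # e.g. "[Scan:1]" or "[Deploy:a,b]"
    x, y = int(deploy.group(1)), int(deploy.group(2))
    if not (1 <= x <= 3 and 1 <= y <= 3):
        return "Coordinates out of range"
    if board[x - 1][y - 1] != " ":  # x is the row, y is the column
        return "Target cell occupied"
    return None


if __name__ == "__main__":
    grid = [["S", " ", " "], [" ", " ", " "], [" ", " ", " "]]
    reply = "I will secure the top-right orbit next.\n\\boxed{{[Deploy:1,3]}}"
    print(extract_action(reply), invalid_reason(extract_action(reply), grid))
```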

---

## 8. Terminal Conditions and Scoring

**After each move**, the system checks:

1. **Win Check:**
   - Rows, columns, and diagonals are scanned for `['S', 'S', 'S']` or `['N', 'N', 'N']`.
   - The corresponding player is marked `winner`.
2. **Draw Check:**
   - If `turn_count == 9` and no winner ⇒ `"DRAW"`.
3. **Score Rules:**
   - Winner = 1, Loser = 0.
   - Draw = 0.5 each.

Tie-breakers are deterministic, with no randomness or hidden state.
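
The win, draw, and score rules above can be sketched in a few lines. The function names `find_winner_symbol` and `score` are hypothetical, and returning scores keyed by symbol (rather than by commander name) is an assumption made to keep the example short. The win check runs before the draw check, so a ninth move that completes a line counts as a win.

```python
from typing import Dict, List, Optional


def find_winner_symbol(board: List[List[str]]) -> Optional[str]:
    """Return "S" or "N" if any row, column, or diagonal is uniform, else None."""
    lines = [list(row) for row in board]  # three rows
    lines += [[board[r][c] for r in range(3)] for c in range(3)]  # three columns
    lines.append([board[i][i] for i in range(3)])  # main diagonal
    lines.append([board[i][2 - i] for i in range(3)])  # anti-diagonal
    for line in lines:
        if line[0] != " " and line[0] == line[1] == line[2]:
            return line[0]
    return None


def score(board: List[List[str]], turn_count: int) -> Optional[Dict[str, float]]:
    """Return per-symbol scores once the game is terminal, otherwise None."""
    winner = find_winner_symbol(board)
    if winner is not None:
        return {"S": 1.0 if winner == "S" else 0.0,
                "N": 1.0 if winner == "N" else 0.0}
    if turn_count == 9:
        return {"S": 0.5, "N": 0.5}  # draw
    return None


if __name__ == "__main__":
    diagonal_win = [["S", "N", " "], [" ", "S", " "], ["N", " ", "S"]]
    print(score(diagonal_win, 5))
```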

---

## 9. Player Prompt Specification

**Prompt Outline:**

> **IDENTITY BLURB:**
> You are a star commander controlling a fleet of satellites orbiting a dying star. Your mission is to align three of your satellites in a row across the 3×3 orbital grid before your rival does.
>
> **CURRENT STATE:**
> - The board shows your satellites (`S` if you are Commander Solis, `N` if you are Commander Nyx) and your opponent’s.
> - Empty cells are blank spaces.
>
> **AVAILABLE ACTIONS:**
> - `[Deploy:x,y]` → Place your satellite at coordinates (x,y) where x,y ∈ {1,2,3}.
> - `[Scan]` → Forfeit placement this turn to inspect the full orbital map.
>
> **FORMAT RULES:**
> - Each response must end with: `\boxed{{<action>}}`
> - Example of valid response:
> ```
> I will secure the top-right orbit next.
> \boxed{{[Deploy:1,3]}}
> ```
> - Example of invalid response:
> ```
> Let’s attack next time.
> [Deploy:1,3]
> ```
> (Because it's missing `\boxed{{}}`.)
>
> **REMINDERS:**
> - You cannot deploy on an occupied orbit.
> - The game will end immediately if three satellites align or all nine orbits are filled.

All dialogue and moves are appended to the shared `observation_log`.
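
A prompt builder following the outline above might look like the sketch below. The function names `render_board` and `generate_player_prompt`, the board layout, and the abbreviated wording are all assumptions. One further assumption is worth flagging: the doubled braces in the outline look like Python f-string escaping, and if that reading is right, the text a player actually sees contains single braces, as this sketch demonstrates.

```python
from typing import List


def render_board(board: List[List[str]]) -> str:
    """Render the 3x3 grid as text; empty cells stay blank (hypothetical layout)."""
    rows = [" | ".join(row) for row in board]
    return "\n---------\n".join(rows)


def generate_player_prompt(player: str, symbol: str,
                           board: List[List[str]],
                           observation_log: List[str]) -> str:
    """Build a shortened prompt; reads its inputs and modifies nothing."""
    history = "\n".join(observation_log) if observation_log else "(no moves yet)"
    # Inside an f-string, "{{" and "}}" render as single braces.
    return (
        f"You are {player}, a star commander. Your satellites are marked {symbol}.\n"
        f"CURRENT STATE:\n{render_board(board)}\n"
        f"HISTORY:\n{history}\n"
        "AVAILABLE ACTIONS: [Deploy:x,y] with x,y in 1..3, or [Scan].\n"
        f"Each response must end with: \\boxed{{<action>}}\n"
    )


if __name__ == "__main__":
    grid = [["S", "N", " "], [" ", "S", " "], ["N", " ", " "]]
    print(generate_player_prompt("Commander Solis", "S", grid, []))
```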

---

## 10. API Mapping Plan

| API Method | Purpose | Primary Read/Write | Terminal logic |
|-------------|----------|-------------------|----------------|
| `reset(seed)` | Initializes the grid, assigns symbols, clears logs, and sets starting player. | Writes entire `game_state`. | Returns initial observation and seed confirmation. |
| `step(action)` | Validates player’s boxed action, updates the grid/state, switches turns. | Reads `current_player`, `board`; writes updates, logs. | Runs win/draw checks after every move; sets `is_terminal`, `winner`. |
| `_generate_player_prompt(player)` | Builds textual prompt shown above, embedding the latest board and prior logs. | Reads from `board`, `observation_log`, and `current_player`. | Does not modify state; only generates text. |

On invalid actions, `step` calls `set_invalid_move(player, reason)` and either forces a retry or, if the action cannot be corrected within the same turn, ends the game with a loss for that player.
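
To show how `reset`, `step`, and `set_invalid_move` could fit together, here is a deliberately small sketch. The class name `OrbitalAlignSketch` is hypothetical, `step` takes a bare token that has already been unboxed, and the `Deploy` regex is a capturing variant of the one in Section 4. Two policy points are assumptions, since the document leaves them open: `[Scan]` is counted toward the nine-turn limit (so a game with scans can reach the draw check with empty nodes), and an invalid move is only logged, leaving the same player to retry.

```python
import re
from typing import Any, Dict

ORDER = ["Commander Solis", "Commander Nyx"]
SYMBOLS = {"Commander Solis": "S", "Commander Nyx": "N"}
DEPLOY_RE = re.compile(r"^\[Deploy:([1-3]),([1-3])\]$")  # capturing variant


class OrbitalAlignSketch:
    """Hypothetical minimal environment, not the project's actual class."""

    def reset(self, seed: int = 42) -> Dict[str, Any]:
        self.state: Dict[str, Any] = {
            "turn_count": 0, "current_player": ORDER[0],
            "board": [[" "] * 3 for _ in range(3)],
            "winner": None, "is_terminal": False,
            "last_action": None, "observation_log": [], "seed": seed,
        }
        return self.state

    def set_invalid_move(self, player: str, reason: str) -> None:
        # Assumption: only log; the retry or forfeit policy is left to the caller.
        self.state["observation_log"].append(f"{player} invalid move: {reason}")

    def step(self, action: str) -> bool:
        """Apply one bare action token; return True once the game is terminal."""
        s = self.state
        player = s["current_player"]
        m = DEPLOY_RE.match(action)
        if action == "[Scan]":
            s["observation_log"].append(f"{player} scanned the grid")
        elif m and s["board"][int(m[1]) - 1][int(m[2]) - 1] == " ":
            x, y = int(m[1]), int(m[2])
            s["board"][x - 1][y - 1] = SYMBOLS[player]  # x is row, y is column
            s["observation_log"].append(f"{player} deployed to {x},{y}")
        else:
            reason = "Target cell occupied" if m else "Malformed action syntax"
            self.set_invalid_move(player, reason)
            return s["is_terminal"]  # turn does not advance
        s["last_action"] = action
        s["turn_count"] += 1
        self._check_terminal(player)
        if not s["is_terminal"]:
            s["current_player"] = ORDER[1 - ORDER.index(player)]
        return s["is_terminal"]

    def _check_terminal(self, player: str) -> None:
        b, sym = self.state["board"], SYMBOLS[player]
        lines = b + [list(col) for col in zip(*b)]
        lines += [[b[i][i] for i in range(3)], [b[i][2 - i] for i in range(3)]]
        if any(all(cell == sym for cell in line) for line in lines):
            self.state["winner"], self.state["is_terminal"] = player, True
        elif self.state["turn_count"] == 9:
            self.state["winner"], self.state["is_terminal"] = "DRAW", True


if __name__ == "__main__":
    env = OrbitalAlignSketch()
    env.reset(seed=42)
    for token in ("[Deploy:1,1]", "[Deploy:1,2]", "[Deploy:2,2]",
                  "[Deploy:3,1]", "[Deploy:3,3]"):
        env.step(token)
    print(env.state["winner"], env.state["turn_count"])
```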

---

## 11. Copy-Check Against Example

All entity names (**Commander Solis**, **Commander Nyx**, **satellites**, **orbital grid**) and thematic terms are **original** and unrelated to any example negotiation or deal-making scenario. The game’s objective (aligning satellites on a 3×3 grid) derives from *tic-tac-toe mechanics* but is expressed in a wholly new narrative context.
All `game_state` keys (`board`, `winner`, `observation_log`, `symbol`, etc.) are unique to *Orbital Align*, and none are borrowed from any trading, diplomacy, or economic system.