From a3fe4321df72cc2f66ce73fb8e9b28cc60eef66a Mon Sep 17 00:00:00 2001
From: Openverse Builder
Date: Mon, 1 Jan 2001 00:00:00 +0000
Subject: [PATCH] Add environment documentation from Openverse builder

---
 environment.md | 193 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 193 insertions(+)
 create mode 100644 environment.md

diff --git a/environment.md b/environment.md
new file mode 100644
index 0000000..9e686e2
--- /dev/null
+++ b/environment.md
@@ -0,0 +1,193 @@
+# **Game Design Document: “Orbital Align” (Deterministic Turn-Based Strategy Inspired by Tic-Tac-Toe)**
+
+---
+
+## 1. Concept Paragraph
+
+**Setting & Theme:**
+In *Orbital Align*, two rival star captains compete to align their fleets of orbital satellites across a 3×3 planetary grid suspended around a dying star. Unlike classic tic-tac-toe, this version reimagines the board as orbital nodes where each satellite placement represents a strategic claim of spatial control. The goal is to align three satellites in a row (horizontally, vertically, or diagonally) before the opponent does.
+**Core action tokens:**
+`[Deploy:x,y]` (place a satellite at coordinates x,y) and `[Scan]` (forfeit placement to reveal the current grid state).
+This design is *completely unrelated* to any previous negotiation or resource trading example. It uses a new setting, terminology, and objectives.
+
+---
+
+## 2. Roles and Win Condition
+
+**Roles:**
+- **Player A (Commander Solis)** and **Player B (Commander Nyx)** each command a distinct orbital fleet.
+- Each player’s satellites carry a distinct symbol (`S` for Solis, `N` for Nyx).
+
+**Win Condition:**
+- A player wins if they align **three of their satellites** consecutively in any row, column, or diagonal.
+- If all nine grid cells are filled without a winning alignment, the result is a **draw**.
+
+**Loss Condition:**
+- A player loses if the opponent achieves an alignment before them.
+- A player also loses immediately if they perform an **invalid action** that cannot be corrected within the same turn.
+
+---
+
+## 3. Turn Structure and Determinism
+
+- The game progresses in **alternating turns**, starting with Commander Solis (Player A).
+- **Each turn**: The current player chooses one action (`Deploy` or `Scan`).
+- Maximum **turn limit**: 9 (the grid has 9 total cells).
+- The environment stores a reproducible **random seed**. The game itself has no stochastic actions; the seed only ensures deterministic ordering if future extensions add random elements.
+
+---
+
+## 4. Action Grammar (Machine-Parseable)
+
+**Permitted Action Tokens**
+
+| Action | Meaning | Formal Regex | Example Valid | Example Invalid | Reason Invalid |
+|:--|:--|:--|:--|:--|:--|
+| `[Deploy:x,y]` | Place a satellite at coordinates (x,y) where x,y ∈ {1,2,3} | `^\[Deploy:(?:[1-3]),(?:[1-3])\]$` | `[Deploy:2,3]` | `[Deploy:4,1]` | 4 outside valid range |
+| `[Scan]` | View the current orbital grid instead of placing | `^\[Scan\]$` | `[Scan]` | `[ScanGrid]` | Incorrect token name |
+
+**Rules:**
+- Coordinates (x,y) map to the grid with x as the row and y as the column: (1,1) = top-left, (1,3) = top-right, (3,3) = bottom-right.
+- No double occupation is allowed: a `Deploy` on an occupied node is invalid.
+
+---
+
+## 5. Game State Schema
+
+Example serialized game state:
+
+```json
+{
+  "turn_count": 3,
+  "current_player": "Commander Nyx",
+  "board": [
+    ["S", "N", " "],
+    [" ", "S", " "],
+    [" ", " ", " "]
+  ],
+  "players": {
+    "Commander Solis": {
+      "symbol": "S",
+      "actions_taken": ["[Deploy:1,1]", "[Deploy:2,2]"]
+    },
+    "Commander Nyx": {
+      "symbol": "N",
+      "actions_taken": ["[Deploy:1,2]"]
+    }
+  },
+  "winner": null,
+  "is_terminal": false,
+  "last_action": "[Deploy:2,2]",
+  "observation_log": [
+    "Commander Solis deployed to 1,1",
+    "Commander Nyx deployed to 1,2",
+    "Commander Solis deployed to 2,2"
+  ],
+  "seed": 42
+}
+```
+
+---
+
+## 6. Initialization Rules
+
+- **Board**: Empty 3×3 grid represented as a list of lists containing `" "`.
+- **Starting player**: Commander Solis always starts.
+- **Seeding**: Random seed (e.g., `seed=42`) stored in `game_state` for deterministic replay.
+- **Onboarding observations**:
+  Upon `reset`, each player receives:
+  - The empty grid state.
+  - Instructions on how to deploy satellites and when the game concludes.
+
+---
+
+## 7. Validation and Error Handling
+
+**Validation checks in order:**
+1. Verify that the extracted content matches one of the valid action patterns.
+2. For `[Deploy:x,y]`, ensure:
+   - x and y are within the range 1–3.
+   - The target cell is empty.
+3. For `[Scan]`, ensure no other content is appended.
+4. If the regex match or the move legality check fails, call
+   `set_invalid_move(player, reason)`
+   with one of:
+   - `"Malformed action syntax"`
+   - `"Coordinates out of range"`
+   - `"Target cell occupied"`
+   - `"Unrecognized action token"`
+
+Action extraction must strip the wrapping `\boxed{{...}}`, leaving only the internal content for validation.
+
+---
+
+## 8. Terminal Conditions and Scoring
+
+**After each move**, the system checks:
+
+1. **Win Check:**
+   - Rows, columns, and diagonals are scanned for `['S', 'S', 'S']` or `['N', 'N', 'N']`.
+   - The corresponding player is marked `winner`.
+2. **Draw Check:**
+   - If `turn_count == 9` and no winner ⇒ `"DRAW"`.
+3. **Score Rules:**
+   - Winner = 1, Loser = 0.
+   - Draw = 0.5 each.
+
+Tie-breaking is deterministic: there is no randomness or hidden state.
+
+---
+
+## 9. Player Prompt Specification
+
+**Prompt Outline:**
+
+> **IDENTITY BLURB:**
+> You are a star commander controlling a fleet of satellites orbiting a dying star. Your mission is to align three of your satellites in a row across the 3×3 orbital grid before your rival does.
+>
+> **CURRENT STATE:**
+> - The board shows your placements and your opponent’s, each marked with its commander’s symbol (S for Solis, N for Nyx).
+> - Empty cells are blank spaces.
+>
+> **AVAILABLE ACTIONS:**
+> - `[Deploy:x,y]` → Place your satellite at coordinates (x,y) where x,y ∈ {1,2,3}.
+> - `[Scan]` → Forfeit placement this turn to inspect the full orbital map.
+>
+> **FORMAT RULES:**
+> - Each response must end with: `\boxed{{}}`
+> - Example of valid response:
+>   ```
+>   I will secure the top-right orbit next.
+>   \boxed{{[Deploy:1,3]}}
+>   ```
+> - Example of invalid response:
+>   ```
+>   Let’s attack next time.
+>   [Deploy:1,3]
+>   ```
+>   (Because it's missing `\boxed{{}}`.)
+>
+> **REMINDERS:**
+> - You cannot deploy on an occupied orbit.
+> - The game will end immediately if three satellites align or all nine orbits are filled.
+
+All dialogue and moves are appended to the shared `observation_log`.
+
+---
+
+## 10. API Mapping Plan
+
+| API Method | Purpose | Primary Read/Write | Terminal logic |
+|-------------|----------|-------------------|----------------|
+| `reset(seed)` | Initializes the grid, assigns symbols, clears logs, and sets starting player. | Writes entire `game_state`. | Returns initial observation and seed confirmation. |
+| `step(action)` | Validates player’s boxed action, updates the grid/state, switches turns. | Reads `current_player`, `board`; writes updates, logs. | Runs win/draw checks after every move; sets `is_terminal`, `winner`. |
+| `_generate_player_prompt(player)` | Builds textual prompt shown above, embedding the latest board and prior logs. | Reads from `board`, `observation_log`, and `current_player`. | Does not modify state; only generates text. |
+
+On invalid actions, `step` calls `set_invalid_move(player, reason)`. The player may retry within the same turn; if the action cannot be corrected, the game ends and that player loses.
+
+---
+
+## 11. Copy-Check Against Example
+
+All entity names (**Commander Solis**, **Commander Nyx**, **satellites**, **orbital grid**) and thematic terms are **original** and unrelated to any example negotiation or deal-making scenario. The game’s objective (aligning satellites on a 3×3 grid) derives from *tic-tac-toe mechanics* but is expressed in a wholly new narrative context.
+All `game_state` keys (`board`, `winner`, `observation_log`, `symbol`, etc.) are unique to *Orbital Align*, and none are borrowed from any trading, diplomacy, or economic system.
\ No newline at end of file
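The validation order in section 7 and the win check in section 8 could be sketched in Python roughly as follows. This is a minimal illustration, not part of the patch and not the builder's implementation. The function names (`extract_boxed`, `validate`, `find_winner`), the mapping of each failure case to a reason string, and the tolerance for both `\boxed{...}` and `\boxed{{...}}` are all assumptions; the document only fixes the two action regexes, the four reason strings, and the x = row, y = column convention.

```python
import re

# All names below are hypothetical; the design document does not define them.
BOXED_RE = re.compile(r"\\boxed\{+(.*?)\}+\s*$", re.DOTALL)  # single or double braces
DEPLOY_RE = re.compile(r"^\[Deploy:([1-3]),([1-3])\]$")      # regex from section 4
LOOSE_DEPLOY_RE = re.compile(r"^\[Deploy:\d+,\d+\]$")        # well-formed, any integers
SCAN_RE = re.compile(r"^\[Scan\]$")                          # regex from section 4
EMPTY = " "


def extract_boxed(response):
    """Return the content of a trailing \\boxed wrapper, or None if absent."""
    match = BOXED_RE.search(response.strip())
    return match.group(1).strip() if match else None


def validate(board, response):
    """Return (action, None) for a legal move or (None, reason) otherwise."""
    content = extract_boxed(response)
    if content is None:
        return None, "Malformed action syntax"
    if SCAN_RE.match(content):
        return ("scan",), None
    match = DEPLOY_RE.match(content)
    if match:
        # x is the row and y is the column, both 1-based in the token.
        row, col = int(match.group(1)) - 1, int(match.group(2)) - 1
        if board[row][col] != EMPTY:
            return None, "Target cell occupied"
        return ("deploy", row, col), None
    if LOOSE_DEPLOY_RE.match(content):
        return None, "Coordinates out of range"
    if content.startswith("[Deploy"):
        return None, "Malformed action syntax"
    return None, "Unrecognized action token"


def find_winner(board):
    """Return 'S' or 'N' if a row, column, or diagonal is aligned, else None."""
    lines = [list(row) for row in board]
    lines += [list(col) for col in zip(*board)]
    lines.append([board[i][i] for i in range(3)])
    lines.append([board[i][2 - i] for i in range(3)])
    for line in lines:
        if line[0] != EMPTY and line.count(line[0]) == 3:
            return line[0]
    return None
```

A `step` implementation along these lines would call `validate` first, pass the returned reason to `set_invalid_move(player, reason)` on failure, and run `find_winner` only after a successful `Deploy`.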