diff --git a/environment.md b/environment.md
new file mode 100644
index 0000000..9398abb
--- /dev/null
+++ b/environment.md
@@ -0,0 +1,231 @@
+# **TextArena Game Design Document: “Runic Grid”**
+
+*(Design stage only – no code yet)*
+This design invents an original game, **Runic Grid**, inspired structurally by turn-based spatial strategy but **completely distinct** in theme, background, and terminology from any tic-tac-toe or negotiation examples.
+
+---
+
+## 1. Concept Paragraph
+
+In **Runic Grid**, two rival mystics, the **Solar Scribe** and the **Lunar Scribe**, compete to inscribe magical runes onto a sacred 3×3 **Runic Tablet**. Each cell of the Tablet can bear only one rune, either a **Sun Rune (☼)** or a **Moon Rune (☽)**. The goal is to channel divine alignment by creating a continuous triad of your own rune horizontally, vertically, or diagonally before your opponent does. This design has no themes of trade, debate, or negotiation; it is a deterministic puzzle duel of pattern claiming. The only action is **[Inscribe:x,y]**, with which a player selects a tile to mark on their turn.
+
+---
+
+## 2. Roles and Win Condition
+
+- **Players**:
+  - Player 1: *Solar Scribe* (places **Sun Runes, ☼**)
+  - Player 2: *Lunar Scribe* (places **Moon Runes, ☽**)
+
+- **Win Condition**:
+  A player wins immediately upon forming three of their runes in a continuous line (horizontal, vertical, or diagonal) on the Runic Tablet.
+
+- **Loss Condition**:
+  The opponent achieves a triad alignment first.
+
+- **Draw Condition**:
+  All nine cells are occupied with no winning triad.
+
+---
+
+## 3. Turn Structure and Determinism
+
+- Turns alternate between Solar and Lunar Scribe, starting with the Solar Scribe.
+- One action per turn.
+- Deterministic: no randomness during play; the environment is fully reproducible from the same initial seed and sequence of moves.
+- Maximum turn count: **9** (since there are 9 cells).
+- Each step is deterministic; the random seed affects only the initial player ordering, and only if that ordering is randomized (optional).
+
+---
+
+## 4. Action Grammar (Machine-Parseable)
+
+**Allowed Action Format**:
+Each player submits an action inside `\boxed{{}}`.
+Inside the braces, the content must follow one of the following patterns:
+
+### 4.1 `[Inscribe:x,y]`
+
+Places the player's rune at coordinates `(x, y)` on the 3×3 Runic Tablet.
+
+- **Pattern**:
+  ```
+  ^\[Inscribe:(?:[0-2]),(?:[0-2])\]$
+  ```
+  where `x` is the row index and `y` is the column index, each an integer 0, 1, or 2.
+
+- **Interpretation**:
+  (0,0) = top-left, (2,2) = bottom-right.
+
+- **Example Valid Action**:
+  ```
+  [Inscribe:1,2]
+  ```
+  → Marks the middle-right cell.
+
+- **Example Invalid Action**:
+  ```
+  [draw:1,2]
+  ```
+  → Invalid because the verb “draw” does not match the required token “Inscribe”.
+
+Another invalid example:
+  ```
+  [Inscribe:3,0]
+  ```
+  → Invalid because coordinate 3 is outside the permitted range 0–2.
+
+**Only one grammar token exists**; the rest of the system enforces availability, legality, and turn order.
+
+---
+
+## 5. Game State Schema
+
+Full run-time `game_state` example:
+
+```json
+{
+  "turn_count": 4,
+  "current_player": "Solar Scribe",
+  "board": [
+    ["☼", "☽", null],
+    [null, "☼", null],
+    ["☽", null, null]
+  ],
+  "players": {
+    "Solar Scribe": {
+      "symbol": "☼",
+      "actions": ["[Inscribe:0,0]", "[Inscribe:1,1]"]
+    },
+    "Lunar Scribe": {
+      "symbol": "☽",
+      "actions": ["[Inscribe:0,1]", "[Inscribe:2,0]"]
+    }
+  },
+  "winner": null,
+  "outcome": "ongoing",
+  "observations": [
+    {"player": "Solar Scribe", "action": "[Inscribe:0,0]"},
+    {"player": "Lunar Scribe", "action": "[Inscribe:0,1]"},
+    {"player": "Solar Scribe", "action": "[Inscribe:1,1]"},
+    {"player": "Lunar Scribe", "action": "[Inscribe:2,0]"}
+  ]
+}
+```
+
+---
+
+## 6. Initialization Rules
+
+- The board initializes as a 3×3 matrix of `null`.
+- Turn count = 0.
+- `current_player` defaults to *Solar Scribe*, unless overridden by a deterministic seed-controlled random choice.
+- No randomness affects board contents.
+- `game_state["observations"]` starts empty.
+- At reset, the first observation returned describes player identities, symbol assignments, and available actions.
+
+---
+
+## 7. Validation and Error Handling
+
+On each step, before updating the state:
+
+1. **Extract content** from inside `\boxed{{}}` using `_extract_answer_content(action: str) → str`.
+   If missing or malformed, mark the move invalid with reason `"Malformed boxed syntax"`.
+
+2. **Check Regex Pattern**:
+   Must match `^\[Inscribe:[0-2],[0-2]\]$`.
+   If not, invalid with reason `"Action does not match grammar [Inscribe:x,y]"`.
+
+3. **Check Turn Ownership**:
+   If the action comes from the wrong player, invalid with reason `"Action out of turn"`.
+
+4. **Check Tile Availability**:
+   If the chosen cell is already marked, invalid with reason `"Tile already inscribed"`.
+
+Invalid actions trigger `set_invalid_move(player, reason)` and the turn passes without any change to the board.
+
+---
+
+## 8. Terminal Conditions and Scoring
+
+**Terminal Checks each turn:**
+
+1. **Triad Alignment**:
+   After a valid inscription, check all possible lines for three identical rune symbols belonging to the acting player.
+   - If true → `winner = that player`, `outcome = "win"`.
+2. **Draw**:
+   If the board is full (`9 moves`) and no winner → `outcome = "draw"`.
+3. **Otherwise** → `outcome = "ongoing"`.
+
+**Scoring Rules**:
+- Win: +1 point to the winner, 0 to the loser.
+- Draw: 0 points to both sides.
+
+No fractional or cumulative scoring beyond the single session.
+
+**Tie-Breaker**: None; draw is final.
+
+---
+
+## 9. Player Prompt Specification
+
+### Prompt Outline for `_generate_player_prompt`
+
+Identity:
+> You are a **mystic scribe** engraving the sacred **Runic Tablet** to align divine energies.
+> On your turn, you use the `[Inscribe:x,y]` action to claim an empty slot on the 3×3 grid.
+> Your goal: form a continuous line of your rune symbols before your opponent does.
+
+Rules Recap:
+- Solar Scribe marks **☼**, Lunar Scribe marks **☽**.
+- Valid format: `\boxed{{[Inscribe:x,y]}}` where `x` and `y` are 0–2.
+- You may not inscribe an already-occupied tile.
+- Example grid coordinate map:
+  ```
+  (0,0) (0,1) (0,2)
+  (1,0) (1,1) (1,2)
+  (2,0) (2,1) (2,2)
+  ```
+
+Response Format Requirement:
+- Provide any reasoning text first.
+- Conclude with your action token within `\boxed{{}}`.
+
+### Few-Shot Examples
+
+```
+Example valid response:
+I shall inscribe my rune on the center tile for strength.
+\boxed{{[Inscribe:1,1]}}
+
+Example invalid response:
+I think I will go middle-right.
+\boxed{{Move:1,2}}
+```
+
+Ensure every response ends with exactly one `\boxed{{}}` block containing the action token.
+
+---
+
+## 10. API Mapping Plan
+
+| Method | Purpose | Reads / Writes | Outcome Handling |
+|---------|----------|----------------|------------------|
+| **`reset(seed)`** | Initializes `game_state`, applies seed (e.g., random start player if supported), returns initial observation of empty board and player roles. | Sets `turn_count=0`, clears `observations`, resets `winner/outcome`. | Returns observation for both players with board layout and available format instructions. |
+| **`step(player_id, action)`** | Processes player’s action each turn, updates `board`, verifies legality, appends to `observations`, checks for terminal conditions. | Reads current board, writes updated cell, increments `turn_count`, flips `current_player`. | Returns updated observation, `done` flag if terminal, and optional reward dictionary `{+1,0}`. |
+| **`_generate_player_prompt(player_id)`** | Constructs human-readable prompt describing game status, coordinates, runes, and legal actions. | Reads entire `game_state` to present board, current turn, valid syntax. | Ensures prompt ends with instruction: *“Put your final answer within \boxed{{}} at the end of your response.”* |
+
+---
+
+## 11. Copy-Check Against Example
+
+- **Theme Check**: Distinct from the negotiation example: no trading, communication, or offers.
+- **Objective Check**: Pattern alignment on a Runic Tablet rather than any negotiation outcome.
+- **Terminology**: Original names (*Runic Tablet*, *Sun Rune*, *Moon Rune*, *Inscribe*) are entirely unique.
+- **Game state keys**: `"board"`, `"players"`, `"observations"`, `"turn_count"`, `"winner"`, `"outcome"` are original to this schema.
+- **Prompt and grammar**: Use a custom `[Inscribe:x,y]` format not seen in the example sources.
+
+---
+
+**End of Design Document for “Runic Grid”**
\ No newline at end of file