---

# **TIC-TAC-TRAIL: A Turn-Based Strategy Design Document**

---

## **1. Concept Paragraph**

**Concept Overview:**
*Tic-Tac-Trail* is a deterministic, turn-based tactical puzzle inspired by grid conquest; it is completely unrelated to negotiation or trade mechanics. Two expeditions, **Team Sun** and **Team Moon**, compete to claim paths on an ancient 3×3 stone map. Each tile can be marked with a team's emblem (`Sun` or `Moon`). The first expedition to align three of its emblems in a continuous line (horizontal, vertical, or diagonal) awakens the temple’s power and wins.

Core player commands are expressed as `[Mark:<row>,<col>]`, describing which grid position to claim, or `[Pass]` if no legal move remains. The environment tracks placement, board state, turn order, and victory conditions deterministically.

---

## **2. Roles and Win Condition**

- **Players:**
  - Player 1: *Team Sun* (symbol “S”)
  - Player 2: *Team Moon* (symbol “M”)
- **Objective:** Align three of one’s symbols (`S` or `M`) in a straight line (row, column, or diagonal) before the board fills.
- **Decision Rules:**
  - **Win:** First player to form an unbroken trio of their own emblem.
  - **Loss:** Opponent achieves a trio first.
  - **Draw:** All nine tiles filled without a winning alignment.
- Once a win or draw occurs, the game becomes terminal and no further moves are accepted.

---

## **3. Turn Structure and Determinism**

- The game alternates turns strictly: *Sun → Moon → Sun → Moon*, and so on.
- Turn count begins at 1 and increments after each valid action.
- A game lasts at most nine turns, since there are nine cells.
- No random factors exist; the game is **fully deterministic**.
- A seed value is still stored in state for reproducibility, but it is never used, so replays are always consistent.

---

## **4. Action Grammar (Machine-Parseable)**

**Permitted Actions:**

### 4.1 Mark a Tile

- **Token Format:** `[Mark:<row>,<col>]`
- **Pattern (regex):** `^\[Mark:(0|1|2),(0|1|2)\]$`
- **Semantics:** The current player places their symbol on the specified cell `(row, col)` if it is empty.

**Examples:**

- ✅ **Valid:** `[Mark:0,2]`: the player marks the top-right cell.
- ❌ **Invalid:** `[Mark:3,1]`: row "3" is out of range (valid rows: 0–2).
- ❌ **Invalid:** `[Mark:1-2]`: the comma separator is missing.

### 4.2 Pass

- **Token Format:** `[Pass]`
- **Pattern (regex):** `^\[Pass\]$`
- **Semantics:** Used only if the player has no valid cell remaining. This is rare in tic-tac-toe: a full board normally ends the game as a draw first (see Section 8).

**Examples:**

- ✅ **Valid:** `[Pass]`: the player skips their turn.
- ❌ **Invalid:** `[PASS]`: the token is case-sensitive and must match `[Pass]` exactly.

---

## **5. Game State Schema**

```json
{
  "seed": 42,
  "turn_count": 1,
  "current_player": "Sun",
  "board_state": [
    ["_", "_", "_"],
    ["_", "_", "_"],
    ["_", "_", "_"]
  ],
  "player_symbols": {
    "Sun": "S",
    "Moon": "M"
  },
  "history": [
    {"player": "System", "message": "The ancient board awaits."}
  ],
  "winner": null,
  "status": "ongoing",
  "available_moves": [
    [0, 0], [0, 1], [0, 2],
    [1, 0], [1, 1], [1, 2],
    [2, 0], [2, 1], [2, 2]
  ],
  "scores": {
    "Sun": 0,
    "Moon": 0
  }
}
```

- Keys reflect a unique thematic world: the ancient “trail” board, emblems for “Sun” and “Moon,” and a clear distinction from any negotiation-like schema.

---

## **6. Initialization Rules**

- When `reset(seed)` is called:
  1. The RNG is seeded with `seed`, although it is never consulted because the game is deterministic.
  2. The `board_state` is filled with `_` symbols representing empty stone tiles.
  3. The first turn is always `Sun`.
  4. The `history` log begins with a world description.
  5. `available_moves` includes all `(row, col)` pairs.
- Observation: Both players receive an identical initial description and empty board visualization.

---

## **7. Validation and Error Handling**

- **Extraction:** The environment extracts the content inside `\boxed{{}}` using `_extract_answer_content(action)`.
- **Validation Steps:**
  1. Verify that the action string matches one of the two regex patterns.
  2. If `[Mark:<row>,<col>]`, check:
     - 0 ≤ r,c ≤ 2
     - The corresponding cell is unoccupied (`"_"`).
  3. If `[Pass]`, ensure no playable cells remain; otherwise the action is invalid.
- **Invalid Reasons (examples):**
  - "Invalid format: must be [Mark:r,c] or [Pass]."
  - "Chosen cell already occupied."
  - "Row or column index out of range."
  - "Cannot pass while moves still available."

If the action is invalid, the system invokes `set_invalid_move(reason)`, and forfeit logic may apply depending on the higher-level controller.

---

## **8. Terminal Conditions and Scoring**

**Checks performed after each valid move:**

1. **Win Check:** If the current player owns three symbols aligned horizontally, vertically, or diagonally:
   - `winner = current_player`
   - `status = "finished"`
   - `scores[current_player] = 1`
   - The opponent receives 0.
2. **Draw Check:** If all cells are filled and there is no winner:
   - `winner = null`
   - `status = "draw"`
   - Both scores = 0.5.
3. **Continue Otherwise:**
   - `status = "ongoing"`
   - Proceed to the next player.

**Tie-Break:** None beyond the declared draw; equal scoring applies.

---

## **9. Player Prompt Specification**

**Prompt Identity and Instructions:**

Each turn’s prompt should contain:

1. A brief world intro: “You are an explorer representing Team Sun (or Team Moon) claiming tiles on the ancient Tic-Tac-Trail.”
2. The current board visualization (3×3 grid of `_`, `S`, `M`).
3. The list of allowed action formats:
   - `[Mark:<row>,<col>]` where `<row>` and `<col>` are integers 0–2.
   - `[Pass]` if no unclaimed tiles remain.
4. A reminder of the victory condition: “Align three of your emblems in a straight line.”
5. A rule reminder: “All actions must be enclosed in `\boxed{{}}` at the end of your message.”

**Few-shot examples:**

```
Example valid response:
I should take the center stone before my rival.
\boxed{{[Mark:1,1]}}
```

```
Example invalid response (wrong format):
\boxed{{Mark:1,1}}   <-- Missing brackets [ ]
```

```
Example valid response (board full, passing):
No moves left, I will pass.
\boxed{{[Pass]}}
```

**Extraction Function Notice:**
`_extract_answer_content(self, action: str) -> str` will strip the `\boxed{{}}` syntax and return the inner content for validation.

---

## **10. API Mapping Plan**

| Method | Purpose | Operations on Game State | Output |
|--------|---------|--------------------------|--------|
| **`reset(seed)`** | Initialize the game | Sets all keys per the schema, seeds the RNG, clears the board, assigns the first player (`Sun`), populates `available_moves`, and generates the initial system message | Returns initial `observations` for both players |
| **`step(player_action)`** | Process one player's move | 1. Extract content with `_extract_answer_content` 2. Validate grammar & legality 3. If valid, apply to `board_state` 4. Append to `history` 5. Update `available_moves` 6. Check win/draw conditions, adjust scores, and advance turn | Returns updated `observations`, reward info, `done` flag |
| **`_generate_player_prompt(player_id)`** | Build the textual context for that player | Uses the current `board_state`, `turn_count`, and list of legal actions. Demonstrates correct formatting. | Returns a formatted prompt string instructing the player to end with a `\boxed{{}}` action |

---

## **11. Copy-Check Against the Example**

This design is **fully distinct** from any negotiation or resource-trading environment.

- **Theme:** Archaeological puzzle arena (grid conquest), not negotiation.
- **Objectives:** Claim territory and form a line, not reach mutual agreements.
- **Entities:** Ancient stones and Sun and Moon symbols, not participants in a deal.
- **Game State Keys:** `board_state`, `player_symbols`, `available_moves`, and `scores`, all entirely original.
- **Prompt Text:** References *Tic-Tac-Trail* and ancient exploration, not disputes or offers.

Therefore, this specification represents a fully self-contained, original turn-based environment for a deterministic **tic-tac-toe–style strategy challenge**, compliant with the TextArena architecture.

---
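
The action grammar (Section 4), validation steps (Section 7), and terminal checks (Section 8) can be sketched together in Python. This is a minimal illustration under stated assumptions, not the environment's actual implementation: the helper names `invalid_reason` and `check_status` are hypothetical, the input token is assumed to have already been extracted from `\boxed{{}}`, the reason strings paraphrase the Section 7 examples, and the looser pattern used to tell an out-of-range index from a malformed token is an addition that the spec does not define.

```python
import re
from typing import List, Optional

Board = List[List[str]]

# Patterns copied from Section 4 of the spec.
MARK_RE = re.compile(r"^\[Mark:(0|1|2),(0|1|2)\]$")
PASS_RE = re.compile(r"^\[Pass\]$")
# Assumption (not in the spec): a looser pattern used only to report
# "out of range" separately from "invalid format".
LOOSE_MARK_RE = re.compile(r"^\[Mark:\d+,\d+\]$")

# The eight winning lines: three rows, three columns, two diagonals.
LINES = (
    [[(r, c) for c in range(3)] for r in range(3)]
    + [[(r, c) for r in range(3)] for c in range(3)]
    + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]
)


def invalid_reason(token: str, board: Board) -> Optional[str]:
    """Return None if `token` is legal on `board`, else a reason string."""
    has_empty = any(cell == "_" for row in board for cell in row)
    if PASS_RE.match(token):
        # Passing is only legal when no playable cell remains.
        return "Cannot pass while moves still available." if has_empty else None
    match = MARK_RE.match(token)
    if match is None:
        if LOOSE_MARK_RE.match(token):
            return "Row or column index out of range."
        return "Invalid format: must be [Mark:r,c] or [Pass]."
    row, col = int(match.group(1)), int(match.group(2))
    if board[row][col] != "_":
        return "Chosen cell already occupied."
    return None


def check_status(board: Board, symbol: str) -> str:
    """Return "finished" if `symbol` holds a full line, "draw" if the board
    is full without one, and "ongoing" otherwise."""
    if any(all(board[r][c] == symbol for r, c in line) for line in LINES):
        return "finished"
    if all(cell != "_" for row in board for cell in row):
        return "draw"
    return "ongoing"


if __name__ == "__main__":
    board = [["S", "M", "_"], ["_", "S", "M"], ["_", "_", "_"]]
    print(invalid_reason("[Mark:3,1]", board))
    if invalid_reason("[Mark:2,2]", board) is None:
        board[2][2] = "S"  # Sun completes the main diagonal
    print(check_status(board, "S"))
```

Returning a reason string (or `None`) keeps the validator independent of the controller, so the caller can pass the result straight to `set_invalid_move(reason)` when it is not `None`.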