Turn-Based TextArena Design Document: "GlyphGrid Duel"

(Design document for a deterministic, turn-based environment inspired by tic-tac-toe mechanics, but with a completely original setting, terminology, and data schema.)


1. Concept Paragraph

Game Title: GlyphGrid Duel

In the ancient halls of the Archivists, two rival Scribes compete to inscribe mystical glyphs into a sacred 3×3 grid called the “Runeboard.” The Scribes alternate turns, each etching their signature glyph, Solar (S) or Lunar (L), into an empty rune slot. The goal is to align three of one's glyphs consecutively across a row, column, or diagonal, representing mastery of the grid's equilibrium energies. Although the core structure echoes a placement strategy game, GlyphGrid Duel is unrelated to any negotiation, trade, or dialogue-based environment. It focuses solely on deterministic pattern control, tactical foresight, and spatial reasoning.


2. Roles and Win Condition

  • Players: Two Scribes:

    • Scribe Solar — uses the glyph "S".
    • Scribe Lunar — uses the glyph "L".
  • Objective: Align three identical glyphs in a straight line across the 3×3 Runeboard.

  • Win Condition:

    • A player wins immediately upon creating a line (horizontal, vertical, or diagonal) consisting of their own glyphs.
    • If all cells are filled and no player has a line, the result is a Draw.
  • Loss Condition:

    • A player loses if the opponent achieves a winning alignment first.
  • Draw Condition:

    • The Runeboard is full, and no completed line exists.

3. Turn Structure and Determinism

  • The game alternates turns between Scribe Solar (first) and Scribe Lunar (second).
  • Each turn consists of one valid placement action onto an empty cell.
  • Turn Limit: 9; the grid contains 9 total rune slots.
  • Determinism:
    • No random factors after initialization.
    • The seed passed to reset(seed=x) is the only source of randomness: it governs any starting-player choice (Solar starts by default), so resetting with the same seed and replaying the same actions reproduces identical outcomes.
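
A minimal sketch of how a seed-derived starting player could stay reproducible. The helper name `choose_starting_player` and the `randomize_start` toggle are assumptions made for illustration; this document only states that Solar starts by default and that any randomness must come from the seed.

```python
import random

PLAYERS = ("Solar", "Lunar")


def choose_starting_player(seed: int, randomize_start: bool = False) -> str:
    """Return the first player; Solar unless the (hypothetical) toggle is on."""
    if not randomize_start:
        return "Solar"
    # A private Random instance avoids touching global random state,
    # so the same seed always yields the same starting player.
    rng = random.Random(seed)
    return rng.choice(PLAYERS)
```

Calling the helper twice with the same seed returns the same player, which is the property the determinism rule asks for.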

4. Action Grammar (Machine-Parseable)

Valid Actions

Each action specifies a cell position in row-column format, using 1-based indexing.

Format:

[Etch: <row>, <column>]
  • <row> and <column> are integers in {1, 2, 3}.
  • The cell at (row, column) must be unoccupied.

Regex pattern:

^\[Etch:\s*([1-3]),\s*([1-3])\]$

Examples:

| Example Action | Valid? | Reason |
| --- | --- | --- |
| [Etch: 1, 3] | Yes | Valid coordinates. |
| [Etch: 3, 1] | Yes | Valid coordinates. |
| [Etch: 4, 2] | No | Row = 4 out of bounds. |
| [Etch (2,2)] | No | Invalid token format (missing colon; coordinates wrapped in parentheses). |
| [Mark: 1, 1] | No | Invalid verb token; must use “Etch”. |

All player responses must be wrapped in \boxed{{}} during gameplay, e.g.
\boxed{{[Etch: 2, 1]}}.
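
The grammar above can be sketched as a small parser. This is an illustration under assumptions: `extract_boxed` is a hypothetical stand-in for the answer-extraction step (its real behavior is not specified here), and it tolerates one or more braces around the boxed content. `parse_etch` applies the regex from this section unchanged.

```python
import re
from typing import Optional, Tuple

# Regex taken verbatim from the Action Grammar section.
ETCH_RE = re.compile(r"^\[Etch:\s*([1-3]),\s*([1-3])\]$")
# Assumption: accept \boxed{...} as well as \boxed{{...}}.
BOXED_RE = re.compile(r"\\boxed\{+(.*?)\}+", re.DOTALL)


def extract_boxed(response: str) -> Optional[str]:
    """Return the content of the last \\boxed group, or None if absent."""
    matches = BOXED_RE.findall(response)
    return matches[-1].strip() if matches else None


def parse_etch(content: str) -> Optional[Tuple[int, int]]:
    """Return 1-based (row, column), or None if the action is malformed."""
    m = ETCH_RE.match(content)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))
```

Because the regex only admits digits 1 to 3, an action such as [Etch: 4, 2] fails at the pattern stage rather than at a separate bounds check.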


5. Game State Schema

Example game_state at runtime (illustrative values only):

{
  "runeboard": [
    ["S", "L", "_"],
    ["_", "S", "_"],
    ["L", "_", "_"]
  ],
  "current_player": "Solar",
  "turn_count": 4,
  "winner": null,
  "is_terminal": false,
  "last_action": "[Etch: 3, 1]",
  "observations": {
    "Solar": [
      "Runeboard state after turn 4...",
      "Lunar etched at (3,1)"
    ],
    "Lunar": [
      "Runeboard state after turn 4...",
      "Lunar etched at (3,1)"
    ]
  },
  "player_symbols": {
    "Solar": "S",
    "Lunar": "L"
  },
  "seed": 42
}

Keys:

  • runeboard: Nested list of strings ("S", "L", or "_" for empty).
  • current_player: The player whose turn it is ("Solar" or "Lunar").
  • turn_count: Number of turns completed.
  • winner: "Solar", "Lunar", or null.
  • is_terminal: Boolean indicating game completion.
  • last_action: Last validated [Etch: r, c].
  • observations: Per-player transcript and board updates.
  • player_symbols: Maps each player to their glyph.
  • seed: Seed passed to reset, recorded for deterministic reproducibility.
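
One way to pin the schema down in code is a typed dictionary. This is a sketch only: the class name `GameState` and the helper `missing_keys` are hypothetical, and treating `last_action` as optional (empty before the first move) is an assumption not stated in this document.

```python
from typing import Dict, List, Optional, Set, TypedDict


class GameState(TypedDict):
    runeboard: List[List[str]]          # 3x3 grid of "S", "L", or "_"
    current_player: str                 # "Solar" or "Lunar"
    turn_count: int                     # turns completed so far
    winner: Optional[str]               # "Solar", "Lunar", or None
    is_terminal: bool
    last_action: Optional[str]          # assumption: None before any move
    observations: Dict[str, List[str]]  # per-player transcript
    player_symbols: Dict[str, str]      # player -> glyph
    seed: int


def missing_keys(state: dict) -> Set[str]:
    """Return schema keys that are absent from a candidate state dict."""
    return set(GameState.__annotations__) - set(state)
```

A check like `missing_keys(state)` can be run in tests to catch a state dictionary that drifts from the documented keys.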

6. Initialization Rules

  • On reset(seed), seed is recorded in game_state["seed"].
  • Starting player defaults to Solar; an optional rule toggle may instead select the starting player from the seed.
  • Board resets to all empty ("_").
  • turn_count = 0, winner = null, is_terminal = false.
  • Initial observation describes the empty Runeboard:
    The Runeboard is empty. Each Scribe may etch a glyph using [Etch: row, col].
    
  • All randomness (if ever expanded, e.g. random first player) must derive solely from the seed.
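
The initialization rules above might translate into a state factory like the following. The function name `new_game_state` is hypothetical, and setting `last_action` to None before the first move is an assumption; the other values follow the rules listed in this section.

```python
from typing import Any, Dict

INITIAL_OBSERVATION = (
    "The Runeboard is empty. Each Scribe may etch a glyph using "
    "[Etch: row, col]."
)


def new_game_state(seed: int) -> Dict[str, Any]:
    """Build a fresh state: empty board, Solar to move, seed recorded."""
    return {
        # Build each row separately so rows do not share one list object.
        "runeboard": [["_"] * 3 for _ in range(3)],
        "current_player": "Solar",
        "turn_count": 0,
        "winner": None,
        "is_terminal": False,
        "last_action": None,  # assumption: no action before the first move
        "observations": {
            "Solar": [INITIAL_OBSERVATION],
            "Lunar": [INITIAL_OBSERVATION],
        },
        "player_symbols": {"Solar": "S", "Lunar": "L"},
        "seed": seed,
    }
```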

7. Validation and Error Handling

After extracting the inner content via _extract_answer_content, the environment validates:

  1. Regex pattern matches ^\[Etch:\s*([1-3]),\s*([1-3])\]$.
  2. Target cell must be empty ("_").
  3. Game must not be terminal.
  4. The acting player must match current_player.

Invalid Move Reasons (passed to set_invalid_move):

  • "Invalid format: must be [Etch: row, column] with row,col in 1-3."
  • "Out of bounds: coordinates must be between 1 and 3."
  • "Cell already occupied."
  • "Game already ended."
  • "Not your turn."
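
The four checks and their reason strings can be sketched as a single function that returns the first failing reason. The name `invalid_reason` is hypothetical. The looser `LOOSE_RE` pattern is also an assumption: the strict regex already rejects digits outside 1 to 3, so some second pattern is needed to tell an out-of-bounds action apart from a malformed one, and this document does not say how that is done.

```python
import re
from typing import Any, Dict, Optional

ETCH_RE = re.compile(r"^\[Etch:\s*([1-3]),\s*([1-3])\]$")
# Assumption: well-formed action with any integers, used only for messaging.
LOOSE_RE = re.compile(r"^\[Etch:\s*(-?\d+),\s*(-?\d+)\]$")


def invalid_reason(state: Dict[str, Any], player: str,
                   content: str) -> Optional[str]:
    """Return the reason a move is invalid, or None if it is legal."""
    m = ETCH_RE.match(content)                      # check 1: format
    if m is None:
        if LOOSE_RE.match(content):
            return "Out of bounds: coordinates must be between 1 and 3."
        return ("Invalid format: must be [Etch: row, column] "
                "with row,col in 1-3.")
    row, col = int(m.group(1)), int(m.group(2))
    if state["runeboard"][row - 1][col - 1] != "_":  # check 2: empty cell
        return "Cell already occupied."
    if state["is_terminal"]:                         # check 3: game running
        return "Game already ended."
    if player != state["current_player"]:            # check 4: turn order
        return "Not your turn."
    return None
```

The returned string is what would be handed to set_invalid_move; that call is left out here because its signature is not given in this document.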

8. Terminal Conditions and Scoring

At the end of each valid move:

  1. Win Check:

    • If the current player's glyph forms any contiguous row, column, or diagonal of 3 identical glyphs,
      winner = current_player, is_terminal = True.
  2. Draw Check:

    • If turn_count == 9 and winner == null,
      is_terminal = True, result = Draw.
  3. Scoring:

    • Win: +1 point to winner; 0 to loser.
    • Draw: 0.5 to both.
  4. Tie Break:

    • None; draws are final.
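
The win check and scoring rules above can be sketched as follows. The names `LINES`, `has_line`, and `final_rewards` are hypothetical; the eight lines and the 1 / 0 / 0.5 payouts follow this section.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]

# The 8 winning lines on a 3x3 board, as 0-based (row, column) pairs.
LINES: List[List[Cell]] = (
    [[(r, c) for c in range(3)] for r in range(3)]    # rows
    + [[(r, c) for r in range(3)] for c in range(3)]  # columns
    + [[(i, i) for i in range(3)],                    # main diagonal
       [(i, 2 - i) for i in range(3)]]                # anti-diagonal
)


def has_line(board: List[List[str]], glyph: str) -> bool:
    """True if `glyph` fills any row, column, or diagonal."""
    return any(all(board[r][c] == glyph for r, c in line) for line in LINES)


def final_rewards(winner: Optional[str]) -> Dict[str, float]:
    """Rewards at a terminal state; winner=None means a draw."""
    if winner is None:
        return {"Solar": 0.5, "Lunar": 0.5}
    return {p: (1.0 if p == winner else 0.0) for p in ("Solar", "Lunar")}
```

Only the glyph of the player who just moved needs checking after each move, since a placement cannot complete the opponent's line.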

9. Player Prompt Specification

Each turn, _generate_player_prompt(player_id) provides:

  1. Identity Blurb:

    You are a Scribe competing to master the Runeboard through glyph alignment.
    
  2. Rules Summary:

    • Each player alternately etches one glyph per turn.
    • Wins occur when three identical glyphs align (row, column, or diagonal).
    • If all nine cells are filled without alignment, it's a draw.
  3. Action Instructions:

    • Choose one empty cell and etch your glyph.
    • Actions must follow the format [Etch: row, column].
    • Place your final choice inside \boxed{{}}.
  4. Examples:

    Example valid response:
    I will etch at the top right corner.
    \boxed{{[Etch: 1, 3]}}
    
    Example invalid response:
    \boxed{{[Mark: 1, 3]}}   # Reason: "Mark" is not a valid action.
    
  5. Information Provided Each Turn:

    • Current Runeboard state.
    • Move history (summarized from observations).
    • Which coordinates are still empty.
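
A rough sketch of how the prompt pieces listed above could be assembled. `render_board`, `empty_cells`, and `build_prompt` are hypothetical helpers, and the layout of the board text is an assumption; the identity blurb and the instruction wording are taken from this section.

```python
from typing import List


def render_board(board: List[List[str]]) -> str:
    """One text row per board row, cells separated by spaces."""
    return "\n".join(" ".join(row) for row in board)


def empty_cells(board: List[List[str]]) -> List[str]:
    """Legal moves in [Etch: r, c] form, using 1-based coordinates."""
    return [
        f"[Etch: {r + 1}, {c + 1}]"
        for r in range(3) for c in range(3)
        if board[r][c] == "_"
    ]


def build_prompt(player: str, glyph: str, board: List[List[str]],
                 history: List[str]) -> str:
    parts = [
        "You are a Scribe competing to master the Runeboard through "
        "glyph alignment.",
        f"You are Scribe {player} and your glyph is {glyph}.",
        "Current Runeboard:\n" + render_board(board),
        "Move history: " + ("; ".join(history) if history else "none"),
        "Empty cells: " + ", ".join(empty_cells(board)),
        "Actions must follow the format [Etch: row, column].",
        # Kept verbatim as written in this document.
        r"Place your final choice inside \boxed{{}}.",
    ]
    return "\n\n".join(parts)
```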

10. API Mapping Plan

reset(seed)

  • Initializes game_state with all keys defined above.
  • Creates empty Runeboard and resets counters.
  • Sets seed for deterministic reproduction.
  • Returns initial observation for both players describing the empty grid.

step(player_id, action)

  • Extracts content = _extract_answer_content(action).
  • Validates format and move legality.
  • Updates runeboard, turn_count, current_player.
  • Checks for terminal condition (win or draw).
  • Records action in both players' observation lists.
  • Returns updated game_state, per-player observations, reward signals, and termination status.
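
The step outline above could look roughly like this as a standalone function. It is a sketch under assumptions: it takes the already-extracted action content, raises ValueError where the real environment would call set_invalid_move, and returns a plain tuple whose exact shape is not fixed by this document.

```python
import re
from typing import Any, Dict, List, Tuple

ETCH_RE = re.compile(r"^\[Etch:\s*([1-3]),\s*([1-3])\]$")
LINES = (
    [[(r, c) for c in range(3)] for r in range(3)]
    + [[(r, c) for r in range(3)] for c in range(3)]
    + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]
)


def step(state: Dict[str, Any], player_id: str, content: str
         ) -> Tuple[Dict[str, Any], Dict[str, List[str]],
                    Dict[str, float], bool]:
    m = ETCH_RE.match(content)
    if m is None or state["is_terminal"] or player_id != state["current_player"]:
        raise ValueError("invalid move")  # stand-in for set_invalid_move
    r, c = int(m.group(1)) - 1, int(m.group(2)) - 1
    board = state["runeboard"]
    if board[r][c] != "_":
        raise ValueError("Cell already occupied.")

    glyph = state["player_symbols"][player_id]
    board[r][c] = glyph
    state["turn_count"] += 1
    state["last_action"] = content
    for log in state["observations"].values():  # record for both players
        log.append(f"{player_id} etched at ({r + 1},{c + 1})")

    rewards = {"Solar": 0.0, "Lunar": 0.0}
    if any(all(board[i][j] == glyph for i, j in line) for line in LINES):
        state["winner"], state["is_terminal"] = player_id, True
        rewards[player_id] = 1.0
    elif state["turn_count"] == 9:               # full board, no line: draw
        state["is_terminal"] = True
        rewards = {"Solar": 0.5, "Lunar": 0.5}
    else:                                        # game continues
        state["current_player"] = "Lunar" if player_id == "Solar" else "Solar"
    return state, state["observations"], rewards, state["is_terminal"]
```

In this sketch the win check runs against the player who just moved, and current_player only advances when the game continues, so `winner` is never assigned to the wrong Scribe.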

_generate_player_prompt(player_id)

  • Produces textual prompt combining:
    • Role context (Solar/Lunar)
    • Current Runeboard depiction
    • Legal moves list in [Etch: r, c] format
    • Reminder of boxed answer format and examples
  • Enforces output rule:
    "Put your final answer within \boxed{{}} at the end of your response."

11. Copy-Check Against the Example

This design:

  • Does not reference or replicate the negotiation example's mechanics, dialogue, or resources.
  • Uses completely distinct terminology: Scribes, Glyphs, Runeboard, Etching.
  • Involves no negotiation, trade, or communication mechanics.
  • Defines an objective (line alignment) wholly original to this document.
  • All game_state keys (runeboard, current_player, player_symbols, etc.) and prompts are original to GlyphGrid Duel.

End of Design Document for “GlyphGrid Duel.”