# TicTacToe-v0

### Overview

TicTacToe is a classic two-player strategy game played on a 3x3 grid. The goal is to be the first player to align three of your marks, either `X` or `O`, horizontally, vertically, or diagonally.

This simple yet elegant game tests players’ ability to anticipate, block, and plan moves ahead, making it a suitable environment for evaluating reasoning, prediction, and opponent modeling in large language models (LLMs).

---

### Gameplay

* **Players:** 2
* **Symbols:** `X` and `O`
* **Objective:** Form a line of three of your symbols before your opponent.
* **Board Layout:**

```
 0 | 1 | 2
---+---+---
 3 | 4 | 5
---+---+---
 6 | 7 | 8
```

Players take turns selecting a cell by its index (0–8). The environment automatically validates moves and announces wins, losses, or draws.

---

### Environment Details

* **Environment Name:** `TicTacToe-v0`
* **Number of Players:** 2
* **Observation Type:** Text-based description of board state and game messages
* **Action Type:** Integer index (0–8)
* **Winning Condition:** Three identical symbols in a row, column, or diagonal
* **Termination:** When a player wins or all cells are filled (draw)

---

### LLM Evaluation Purpose

TicTacToe serves as a benchmark for:

* **Strategic reasoning:** planning moves and anticipating outcomes
* **Opponent modeling:** predicting and countering adversarial play
* **Deterministic decision-making:** consistent performance under clear rules

It is also a good starting environment for reinforcement learning or self-play fine-tuning of small or large language models.
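The move validation, winning condition, and termination rules described under Gameplay and Environment Details can be illustrated with a minimal, standalone Python sketch. This is not the environment's actual implementation or API: the names `WIN_LINES`, `new_board`, `is_valid_move`, `winner`, and `play` are hypothetical, and the sketch assumes `X` moves first and that an invalid move raises an error, neither of which is specified above.

```python
from typing import List, Optional, Sequence

# The eight winning lines on the 0-8 indexed board: rows, columns, diagonals.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

EMPTY = " "


def new_board() -> List[str]:
    """Return an empty 3x3 board stored as a flat list of 9 cells."""
    return [EMPTY] * 9


def is_valid_move(board: List[str], cell: int) -> bool:
    """A move is valid if it is an integer index 0-8 pointing at an empty cell."""
    return isinstance(cell, int) and 0 <= cell <= 8 and board[cell] == EMPTY


def winner(board: List[str]) -> Optional[str]:
    """Return 'X' or 'O' if that symbol fills any winning line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != EMPTY and board[a] == board[b] == board[c]:
            return board[a]
    return None


def play(moves: Sequence[int]) -> Optional[str]:
    """Apply moves alternately (assumption: X first).

    Returns 'X' or 'O' for a win, 'draw' if the board fills with no winner,
    or None if the game is still in progress after the last move.
    Raises ValueError on an invalid move (an assumption of this sketch).
    """
    board = new_board()
    symbols = ("X", "O")
    for turn, cell in enumerate(moves):
        if not is_valid_move(board, cell):
            raise ValueError(f"invalid move {cell!r} on turn {turn}")
        board[cell] = symbols[turn % 2]
        won = winner(board)
        if won is not None:
            return won      # termination: a player completed a line
        if EMPTY not in board:
            return "draw"   # termination: all cells filled, no winner
    return None             # game not finished yet
```

For example, `play([0, 3, 1, 4, 2])` has `X` completing the top row (cells 0, 1, 2), while a move onto an occupied cell, as in `play([0, 0])`, is rejected. Storing the board as a flat list keeps the action (an integer index) and the storage index identical, which avoids any row/column conversion.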