
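A library providing standard environment interfaces for reinforcement-learning post-training.
With the RLHF boom, demand for standardized post-training environment interfaces is growing; Hugging Face's backing has drawn attention.
Provides a standardized environment interface for RLHF practitioners in China, lowering the cost of post-training integration.
Suited to setting up environments and adapting interfaces for reinforcement-learning post-training (such as RLHF).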
An end-to-end framework for creating, deploying, and using isolated execution environments for agentic RL training, built on simple Gymnasium-style APIs.
Featured Example: Train LLMs to play BlackJack using torchforge (PyTorch's agentic RL framework): examples/grpo_blackjack/
Zero to Hero Tutorial: End-to-end tutorial from our GPU Mode lecture and other hackathons.
Install the OpenEnv package:
pip install openenv
Install an environment client (e.g., Echo):
pip install git+https://huggingface.co/spaces/openenv/echo_env
Then use the environment:
import asyncio
from echo_env import CallToolAction, EchoEnv
async def main():
    # Connect to a running Space (async context manager)
    async with EchoEnv(base_url="https://openenv-echo-env.hf.space") as client:
        # Reset the environment
        result = await client.reset()
        print(result.observation.echoed_message)  # "Echo environment ready!"
        # Send messages
        result = await client.step(
            CallToolAction(
                tool_name="echo_message",
                arguments={"message": "Hello, World!"},
            )
        )
        print(result.observation.result)  # "Hello, World!"
        print(result.reward)
asyncio.run(main())
Synchronous usage is also supported via the .sync() wrapper:
from echo_env import CallToolAction, EchoEnv
# Use .sync() for synchronous context manager
with EchoEnv(base_url="https://openenv-echo-env.hf.space").sync() as client:
    result = client.reset()
    result = client.step(
        CallToolAction(
            tool_name="echo_message",
            arguments={"message": "Hello, World!"},
        )
    )
    print(result.observation.result)
For a detailed quick start, check out the docs page.
OpenEnv provides a standard for interacting with agentic execution environments through simple Gymnasium-style APIs: step(), reset(), and state(). Users of agentic execution environments can call these APIs to interact with an environment inside RL training loops.
Beyond helping researchers and RL framework authors, OpenEnv also provides tools for environment creators, so they can build richer environments, expose them over familiar protocols such as HTTP, and package them with standard technologies such as Docker. Environment creators can use the OpenEnv framework to create environments that are isolated, secure, and easy to deploy and use.
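To make the step(), reset(), state() interface concrete, here is a minimal sketch of a rollout driven only by those three calls. CountdownEnv, the StepResult fields, and the rollout helper are hypothetical stand-ins defined locally so the example runs without a server; real OpenEnv clients expose their own action and observation types.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    """Hypothetical result container: observation, reward, done flag."""
    observation: int
    reward: float
    done: bool


class CountdownEnv:
    """Stand-in environment (not part of OpenEnv): count down to zero."""

    def __init__(self, start: int = 3) -> None:
        self.start = start
        self.remaining = start
        self.step_count = 0

    def reset(self) -> StepResult:
        self.remaining = self.start
        self.step_count = 0
        return StepResult(observation=self.remaining, reward=0.0, done=False)

    def step(self, action: int) -> StepResult:
        self.step_count += 1
        self.remaining = max(0, self.remaining - action)
        done = self.remaining == 0
        return StepResult(self.remaining, 1.0 if done else 0.0, done)

    def state(self) -> dict:
        return {"step_count": self.step_count}


def rollout(env, policy, max_steps: int = 10):
    """Collect (observation, action, reward) tuples for one episode."""
    trajectory = []
    result = env.reset()
    for _ in range(max_steps):
        if result.done:
            break
        action = policy(result.observation)
        next_result = env.step(action)
        trajectory.append((result.observation, action, next_result.reward))
        result = next_result
    return trajectory


env = CountdownEnv(start=3)
trajectory = rollout(env, policy=lambda obs: 1)
print(trajectory)
print(env.state())
```

In a training loop, the policy would be the model being trained and the collected rewards would feed the RL update; only the environment calls are shown here.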
The OpenEnv CLI (openenv) provides commands to initialize new environments and deploy them to Hugging Face Spaces.
⚠️ Early Development Warning: OpenEnv is currently in an experimental stage. You should expect bugs, incomplete features, and APIs that may change in future versions. The project welcomes bugfixes, but significant changes should be discussed before implementation so the technical committee and community can coordinate scope, compatibility, and release timing. It's recommended that you signal your intention to contribute in the issue tracker, either by filing a new issue or by claiming an existing one.
Below is a list of active and historical RFCs for OpenEnv. RFCs are proposals for major changes or features. Please review and contribute!
┌─────────────────────────────────────────────────────────┐
│                    Client Application                   │
│  ┌────────────────┐              ┌──────────────────┐   │
│  │  EchoEnv       │              │  CodingEnv       │   │
│  │  (EnvClient)   │              │  (EnvClient)     │   │
│  └────────┬───────┘              └────────┬─────────┘   │
└───────────┼───────────────────────────────┼─────────────┘
            │ WebSocket                     │ WebSocket
            │ (reset, step, state)          │
┌───────────▼───────────────────────────────▼─────────────┐
│              Docker Containers (Isolated)               │
│  ┌──────────────────────┐    ┌──────────────────────┐   │
│  │ FastAPI Server       │    │ FastAPI Server       │   │
│  │ EchoEnvironment      │    │ PythonCodeActEnv     │   │
│  │ (Environment base)   │    │ (Environment base)   │   │
│  └──────────────────────┘    └──────────────────────┘   │
└─────────────────────────────────────────────────────────┘
OpenEnv includes a built-in web interface for interactive environment exploration and debugging.
The web interface is conditionally enabled based on environment variables:
ENABLE_WEB_INTERFACE=true
To use the web interface:
from openenv.core.env_server import create_web_interface_app
from your_env.models import YourAction, YourObservation
from your_env.server.your_environment import YourEnvironment
env = YourEnvironment()
app = create_web_interface_app(env, YourAction, YourObservation)
When enabled, open http://localhost:8000/web in your browser to interact with the environment.
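The environment-variable switch can be pictured as a small flag check made when the app is constructed. How OpenEnv itself parses the variable is not stated here, so the following is only a sketch: web_interface_enabled is a hypothetical helper, and treating "1" and "yes" as truthy is an assumption (the text above only shows ENABLE_WEB_INTERFACE=true).

```python
import os


def web_interface_enabled(environ=None) -> bool:
    """Return True when ENABLE_WEB_INTERFACE is set to a truthy value.

    Assumption: "true", "1", and "yes" (case-insensitive) count as enabled.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("ENABLE_WEB_INTERFACE", "")
    return value.strip().lower() in {"true", "1", "yes"}


print(web_interface_enabled({"ENABLE_WEB_INTERFACE": "true"}))
print(web_interface_enabled({}))
```

Passing the mapping explicitly keeps the check easy to test; calling it with no argument reads the real process environment.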
Base class for implementing environment logic:
- reset(): Initialize a new episode, returns initial Observation
- step(action): Execute an Action, returns resulting Observation
- state(): Access episode metadata (State with episode_id, step_count, etc.)
Base class for environment communication:
- Use async with and await for all operations
- Call .sync() to get a SyncEnvClient for synchronous usage
Manage container deployment:
- LocalDockerProvider: Run containers on local Docker daemon
- DockerSwarmProvider: Deploy to Docker Swarm clusters
- KubernetesProvider: Deploy to Kubernetes clusters
- UVProvider, DaytonaProvider: Additional runtime providers
Type-safe data structures:
- Action: Base class for environment actions
- Observation: Base class for environment observations
- State: Episode state tracking
- StepResult: Combines observation, reward, done flag
Use the CLI to quickly scaffold a new environment:
openenv init my_env
This creates the following structure:
my_env/
├── .dockerignore # Docker build exclusions
├── __init__.py # Export YourAction, YourObservation, YourEnv
├── models.py # Define Action, Observation, State dataclasses
├── client.py # Implement YourEnv(EnvClient)
├── README.md # Document your environment
├── openenv.yaml # Environment manifest
├── pyproject.toml # Dependencies and package configuration
├── outputs/ # Runtime outputs (logs, evals) - gitignored
│   ├── logs/
│   └── evals/
└── server/
    ├── your_environment.py # Implement YourEnvironment(Environment)
    ├── app.py # Create FastAPI app
    ├── requirements.txt # Dependencies for Docker (can be generated)
    └── Dockerfile # Define container image
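As a rough picture of what models.py and server/your_environment.py might hold, the sketch below defines a toy number-guessing environment. The Action, Observation, State, and Environment classes here are local stand-ins written for this example, and placing done and reward on Observation is an assumption; the real base classes ship with the OpenEnv package and may differ. GuessAction, GuessObservation, and GuessEnvironment are hypothetical names.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from uuid import uuid4


# Local stand-ins for the base types; the real ones may carry other fields.
@dataclass
class Action:
    pass


@dataclass
class Observation:
    done: bool = False
    reward: float = 0.0


@dataclass
class State:
    episode_id: str = ""
    step_count: int = 0


class Environment(ABC):
    @abstractmethod
    def reset(self) -> Observation: ...

    @abstractmethod
    def step(self, action: Action) -> Observation: ...

    @abstractmethod
    def state(self) -> State: ...


# --- what models.py might define (hypothetical names) ---
@dataclass
class GuessAction(Action):
    guess: int = 0


@dataclass
class GuessObservation(Observation):
    hint: str = ""


# --- what server/your_environment.py might define ---
class GuessEnvironment(Environment):
    """Toy number-guessing environment with a fixed target."""

    def __init__(self, target: int = 7) -> None:
        self._target = target
        self._state = State()

    def reset(self) -> GuessObservation:
        self._state = State(episode_id=str(uuid4()), step_count=0)
        return GuessObservation(hint="start")

    def step(self, action: GuessAction) -> GuessObservation:
        self._state.step_count += 1
        if action.guess == self._target:
            return GuessObservation(done=True, reward=1.0, hint="correct")
        hint = "higher" if action.guess < self._target else "lower"
        return GuessObservation(hint=hint)

    def state(self) -> State:
        return self._state


env = GuessEnvironment(target=7)
env.reset()
first = env.step(GuessAction(guess=3))
second = env.step(GuessAction(guess=7))
print(first.hint, second.hint, second.reward, env.state().step_count)
```

Keeping the data models separate from the environment logic mirrors the scaffold's split between models.py (shared by client and server) and the server-only implementation.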
OpenEnv uses pyproject.toml as the primary dependency specification:
- pyproject.toml (per environment): Each environment defines its own dependencies
- pyproject.toml (repository root): Contains shared core dependencies (fastapi, pydantic, uvicorn)
- requirements.txt: Can be auto-generated from pyproject.toml for Docker builds
Development Workflow:
# Install environment in editable mode
cd my_env
pip install -e .
# Or using uv (faster)
uv pip install -e .
# Run server locally without Docker
uv run server --host 0.0.0.0 --port 8000
See envs/README.md for a complete guide on building environments.
To use an environment:
pip install git+https://huggingface.co/spaces/openenv/echo_env
from echo_env import CallToolAction, EchoEnv
Async (recommended):
async with EchoEnv(base_url="...") as client:
    result = await client.reset()
    result = await client.step(action)
Sync (via .sync() wrapper):
with EchoEnv(base_url="...").sync() as client:
    result = client.reset()
    result = client.step(action)
See the example scripts in the examples/ directory.
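One way to think about the relationship between the async client and its .sync() wrapper is a blocking facade that runs each coroutine to completion on a private event loop. The sketch below illustrates that pattern only; AsyncCounterClient and SyncCounterClient are hypothetical classes, and OpenEnv's actual SyncEnvClient may be implemented differently.

```python
import asyncio


class AsyncCounterClient:
    """Hypothetical async client; stands in for an EnvClient subclass."""

    def __init__(self) -> None:
        self.count = 0

    async def reset(self) -> int:
        await asyncio.sleep(0)  # placeholder for network I/O
        self.count = 0
        return self.count

    async def step(self, amount: int) -> int:
        await asyncio.sleep(0)  # placeholder for network I/O
        self.count += amount
        return self.count

    def sync(self) -> "SyncCounterClient":
        return SyncCounterClient(self)


class SyncCounterClient:
    """Blocking facade that drives the async client on its own event loop."""

    def __init__(self, client: AsyncCounterClient) -> None:
        self._client = client
        self._loop = None

    def __enter__(self) -> "SyncCounterClient":
        self._loop = asyncio.new_event_loop()
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self._loop.close()

    def reset(self) -> int:
        return self._loop.run_until_complete(self._client.reset())

    def step(self, amount: int) -> int:
        return self._loop.run_until_complete(self._client.step(amount))


with AsyncCounterClient().sync() as client:
    start = client.reset()
    after_two_steps = (client.step(2), client.step(3))

print(start, after_two_steps)
```

A facade like this cannot be used from code that is already running inside an event loop, which is one reason the async form is the natural default for training frameworks that are themselves asynchronous.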
The OpenEnv CLI provides commands to manage environments:
- openenv init <env_name> - Initialize a new environment from template
- openenv push [--repo-id <repo>] [--private] - Deploy environment to Hugging Face Spaces
- openenv serve - Serve an environment locally with optional auto-reload
- openenv build - Build the Docker image for an environment
- openenv fork <space-id> - Fork a Space from HF Hub to your account
- openenv validate - Validate an environment configuration
# Create a new environment
openenv init my_game_env
# Deploy to Hugging Face (will prompt for login if needed)
cd my_game_env
openenv push
For detailed options run any command with --help.
# Clone the repository
git clone https://github.com/huggingface/OpenEnv.git
cd OpenEnv
# Install core package in editable mode
pip install -e .
# Or using uv (faster)
uv pip install -e .
OpenEnv uses a modular dependency structure: the core package is minimal, and each environment has its own dependencies. This means some tests require environment-specific packages.
# Install pytest (required for running tests)
uv pip install pytest
# Run all tests (skips tests requiring uninstalled dependencies)
PYTHONPATH=src:envs uv run pytest tests/ -v --tb=short
# Run a specific test file
PYTHONPATH=src:envs uv run pytest tests/envs/test_echo_environment.py -v
To run environment-specific tests, install that environment's dependencies:
# Example: Install coding_env with dev dependencies (includes smolagents + pytest)
uv pip install -e "envs/coding_env[dev]"
# Then run coding_env tests
PYTHONPATH=src:envs uv run pytest tests/envs/test_python_codeact_rewards.py -v
Tests will be automatically skipped if their required dependencies aren't installed.
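The skip-when-missing behaviour can be illustrated with a standard-library pattern: check whether a module is importable and skip the dependent tests if it is not. This is a sketch using unittest rather than the repository's own pytest setup, whose exact mechanism is not described here (pytest offers pytest.importorskip for the same purpose). The module name surely_missing_pkg_xyz is deliberately nonexistent and stands in for an optional environment dependency.

```python
import importlib.util
import unittest


def has_module(name: str) -> bool:
    """True when `name` is importable in the current environment."""
    return importlib.util.find_spec(name) is not None


class CoreTests(unittest.TestCase):
    def test_always_runs(self):
        self.assertEqual(1 + 1, 2)


# Placeholder for an optional dependency that is not installed.
OPTIONAL_MODULE = "surely_missing_pkg_xyz"


@unittest.skipUnless(has_module(OPTIONAL_MODULE), "optional dependency not installed")
class OptionalDependencyTests(unittest.TestCase):
    def test_needs_dependency(self):
        # Only reached when the optional module is importable.
        importlib.import_module(OPTIONAL_MODULE)


loader = unittest.defaultTestLoader
suite = unittest.TestSuite()
suite.addTests(loader.loadTestsFromTestCase(CoreTests))
suite.addTests(loader.loadTestsFromTestCase(OptionalDependencyTests))
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(len(result.skipped), result.wasSuccessful())
```

Because skipped tests do not count as failures, the overall run still succeeds, which is the behaviour the test instructions above rely on.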
OpenEnv works with a growing ecosystem of RL frameworks and platforms. If your project supports OpenEnv, open a PR to add it here.
See the TRL example on how to integrate OpenEnv environments with GRPO training.
See GRPO BlackJack training example: examples/grpo_blackjack/
See the 2048 game example based on gpt-oss: Colab notebook
See the SkyRL example on how to train on OpenEnv environments with SkyRL.
See the ART example on how OpenEnv environments can be used to train models with ART.
See the Oumi example on how OpenEnv environments can be used to train models with Oumi.
| Environment | Description |
|---|---|
| Echo Environment | Echoes back messages with metadata. Ideal for testing HTTP server infrastructure, learning framework basics, and verifying container deployment. |
| Coding Environment | Sandboxed Python code execution via smolagents. Captures stdout/stderr/exit codes, supports persistent episode context, and provides detailed error handling. |
| Chess Environment | Chess RL environment with configurable opponents and full rules support. |
| Atari Environment | Classic Arcade Learning Environment tasks for RL benchmarking. |
| FinRL Environment | Financial market simulations for algorithmic trading experiments. |
Browse the full catalog of community environments at huggingface.co/docs/openenv/environments.
OpenEnv is governed by a technical committee that coordinates project direction, major technical decisions, RFCs, and release planning through the public issue tracker, pull requests, and RFC process. Current committee members: Meta-PyTorch, Reflection, Unsloth, Modal, Prime Intellect, Nvidia, Mercor, Fleet AI, and Hugging Face.
The project is also supported by a broader community of organizations. If you would like to add your project or organization here, please open a pull request for maintainer review.
Supporters include: Meta-PyTorch, Hugging Face, Scaler AI Labs, Patronus AI, Surge AI, LastMile AI, Unsloth, Reflection, vLLM, SkyRL (UC-Berkeley), Lightning AI, Axolotl AI, Stanford Scaling Intelligence Lab, Mithril, OpenMined, Fleet AI, Halluminate, Turing, Scale AI, Scorecard
And we'd also like to acknowledge the team at Farama Foundation as the OpenEnv API was heavily inspired by the work you all have done on Gymnasium. Cheers!
BSD 3-Clause License (see LICENSE file)