Configuration Guide

Flux uses a hierarchical configuration system built on Pydantic. Every configuration is validated at load time, and options you leave unset fall back to sensible defaults.

Quick Start

YAML File

config.yaml

model_path: Qwen/Qwen3-8B
num_steps: 1000
algorithm: grpo

adaptive_async:
  target_staleness: 0.15

from flux import FluxConfig

config = FluxConfig.from_yaml("config.yaml")

Python

from flux import FluxConfig

config = FluxConfig(
    model_path="Qwen/Qwen3-8B",
    num_steps=1000,
    algorithm="grpo",
    adaptive_async={"target_staleness": 0.15},
)

Environment Variables

export FLUX_MODEL_PATH="Qwen/Qwen3-8B"
export FLUX_NUM_STEPS=1000
flux train --prompts data/prompts.jsonl

Configuration Sections

Full Reference - Complete list of all configuration options
Training Config - Learning rate, batch size, steps, and optimization
Adaptive Async Config - Staleness targets and PID controller settings
Rollout Config - Generation settings and APRIL strategy
Algorithm Config - PPO, GRPO, DPO, and other algorithm settings
Example Configs - Ready-to-use configuration templates

Configuration Hierarchy

graph TD
    A[FluxConfig] --> B[model_path]
    A --> C[num_steps]
    A --> D[batch_size]
    A --> E[learning_rate]
    A --> F[AdaptiveAsyncConfig]
    A --> G[RolloutConfig]
    A --> H[AlgorithmConfig]
    A --> I[BatchComposerConfig]
    A --> J[WeightSyncConfig]
    A --> K[CheckpointConfig]
    A --> L[LoggingConfig]
    A --> M[DistributedConfig]

    F --> F1[target_staleness]
    F --> F2[kp, ki, kd]
    F --> F3[min/max_async_ratio]

    G --> G1[max_length]
    G --> G2[temperature]
    G --> G3[APRIL settings]

    H --> H1[name]
    H --> H2[clip_ratio]
    H --> H3[entropy_coef]
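
The hierarchy in the diagram can be pictured as a root object holding scalar options plus one nested object per section. The following is a minimal sketch using standard-library dataclasses as a stand-in; it is not Flux's actual Pydantic model. The class name `FluxConfigSketch` is hypothetical, only two of the sections are shown, and the default values are copied from the "Stable Training" example on this page rather than from the library.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class AdaptiveAsyncConfig:
    # Values borrowed from the "Stable Training" example; real defaults may differ.
    target_staleness: float = 0.15
    min_async_ratio: float = 0.1
    max_async_ratio: float = 0.7


@dataclass
class AlgorithmConfig:
    name: str = "grpo"
    group_size: int = 4


@dataclass
class FluxConfigSketch:
    # Top-level scalar options sit directly on the root object.
    model_path: str
    num_steps: int = 5000
    batch_size: int = 32
    learning_rate: float = 1.0e-6
    # Each section is a nested object that carries its own defaults.
    adaptive_async: AdaptiveAsyncConfig = field(default_factory=AdaptiveAsyncConfig)
    algorithm: AlgorithmConfig = field(default_factory=AlgorithmConfig)


# Override one nested field; every other field keeps its default.
config = FluxConfigSketch(
    model_path="Qwen/Qwen3-8B",
    adaptive_async=AdaptiveAsyncConfig(target_staleness=0.3),
)
print(config.adaptive_async.target_staleness)
print(asdict(config)["algorithm"])
```

Overriding a single nested field leaves its siblings at their defaults, which is the behavior the YAML examples on this page rely on when they set only `target_staleness`.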

Common Configurations

Stable Training (Default)

model_path: Qwen/Qwen3-8B
num_steps: 5000
batch_size: 32
learning_rate: 1.0e-6

adaptive_async:
  target_staleness: 0.15
  min_async_ratio: 0.1
  max_async_ratio: 0.7

algorithm:
  name: grpo
  group_size: 4

High Throughput

model_path: Qwen/Qwen3-8B
num_steps: 10000
batch_size: 64
learning_rate: 5.0e-7

adaptive_async:
  target_staleness: 0.3  # Allow more staleness
  max_async_ratio: 0.9   # More async

rollout:
  april:
    oversample_ratio: 2.0
    batch_timeout: 20.0

Maximum Stability

model_path: Qwen/Qwen3-8B
num_steps: 5000
batch_size: 16
learning_rate: 5.0e-7

adaptive_async:
  target_staleness: 0.05  # Very fresh data
  max_async_ratio: 0.3    # Mostly sync

algorithm:
  name: ppo
  clip_ratio: 0.1
  kl_penalty: 0.2

Validation

Flux validates all configuration at load time:

from flux import FluxConfig
from pydantic import ValidationError

try:
    config = FluxConfig(
        model_path="invalid",
        num_steps=-1,  # Invalid!
    )
except ValidationError as e:
    print(f"Configuration errors: {e}")

Common validation errors:

| Error | Cause | Fix |
| --- | --- | --- |
| `num_steps must be positive` | Negative step count | Use a positive integer |
| `learning_rate too high` | LR > 1e-4 | Use a smaller LR |
| `invalid algorithm name` | Typo in algorithm | Check spelling |
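
The three rules in the table can be illustrated with a small standalone checker. This is a sketch in plain Python, not Flux's Pydantic validators: the function name `check_config` is hypothetical, the set of algorithm names is an assumed partial list drawn from the algorithms mentioned on this page, and treating zero steps as invalid is an assumption based on the word "positive".

```python
# Assumed, partial list; the page mentions "PPO, GRPO, DPO, and other" algorithms.
KNOWN_ALGORITHMS = {"ppo", "grpo", "dpo"}
# Threshold taken from the "LR > 1e-4" row of the table.
MAX_LEARNING_RATE = 1e-4


def check_config(num_steps: int, learning_rate: float, algorithm: str) -> list:
    """Collect every rule violation instead of stopping at the first one."""
    errors = []
    if num_steps <= 0:
        errors.append("num_steps must be positive")
    if learning_rate > MAX_LEARNING_RATE:
        errors.append("learning_rate too high")
    if algorithm not in KNOWN_ALGORITHMS:
        errors.append("invalid algorithm name")
    return errors


# A config that breaks all three rules, then one that passes.
print(check_config(num_steps=-1, learning_rate=1e-3, algorithm="gpro"))
print(check_config(num_steps=1000, learning_rate=1e-6, algorithm="grpo"))
```

Collecting all violations before reporting mirrors how a Pydantic `ValidationError` lists every failing field at once, so a user can fix a config in one pass.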

Environment Variables

Any configuration value can be overridden with an environment variable:

# Format: FLUX_<PARAMETER> for top-level options, FLUX_<SECTION>_<PARAMETER> for nested ones
export FLUX_NUM_STEPS=2000
export FLUX_LEARNING_RATE=1e-6
export FLUX_ADAPTIVE_ASYNC_TARGET_STALENESS=0.2
export FLUX_ALGORITHM_NAME=ppo
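
Because section names such as `adaptive_async` contain underscores themselves, a variable name cannot be split on every underscore. The sketch below shows one way the names above could map to configuration paths by matching against known section names. The function `env_to_path` and the `SECTIONS` tuple are hypothetical, and how Flux actually resolves these names is not described on this page.

```python
PREFIX = "FLUX_"
# Section names taken from the YAML examples on this page (assumed, partial).
SECTIONS = ("adaptive_async", "rollout", "algorithm")


def env_to_path(name: str) -> tuple:
    """Map an environment variable name to a path into the config tree."""
    if not name.startswith(PREFIX):
        raise ValueError(f"{name} does not start with {PREFIX}")
    key = name[len(PREFIX):].lower()
    # Match whole section names rather than splitting on each underscore.
    for section in SECTIONS:
        if key.startswith(section + "_"):
            return (section, key[len(section) + 1:])
    # No section prefix matched, so treat it as a top-level option.
    return (key,)


for var in ("FLUX_NUM_STEPS", "FLUX_ADAPTIVE_ASYNC_TARGET_STALENESS", "FLUX_ALGORITHM_NAME"):
    print(var, "->", env_to_path(var))
```

Under these assumptions, `FLUX_NUM_STEPS` resolves to the top-level `num_steps` option, while `FLUX_ADAPTIVE_ASYNC_TARGET_STALENESS` resolves to `target_staleness` inside `adaptive_async`.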

Priority order (highest to lowest):

1. Explicit Python arguments
2. Environment variables
3. YAML file values
4. Default values
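
The priority order above amounts to layering four dictionaries, with each higher-priority layer overwriting only the keys it sets. The following is a minimal sketch of that idea. The helpers `deep_merge` and `resolve` are hypothetical names, not part of Flux, and the sample values are illustrative.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of base with override applied; nested dicts merge key by key."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve(defaults: dict, yaml_values: dict, env_values: dict, explicit: dict) -> dict:
    """Apply the layers from lowest to highest priority."""
    result = {}
    for layer in (defaults, yaml_values, env_values, explicit):
        result = deep_merge(result, layer)
    return result


resolved = resolve(
    defaults={
        "num_steps": 5000,
        "adaptive_async": {"target_staleness": 0.15, "max_async_ratio": 0.7},
    },
    yaml_values={"num_steps": 1000},
    env_values={"num_steps": 2000, "adaptive_async": {"target_staleness": 0.2}},
    explicit={"adaptive_async": {"max_async_ratio": 0.9}},
)
print(resolved)
```

Here the environment value for `num_steps` wins over the YAML value, and the explicit `max_async_ratio` wins over the default, while `target_staleness` keeps the environment override because no explicit argument sets it.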

Next Steps

Full Reference - All configuration options
Example Configs - Ready-to-use templates
Getting Started - Start training