Flux

Adaptive Post-Training Framework for LLMs

The best of all worlds — Synchronous stability + Asynchronous efficiency + Native simplicity

Get Started Tutorials API Reference


Why Flux?

Existing RLHF frameworks force you to choose between stability (synchronous training) and efficiency (asynchronous training). Flux breaks this false dichotomy with adaptive async control that dynamically adjusts based on training dynamics.

  • Adaptive Async


    Dynamically adjusts sync/async ratio based on measured staleness. Get 85% GPU utilization with synchronous-level stability.

    Learn more

  • Native Performance


    Direct Megatron-LM + SGLang integration without Ray overhead. Maximum performance with minimal abstraction.

    Architecture

  • Algorithm Agnostic


    Support for PPO, GRPO, DPO, REINFORCE, DAPO, RLOO, and easy extensibility for custom algorithms.

    Algorithms

  • Simple & Extensible


    Less than 5,000 lines of core code. Easy to understand, debug, and extend for your research needs.

    Contributing


The Spectrum, Not a Binary Choice

Sync ◄────────────────────────────────────────────────────► Async

     VERL        ████████████░░░░░░░░░░░░░░░░░░  Stable but slow
     AReaL       ░░░░░░░░░░░░░░░░░░████████████  Fast but risky
     Flux        ◄═══════ adapts here ═══════►  Best of both

Flux treats the sync/async ratio as a continuous control variable, not a binary choice. A PID controller maintains your target staleness level, automatically adjusting based on real-time training dynamics.
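In spirit, that control loop fits in a few lines. The sketch below is illustrative only: the class name, gains, and `update` signature are assumptions for this page, not Flux's actual API.

```python
class AsyncPIDController:
    """Sketch of a PID loop that nudges the async ratio toward a
    target staleness (illustrative names, not Flux's real API)."""

    def __init__(self, target_staleness=0.15, kp=0.5, ki=0.05, kd=0.1,
                 min_ratio=0.1, max_ratio=0.9):
        self.target = target_staleness
        self.kp, self.ki, self.kd = kp, ki, kd
        self.min_ratio, self.max_ratio = min_ratio, max_ratio
        self.integral = 0.0
        self.prev_error = 0.0
        self.async_ratio = 0.5  # start mid-spectrum

    def update(self, measured_staleness):
        # Positive error => batches too stale => back off the async ratio.
        error = measured_staleness - self.target
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        correction = (self.kp * error
                      + self.ki * self.integral
                      + self.kd * derivative)
        self.async_ratio -= correction
        # Clamp to the configured band (cf. min/max_async_ratio above).
        self.async_ratio = max(self.min_ratio,
                               min(self.max_ratio, self.async_ratio))
        return self.async_ratio
```

Calling `update(0.3)` with a 0.15 target pushes the ratio below its 0.5 starting point; repeated readings at target let it settle.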


Quick Comparison

Aspect VERL AReaL Slime Flux
Sync Strategy Fixed sync Fixed async Both modes Adaptive
Orchestration Ray Custom HTTP asyncio
Training Backend Megatron/FSDP Custom Megatron Megatron
Inference Backend vLLM/SGLang Custom SGLang SGLang
Weight Sync Ray Object Store Custom CUDA IPC CUDA IPC
Staleness Handling N/A Staleness-aware APRIL Unified
Code Complexity ~15k LOC ~25k LOC ~8k LOC <5k LOC

Quick Start

Installation

pip install flux-rlhf

# Or from source
git clone https://github.com/flux-team/flux.git
cd flux && pip install -e ".[dev]"

Basic Training

from flux import FluxConfig, FluxTrainer

config = FluxConfig(
    model_path="Qwen/Qwen3-8B",
    adaptive_async={
        "target_staleness": 0.15,
        "min_async_ratio": 0.1,
        "max_async_ratio": 0.9,
    },
    algorithm="grpo",
)

trainer = FluxTrainer(config)
trainer.fit(prompts="data/prompts.jsonl")

Full Getting Started Guide


Supported Algorithms

Algorithm Type Best For
PPO On-policy Stable general training
GRPO On-policy Multi-sample efficiency
DPO Preference Direct preference learning
REINFORCE On-policy Simple baselines
DAPO On-policy High-variance rewards
RLOO On-policy Variance reduction

All Algorithms
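To illustrate what "easy extensibility" could look like, here is a hypothetical plug-in registry. The decorator, class names, and loss signatures are assumptions for illustration, not Flux's real extension API.

```python
# Hypothetical algorithm registry; not Flux's actual plugin mechanism.
ALGORITHMS = {}

def register_algorithm(name):
    """Class decorator that adds an algorithm to a global registry."""
    def wrap(cls):
        ALGORITHMS[name] = cls
        return cls
    return wrap

@register_algorithm("reinforce")
class Reinforce:
    """Plain REINFORCE: each sample contributes -logprob * reward."""
    def loss(self, logprobs, rewards):
        return -sum(lp * r for lp, r in zip(logprobs, rewards)) / len(rewards)

@register_algorithm("rloo")
class RLOO(Reinforce):
    """REINFORCE Leave-One-Out: each sample is advantaged against the
    mean reward of the *other* samples in its group."""
    def loss(self, logprobs, rewards):
        n = len(rewards)
        total = sum(rewards)
        advantages = [r - (total - r) / (n - 1) for r in rewards]
        return -sum(lp * a for lp, a in zip(logprobs, advantages)) / n
```

A custom algorithm would then be a new decorated class with its own `loss`, looked up by name at config time (`algorithm="rloo"`).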


Architecture

graph TB
    subgraph Control["Adaptive Control Plane"]
        AC[Adaptive Async Controller]
        BC[Smart Batch Composer]
        SM[Staleness Monitor]
    end

    subgraph Coordinator["Lightweight Coordinator"]
        CO[FluxCoordinator]
        WS[Weight Sync Manager]
    end

    subgraph Engines["Native Execution Engines"]
        ME[Megatron Engine]
        SG[SGLang Server]
    end

    AC --> CO
    BC --> CO
    SM --> AC
    CO --> ME
    CO --> SG
    WS --> ME
    WS --> SG
    ME <-->|CUDA IPC| SG

Architecture Deep Dive
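Because orchestration is plain asyncio rather than Ray, the coordinator's core can be pictured as two concurrent tasks coupled by a bounded queue, where the queue size caps how far rollouts run ahead of training. The sketch below is a conceptual illustration with made-up names, not Flux's actual coordinator.

```python
import asyncio

async def generator(queue, steps=4):
    """Stand-in for the SGLang rollout loop: produce versioned batches."""
    for version in range(steps):
        await asyncio.sleep(0)  # placeholder for inference work
        await queue.put({"version": version, "samples": [f"s{version}"]})

async def trainer(queue, steps=4):
    """Stand-in for the Megatron training loop: consume batches in order."""
    consumed = []
    for _ in range(steps):
        batch = await queue.get()
        consumed.append(batch["version"])
    return consumed

async def main():
    # A bounded queue means bounded version lag between the two loops.
    queue = asyncio.Queue(maxsize=2)
    gen = asyncio.create_task(generator(queue))
    order = await trainer(queue)
    await gen
    return order
```

Running `asyncio.run(main())` yields batches in FIFO order; widening `maxsize` trades more overlap (throughput) for more potential staleness, which is exactly the knob the adaptive controller turns.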


Performance Targets

Metric Target Description
GPU Utilization > 80% Measured via nvidia-smi
Throughput 2x VERL Samples per hour
Staleness Mean < 0.2 Combined staleness metric
Scaling > 85% at 64 GPUs Linear scaling efficiency
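As one concrete (assumed) reading of the staleness row: if a sample's staleness is its policy-version lag at training time, normalized to [0, 1] by a lag horizon, the mean target can be checked with a few lines. This definition is an illustration only; Flux's combined metric may be defined differently.

```python
def mean_staleness(sample_versions, current_version, horizon=10):
    """Hypothetical combined staleness: per-sample policy-version lag,
    capped at `horizon` and normalized to [0, 1], then averaged."""
    lags = [min(current_version - v, horizon) / horizon
            for v in sample_versions]
    return sum(lags) / len(lags)
```

A batch drawn mostly from the current and previous policy versions, e.g. `mean_staleness([9, 9, 8, 10], current_version=10)`, comes out at 0.1, comfortably under the < 0.2 target.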

Community

  • GitHub


    Star the repo, report issues, and contribute code.

    flux-team/flux

  • Discord


    Join our community for discussions and support.

    Join Discord

  • Documentation


    Comprehensive guides, tutorials, and API reference.

    Read the Docs


Citation

If you use Flux in your research, please cite:

@software{flux2025,
  title  = {Flux: An Adaptive Post-Training Framework for LLMs},
  year   = {2025},
  url    = {https://github.com/flux-team/flux}
}

Flux: Where stability meets efficiency

Apache 2.0 License · Release Notes · Contributing