Weight Synchronization
Flux synchronizes model weights between the training engine (Megatron) and the inference engine (SGLang). The transport it uses depends on where the two engines run.
Sync Methods
| Method | Latency | Use Case |
|---|---|---|
| CUDA IPC | ~10ms | Same node |
| NCCL | ~100ms | Cross-node |
| HTTP | ~1s | Fallback |
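The table implies a preference order: the fastest transport that the deployment topology allows. The sketch below illustrates that selection logic only. `SyncMethod` and `choose_sync_method` are hypothetical names, not part of Flux's API, and comparing hostnames is an assumed stand-in for however Flux actually detects co-location.

```python
from enum import Enum


class SyncMethod(Enum):
    """Transports from the table above (hypothetical enum, not Flux's API)."""
    CUDA_IPC = "cuda_ipc"  # ~10ms, same node
    NCCL = "nccl"          # ~100ms, cross-node
    HTTP = "http"          # ~1s, fallback


def choose_sync_method(
    trainer_host: str,
    inference_host: str,
    nccl_available: bool = True,
) -> SyncMethod:
    """Pick the lowest-latency transport the topology allows.

    Assumption: two engines are on the same node when their hostnames match.
    """
    if trainer_host == inference_host:
        # Same node: GPU memory can be shared directly between processes.
        return SyncMethod.CUDA_IPC
    if nccl_available:
        # Different nodes with a working NCCL setup between them.
        return SyncMethod.NCCL
    # Nothing faster is usable, so fall back to HTTP.
    return SyncMethod.HTTP


if __name__ == "__main__":
    print(choose_sync_method("node-a", "node-a"))
    print(choose_sync_method("node-a", "node-b"))
    print(choose_sync_method("node-a", "node-b", nccl_available=False))
```

Ordering the checks from fastest to slowest keeps HTTP as a last resort, which matches its "Fallback" role in the table.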
Delta Compression
Instead of resending the full model on every sync, Flux transfers only the weights that changed since the previous sync.
This typically reduces sync bandwidth by 60-80%.
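The source does not show how Flux detects or encodes changed weights, so the following is only a minimal sketch of the general idea, using plain Python lists in place of GPU tensors. `compute_delta`, `apply_delta`, and the `tol` threshold are hypothetical names introduced for illustration, and the sketch assumes both sides hold tensors with identical names and lengths.

```python
from typing import Dict, List, Tuple

Weights = Dict[str, List[float]]
# For each tensor name: (flat index, new value) pairs for the changed elements.
Delta = Dict[str, List[Tuple[int, float]]]


def compute_delta(old: Weights, new: Weights, tol: float = 0.0) -> Delta:
    """Collect only the elements whose value moved by more than `tol`."""
    delta: Delta = {}
    for name, new_vals in new.items():
        old_vals = old.get(name)
        if old_vals is None or len(old_vals) != len(new_vals):
            raise ValueError(f"tensor {name!r} is missing or has a different size")
        changed = [
            (i, v)
            for i, (o, v) in enumerate(zip(old_vals, new_vals))
            if abs(v - o) > tol
        ]
        if changed:  # unchanged tensors are omitted from the payload entirely
            delta[name] = changed
    return delta


def apply_delta(weights: Weights, delta: Delta) -> None:
    """Patch the receiver's copy in place with the transferred elements."""
    for name, updates in delta.items():
        target = weights[name]
        for i, v in updates:
            target[i] = v


if __name__ == "__main__":
    old = {"w": [1.0, 2.0, 3.0], "b": [0.5]}
    new = {"w": [1.0, 2.5, 3.0], "b": [0.5]}

    delta = compute_delta(old, new)
    receiver = {k: list(v) for k, v in old.items()}  # inference-side copy
    apply_delta(receiver, delta)
    assert receiver == new
```

With `tol=0.0` the receiver ends up with an exact copy of the new weights. A positive `tol` would shrink the payload further at the cost of small drift between the two engines; whether Flux makes that trade-off is not stated in the source.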