# Gym: Complete AI Training Platform by Zoo Labs Foundation

## 🚀 Overview
Gym is a production-ready AI training platform that pairs state-of-the-art training algorithms with aggressive memory and speed optimizations. Built by Zoo Labs Foundation, it covers the full workflow: training, fine-tuning, and serving large language models at scale.
## 🎯 Core Features

### Training Algorithms
| Algorithm | Description | Memory | Speed | Use Case |
|---|---|---|---|---|
| SFT | Supervised Fine-Tuning | Standard | Fast | General fine-tuning |
| DPO | Direct Preference Optimization | Efficient | Fast | Preference alignment |
| PPO | Proximal Policy Optimization | High | Moderate | RLHF |
| ORPO | Odds Ratio Preference Optimization | Moderate | Fast | Simplified RLHF |
| GRPO | Group Relative Policy Optimization | -40% | Fast | Critic-free RLHF (DeepSeek's method) |
| GSPO | Group Sequence Policy Optimization | -60% | Fast | Sequence-level RLHF (Alibaba's Qwen3 method) |
| KTO | Kahneman-Tversky Optimization | Efficient | Fast | Human preference |
| RLOO | Reinforcement Learning with Leave-One-Out | Moderate | Fast | Online RLHF |
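All of these algorithms are selected through the same training interface. Assuming the `TrainingConfig`/`Trainer` API shown in the Quick Start below, switching methods is a one-line change (a minimal sketch; the exact algorithm strings accepted are an assumption based on the GSPO example later on this page):

```python
from gym import Trainer, TrainingConfig

# Pick any row from the table above via the `algorithm` field;
# "dpo" here is illustrative.
config = TrainingConfig(
    model="Qwen/Qwen3-4B",
    algorithm="dpo",  # e.g. "sft", "ppo", "orpo", "grpo", "gspo", "kto", "rloo"
)
Trainer(config).train(dataset="alpaca_gpt4_en")
```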
### Model Architecture Support

#### Zen Model Family (Qwen3-based)
- Nano (0.6B): Ultra-lightweight for edge deployment
- Eco (4B): Balanced performance matching Qwen2.5-7B
- Coder (7B/30B/480B): Specialized for code generation
  - Standard: 30B total, 3.3B active (MoE)
  - Max: 480B total, 35B active (MoE)
- Thinking: variants with CoT reasoning
- Omni (14B/30B): Multimodal capabilities
- Next (32B/80B): Ultra-sparse MoE with 512 experts
### Quantization Technologies

#### BitDelta (ZIP-7)
- 1-bit quantization of fine-tune deltas
- 10× memory reduction for personalized models
- 60% reduction in jailbreak risks
- Byzantine-robust community aggregation
```python
from gym.quantization import BitDeltaConfig, BitDeltaQuantizer

config = BitDeltaConfig(
    bits=1,                 # 1-bit signs for the fine-tune deltas
    group_size=128,         # weights per scaling group
    safety_threshold=0.6,   # threshold for safety screening
    enable_deltasoup=True,  # allow community aggregation
)
quantizer = BitDeltaQuantizer(config)
```
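To make the mechanism concrete, here is a minimal PyTorch sketch of the 1-bit delta idea itself; this is an illustration of the technique, not Gym's internals:

```python
import torch

def bitdelta_compress(base_w: torch.Tensor, tuned_w: torch.Tensor, group_size: int = 128):
    """Keep only the sign of each fine-tune delta plus one scale per group."""
    delta = (tuned_w - base_w).reshape(-1, group_size)  # assumes numel divides group_size
    signs = torch.sign(delta)                            # the 1-bit component
    scales = delta.abs().mean(dim=1, keepdim=True)       # per-group magnitude
    return signs, scales

def bitdelta_restore(base_w: torch.Tensor, signs: torch.Tensor, scales: torch.Tensor):
    """Reconstruct an approximate fine-tune: base + scale * sign(delta)."""
    return base_w + (signs * scales).reshape(base_w.shape)
```

Storing sign bits and per-group scales instead of full-precision deltas is what yields the ~10× memory reduction for per-user fine-tunes.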
#### DeltaQuant
- Flexible quantization (INT2/4/8, Binary, Ternary)
- Per-channel and per-tensor quantization
- Mixed precision support
- Calibration-based optimization
```python
from gym.quantization import DeltaQuantConfig, QuantMethod

config = DeltaQuantConfig(
    method=QuantMethod.INT4,   # INT2/INT4/INT8, Binary, or Ternary
    per_channel=True,          # one scale per output channel
    calibration_samples=256,   # data used for calibration-based optimization
)
```
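Per-channel quantization fits a separate scale to each output channel rather than one scale per tensor, which typically preserves accuracy better at the same bit width; the calibration samples are the representative data used to fit those scales.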
#### DeltaSoup
- Community-driven model improvement
- Byzantine-robust aggregation
- Differential privacy support
- Contributor rewards system
```python
from gym.quantization import DeltaSoupConfig, AggregationMethod

config = DeltaSoupConfig(
    method=AggregationMethod.BYZANTINE_ROBUST,  # tolerate malicious contributions
    differential_privacy=True,                  # add privacy noise to contributions
    enable_rewards=True,                        # track contributor rewards
)
```
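Byzantine-robust aggregation bounds the influence of any single contribution, so a minority of malicious or corrupted deltas cannot steer the merged model, while differential privacy adds calibrated noise so that individual contributions cannot be recovered from the aggregate.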
### Optimization Features

#### Memory Optimization
- Unsloth: 2-3× training speedup
- Flash Attention 2: Efficient attention computation
- Gradient Checkpointing: Trade compute for memory
- QLoRA: 4-bit quantized LoRA training
- Mixed Precision: FP16/BF16 training
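As a rough sizing example, 4-bit weights take a quarter of the memory of FP16: a 4B-parameter model's weights drop from about 8 GB to about 2 GB, which is where most of QLoRA's savings come from.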
#### Performance Features
- Multi-GPU Support: DDP, FSDP, DeepSpeed
- Dynamic Batching: Adaptive batch size
- Gradient Accumulation: Simulate larger batches
- CPU Offloading: Use system RAM for large models
- KV Cache Optimization: Efficient inference
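For example, with gradient accumulation a per-device batch of 4, 8 accumulation steps, and 4 GPUs give an effective batch size of 4 × 8 × 4 = 128, with no extra activation memory per step.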
### Infrastructure

#### Training Infrastructure
- Distributed Training: Scale to multiple nodes
- Checkpoint Management: Automatic save/resume
- Experiment Tracking: Weights & Biases, TensorBoard
- Hyperparameter Tuning: Optuna integration
#### Serving Infrastructure
- Model Export: GGUF, ONNX, TorchScript
- API Server: FastAPI-based inference
- Batch Inference: Efficient bulk processing
- Model Merging: Combine LoRA adapters
## 📦 Installation

```bash
# Clone repository
git clone https://github.com/zooai/gym
cd gym

# Install with all features
pip install -e ".[all]"

# Or minimal installation
pip install -e .
```
## 🚀 Quick Start

### Command Line Interface

```bash
# Train with GSPO (Qwen3 optimized)
gym train models/nano/configs/gspo_training.yaml

# Fine-tune with BitDelta quantization
gym train configs/bitdelta_training.yaml

# Launch WebUI
gym webui --port 8080

# Export model
gym export --model saves/model --format gguf
```
### Python API

```python
from gym import Trainer, TrainingConfig

# Configure training
config = TrainingConfig(
    model="Qwen/Qwen3-4B",
    algorithm="gspo",
    quantization="bitdelta",
    use_unsloth=True,
)

# Initialize trainer
trainer = Trainer(config)

# Train model
trainer.train(dataset="alpaca_gpt4_en")

# Apply community improvements
trainer.apply_deltasoup()

# Export quantized model
trainer.export("model.gguf", quantization="bitdelta")
```
## 🎨 WebUI Features
The WebUI provides a comprehensive interface with:
- Black Monochromatic Theme: Professional Zoo Labs branding
- Real-time Training Monitoring: Loss curves, metrics
- Model Management: Load, save, merge models
- Dataset Browser: Preview and filter datasets
- Hyperparameter Tuning: Interactive configuration
- Inference Playground: Test models interactively
## 🔬 Advanced Features

### Personalized Models
Create millions of personalized model variants using BitDelta:
```python
from gym.personalization import PersonalizedTrainer

trainer = PersonalizedTrainer(base_model="Qwen/Qwen3-4B")

# user_prefs: per-user preference data collected by your application
trainer.create_variant(user_id="user_123", preferences=user_prefs)
trainer.serve_variant("user_123")  # 10× memory efficient
```
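Because each variant is stored as a 1-bit delta against the shared base weights (see BitDelta above), the base model is loaded once and per-user deltas are applied on top, which is what makes serving many variants at the quoted 10× memory saving practical.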
### Community Learning
Aggregate improvements from multiple users:
```python
from gym.quantization import DeltaSoup

soup = DeltaSoup(config)  # the DeltaSoupConfig from the example above
soup.contribute(user_id="alice", model=model_alice)
soup.contribute(user_id="bob", model=model_bob)

# Aggregation proceeds once at least `min_contributors` users have contributed
aggregated = soup.aggregate(min_contributors=3)
```
### Safety Features
- Automatic jailbreak detection and prevention
- Content filtering and safety checks
- Byzantine-robust aggregation
- Differential privacy support
## 📊 Benchmarks
| Model | Method | Memory | Speed | Quality |
|---|---|---|---|---|
| Qwen3-4B | Standard | 16GB | 1.0× | 100% |
| Qwen3-4B | Unsloth | 12GB | 2.3× | 100% |
| Qwen3-4B | QLoRA | 6GB | 1.5× | 99% |
| Qwen3-4B | BitDelta | 1.6GB | 3.0× | 98% |
| Qwen3-4B | DeltaQuant-INT4 | 4GB | 2.0× | 99% |
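The BitDelta row is consistent with the 10× figure quoted earlier: the 16 GB standard footprint compresses to roughly 1.6 GB of sign bits and scales, at a reported 2% quality cost.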
## 🛠️ Configuration Examples

### GSPO Training (Qwen3 Optimized)

```yaml
model: Qwen/Qwen3-4B
algorithm: gspo
group_size: 8
sequence_parallel: true
use_unsloth: true
quantization:
  method: bitdelta
  bits: 1
  enable_deltasoup: true
```
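Compared with token-level methods such as GRPO, GSPO computes importance ratios and applies clipping at the sequence level, which is reported to stabilize RL training, particularly for MoE models; `group_size` sets how many sampled responses per prompt are scored against one another.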
### Multi-GPU Training
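A hypothetical sketch of a multi-GPU configuration, extrapolated from the DDP/FSDP/DeepSpeed support listed under Performance Features; every key below other than `model` and `algorithm` is an assumption, not confirmed Gym syntax:

```yaml
model: Qwen/Qwen3-4B
algorithm: sft
distributed:
  backend: fsdp              # assumed key; or: ddp, deepspeed
  num_nodes: 2               # assumed key
gradient_accumulation_steps: 8
mixed_precision: bf16
```

Launch it with the same `gym train <config>.yaml` command shown in the Quick Start.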
### Production Serving
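A minimal sketch of one serving path, using the vLLM integration listed below; the model path is an assumption carried over from the export example in the Quick Start:

```python
from vllm import LLM, SamplingParams

# Serve a model trained and saved with Gym; "saves/model" mirrors the
# Quick Start export example and is an assumption here.
llm = LLM(model="saves/model")
params = SamplingParams(temperature=0.7, max_tokens=256)

outputs = llm.generate(["Explain BitDelta in one sentence."], params)
print(outputs[0].outputs[0].text)
```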
## 🔗 Integrations
- Hugging Face Hub: Direct model loading/saving
- Weights & Biases: Experiment tracking
- LangChain: Chain-of-thought integration
- vLLM: High-performance serving
- TensorRT: NVIDIA optimization
- ONNX Runtime: Cross-platform deployment
## 📚 Documentation
## 🏆 Why Choose Gym?
- Complete Solution: Everything from training to serving
- State-of-the-art: Latest algorithms (GRPO, GSPO)
- Memory Efficient: 10× reduction with BitDelta
- Production Ready: Battle-tested at scale
- Community Driven: DeltaSoup aggregation
- Safety First: Built-in safety features
- Zoo Ecosystem: Integrated with Zoo Labs tools
## 🤝 Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
## 📄 License
Apache 2.0 - See LICENSE for details.
## 🙏 Acknowledgments

Built on the shoulders of giants:

- Hugging Face Transformers
- DeepSpeed & FSDP
- Unsloth optimizations
- Flash Attention
- And the amazing open-source community
Gym by Zoo Labs Foundation - Training AI at the speed of thought 🚀
Copyright 2025 Zoo Labs Foundation Inc.