# Zen Model Configuration Summary

## Configuration Files Validated ✅

### Dense Models
- `zen-nano-0.6b`: 40K context, GSPO training
- `zen-eco-4b`: 256K context, GSPO training
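
As a rough illustration of the shared structure, here is a minimal sketch of what a dense-model config such as `zen_eco_4b.yaml` might contain. The key names and values below are assumptions for illustration only, not the actual schema used by these configs.

```yaml
# Hypothetical dense-model config sketch -- key names and values are
# illustrative assumptions, not the real zen_eco_4b.yaml contents.
model_name: zenlm/zen-eco-4b     # zenlm/zen-* naming convention
max_seq_length: 262144           # 256K context window
trainer: gspo                    # GSPO (Group Sequence Policy Optimization)
gspo:
  group_size: 8                  # responses sampled per prompt (assumed)
  clip_range: 0.2                # sequence-level clipping threshold (assumed)
  kl_coeff: 0.01                 # KL penalty weight (assumed)
learning_rate: 1.0e-5
```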
### MoE Models
- `zen-omni-30b-a3b`: 256K context, multimodal, GSPO training
- `zen-coder-30b-a3b`: 256K context, code-optimized, GSPO training
- `zen-coder-480b-a35b`: 256K context, max-scale code, GSPO training
- `zen-next-80b-a3b`: 256K context, next-gen architecture, GSPO training
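
The MoE configs would layer expert-routing settings on top of the same structure. The sketch below is again hypothetical: every key and value is an assumption, and the `a3b` suffix is read here as roughly 3B active parameters per token.

```yaml
# Hypothetical MoE additions -- key names and values are illustrative only.
model_name: zenlm/zen-coder-30b-a3b   # 30B total parameters, ~3B active per token
max_seq_length: 262144                # 256K context window
trainer: gspo
moe:
  num_experts: 128                    # total routed experts (assumed)
  num_experts_per_token: 8            # experts activated per token (assumed)
  router_aux_loss_coef: 0.001         # load-balancing loss weight (assumed)
```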
### Quantized Training
- `zen-nano-qlora`: 40K context, 4-bit QLoRA for efficient training
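
For the QLoRA variant, the base weights would be loaded in 4-bit and only low-rank adapters trained. A hedged sketch of what `zen_nano_qlora.yaml` could look like follows; all keys and values are assumed for illustration.

```yaml
# Hypothetical QLoRA config sketch -- keys and values are assumptions,
# not the actual zen_nano_qlora.yaml contents.
model_name: zenlm/zen-nano-0.6b
max_seq_length: 40960        # 40K context window
trainer: gspo
quantization:
  load_in_4bit: true         # base weights quantized to 4-bit
  quant_type: nf4            # NormalFloat4, standard for QLoRA
  double_quant: true         # quantize the quantization constants as well
lora:
  r: 16                      # adapter rank (assumed)
  alpha: 32                  # scaling factor (assumed)
  dropout: 0.05
```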
## Changes Made
- Created the `zen_coder_30b.yaml` configuration
- Created the `zen_coder_480b.yaml` configuration
- Updated `zen_nano_qlora.yaml` with the correct `zenlm/zen-*` naming
- Fixed the `zen_coder.yaml` symlink to point to `zen_coder_30b.yaml`
- Verified that all configs use a consistent structure and GSPO parameters
All configurations now follow the `zenlm/zen-*` naming convention, with GSPO parameters tuned for their respective model sizes.