DeepSeek API Setup for Continuous Learning GRPO¶
Current Status¶
- ✅ API Key Saved: .env file created with your DeepSeek API key
- ✅ Gitignored: .env is ignored by git (won't be committed)
- ✅ API Key Valid: Key connects successfully to the DeepSeek API
- ⚠️ Balance Required: Account needs credits to run training
Add Credits to DeepSeek Account¶
Your API Key: stored in .env (redacted here)
Steps to Add Credits:
- Go to DeepSeek Platform: https://platform.deepseek.com/usage
- Log in with your account
- Click "Billing" or "Add Credits"
- Add credits (minimum $5 recommended)
Cost Estimates:
- Quick test (20 samples): ~$0.10
- Full training (100 samples, 3 epochs): ~$0.37
- Pricing: $0.14/1M input tokens, $0.28/1M output tokens
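The per-run estimates follow from the per-token prices. The sketch below shows the arithmetic only; the token counts passed to it are hypothetical, since this guide does not state how many tokens each run consumes, and the function name estimate_cost is made up for illustration.

```shell
#!/usr/bin/env bash
# Prices quoted in this guide, in USD per 1M tokens.
INPUT_PRICE_PER_M=0.14
OUTPUT_PRICE_PER_M=0.28

# estimate_cost INPUT_TOKENS OUTPUT_TOKENS
# Prints the estimated cost in USD, rounded to four decimal places.
estimate_cost() {
  awk -v in_tok="$1" -v out_tok="$2" \
      -v in_price="$INPUT_PRICE_PER_M" -v out_price="$OUTPUT_PRICE_PER_M" \
      'BEGIN { printf "%.4f\n", (in_tok * in_price + out_tok * out_price) / 1000000 }'
}

# Example with made-up token counts (1M input, 0.5M output):
#   estimate_cost 1000000 500000   # prints 0.2800
```

To calibrate the estimate for your own runs, substitute the token totals shown on the DeepSeek usage page after a quick test.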
Test the Setup¶
Once you've added credits, test the API:
cd /Users/z/work/zoo/gym
# Load API key from .env
source .env
# Run API verification test
python test_deepseek_api.py
Expected output:
Client initialized
Response: 8
Summary generated (450 chars)
Extracted 2 operations
Consolidated to 1 final operations
ALL TESTS PASSED - DeepSeek API is working correctly!
Run Training¶
After credits are added and the test passes:
Quick Test (20 samples, ~$0.10)¶
cd /Users/z/work/zoo/gym
source .env # Load API key
bash scripts/train_zen_eco_4b_grpo.sh alpaca_en_demo 1 5 20
Full Training (100 samples, ~$0.37)¶
cd /Users/z/work/zoo/gym
source .env # Load API key
bash scripts/train_zen_eco_4b_grpo.sh alpaca_en 3 5 100
Security Notes¶
API Key Protection:
- ✅ API key is in .env (gitignored)
- ✅ Never commit .env to git
- ✅ Never hardcode API keys in scripts
- ✅ Always use source .env to load keys
To verify gitignore:
To check if .env would be committed:
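Both checks can be sketched with standard git commands. This is a minimal example, assuming it is run from the repository root that contains .env; the helper names env_is_ignored and env_is_tracked are made up for illustration.

```shell
#!/usr/bin/env bash
# Succeeds (exit 0) when a .gitignore rule matches .env.
env_is_ignored() {
  git check-ignore -q .env
}

# Succeeds (exit 0) when .env is already tracked by git. A tracked file
# keeps being committed even if .gitignore lists it, so this should fail.
env_is_tracked() {
  git ls-files --error-unmatch .env >/dev/null 2>&1
}

# Typical usage from the repository root:
#   env_is_ignored && echo ".env is ignored"
#   env_is_tracked && echo "WARNING: .env is tracked; run: git rm --cached .env"
```

If the second check reports that .env is tracked, removing it from the index with git rm --cached .env keeps the local file while stopping future commits of it.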
Troubleshooting¶
Error: "Insufficient Balance" (402)¶
Problem: DeepSeek account has no credits
Solution: Add credits at https://platform.deepseek.com/usage
Error: "No LLM API key found"¶
Problem: API key not loaded from .env
Solution: Run source .env before training
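One possible cause worth ruling out: source .env only creates shell variables, so if the file contains plain KEY=value lines without export, child processes such as python will not see the key. The sketch below assumes .env uses shell-compatible KEY=value syntax; load_env is a hypothetical helper name, not part of the training scripts.

```shell
#!/usr/bin/env bash
# load_env [FILE]: source FILE (default: .env) with auto-export enabled, so
# every variable it assigns is passed on to child processes.
load_env() {
  local env_file="${1:-.env}"
  # Add ./ to bare file names so the shell does not search PATH for them.
  case "$env_file" in
    */*) ;;
    *) env_file="./$env_file" ;;
  esac
  if [ ! -f "$env_file" ]; then
    echo "load_env: $env_file not found" >&2
    return 1
  fi
  set -a          # auto-export every variable assigned from here on
  . "$env_file"
  set +a          # restore the default behaviour
}

# Usage:
#   load_env && python test_deepseek_api.py
```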
Error: "Invalid API key" (401)¶
Problem: The API key is wrong or has expired
Solution: Generate new key at https://platform.deepseek.com/api_keys
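The status codes above can be turned into a small lookup for use in scripts. This is an illustrative sketch: explain_status is a hypothetical helper, and the commented curl call assumes the key is exported as DEEPSEEK_API_KEY and that the https://api.deepseek.com/models endpoint is available, so check the current DeepSeek API reference before relying on it.

```shell
#!/usr/bin/env bash
# explain_status CODE: print the fix suggested in this troubleshooting
# section for a given HTTP status code.
explain_status() {
  case "$1" in
    200) echo "OK: key accepted" ;;
    401) echo "Invalid API key: generate a new key at https://platform.deepseek.com/api_keys" ;;
    402) echo "Insufficient balance: add credits at https://platform.deepseek.com/usage" ;;
    *)   echo "Unexpected status $1: see the DeepSeek API documentation" ;;
  esac
}

# Hypothetical usage (needs network access; variable name and endpoint are
# assumptions, see the note above):
#   code=$(curl -s -o /dev/null -w '%{http_code}' \
#     -H "Authorization: Bearer $DEEPSEEK_API_KEY" \
#     https://api.deepseek.com/models)
#   explain_status "$code"
```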
To check if .env is loaded:¶
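One way to do this without printing the key itself is sketched below. The default variable name DEEPSEEK_API_KEY is an assumption (use whatever name your .env defines), and check_key_loaded is a hypothetical helper. It reads the exported environment, which is what python will actually see.

```shell
#!/usr/bin/env bash
# check_key_loaded [VAR_NAME]: report whether the variable is exported,
# showing only its length and last four characters so the full key never
# appears in terminal history or logs.
check_key_loaded() {
  local name="${1:-DEEPSEEK_API_KEY}"   # assumed default name
  local value
  value="$(printenv "$name" || true)"
  if [ -z "$value" ]; then
    echo "$name is NOT set (or is empty) in the environment"
    return 1
  fi
  echo "$name is set (${#value} chars, ends in $(printf %s "$value" | tail -c 4))"
}

# Usage:
#   source .env
#   check_key_loaded
```

If the variable is reported as not set right after running source .env, the file most likely assigns the key without exporting it.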
Expected Performance¶
Based on Tencent Paper Results:
| Metric | Value |
|---|---|
| Performance Gain | +1-3% on math tasks |
| Training Cost | $0.37 for 100 samples |
| Training Time | ~10 minutes |
| Experiences Generated | 50-100 after 3 epochs |
Next Steps¶
- Add Credits: https://platform.deepseek.com/usage (minimum $5)
- Test API: source .env && python test_deepseek_api.py
- Run Training: bash scripts/train_zen_eco_4b_grpo.sh alpaca_en_demo 1 5 20
- Analyze Results: Check ./output/zen-eco-4b-grpo/experiences.json
Last Updated: October 28, 2025
API Key: stored in .env (redacted here)
Status: Ready (needs credits)