Skip to content

DeepSeek API Setup for Continuous Learning GRPO

โœ… Current Status

  • โœ… API Key Saved: .env file created with your DeepSeek API key
  • โœ… Gitignored: .env is ignored by git (won't be committed)
  • โœ… API Key Valid: Key connects successfully to DeepSeek API
  • โš ๏ธ Balance Required: Account needs credits to run training

๐Ÿ’ฐ Add Credits to DeepSeek Account

Your API Key: sk-82accfbadb484ea7ad986510f88d27f5

Steps to Add Credits:

  1. Go to DeepSeek Platform: https://platform.deepseek.com/usage
  2. Log in with your account
  3. Click "Billing" or "Add Credits"
  4. Add credits (minimum $5 recommended)

Cost Estimates: - Quick test (20 samples): ~\(0.10** - Full training (100 samples, 3 epochs): **~\)0.37 - Pricing: $0.14/1M input tokens, $0.28/1M output tokens


๐Ÿงช Test the Setup

Once you've added credits, test the API:

cd /Users/z/work/zoo/gym

# Load API key from .env
source .env

# Run API verification test
python test_deepseek_api.py

Expected output:

โœ“ Client initialized
โœ“ Response: 8
โœ“ Summary generated (450 chars)
โœ“ Extracted 2 operations
โœ“ Consolidated to 1 final operations
โœ“ ALL TESTS PASSED - DeepSeek API is working correctly!


๐Ÿš€ Run Training

After credits are added and the test passes:

Quick Test (20 samples, ~$0.10)

cd /Users/z/work/zoo/gym
source .env  # Load API key
bash scripts/train_zen_eco_4b_grpo.sh alpaca_en_demo 1 5 20

Full Training (100 samples, ~$0.37)

cd /Users/z/work/zoo/gym
source .env  # Load API key
bash scripts/train_zen_eco_4b_grpo.sh alpaca_en 3 5 100

๐Ÿ”’ Security Notes

API Key Protection: - โœ… API key is in .env (gitignored) - โœ… Never commit .env to git - โœ… Never hardcode API keys in scripts - โœ… Always use source .env to load keys

To verify gitignore:

git status .env
# Should show: ".env is gitignored"

To check if .env would be committed:

git check-ignore .env
# Should output: ".env"


๐Ÿ› ๏ธ Troubleshooting

Error: "Insufficient Balance" (402)

Problem: DeepSeek account has no credits
Solution: Add credits at https://platform.deepseek.com/usage

Error: "No LLM API key found"

Problem: API key not loaded from .env
Solution: Run source .env before training

Error: "Invalid API key" (401)

Problem: Wrong API key or expired
Solution: Generate new key at https://platform.deepseek.com/api_keys

To check if .env is loaded:

echo $DEEPSEEK_API_KEY
# Should show: sk-82accfbadb484ea7ad986510f88d27f5

๐Ÿ“Š Expected Performance

Based on Tencent Paper Results:

Metric Value
Performance Gain +1-3% on math tasks
Training Cost $0.37 for 100 samples
Training Time ~10 minutes
Experiences Generated 50-100 after 3 epochs

๐Ÿ“ Next Steps

  1. Add Credits: https://platform.deepseek.com/usage (minimum $5)
  2. Test API: source .env && python test_deepseek_api.py
  3. Run Training: bash scripts/train_zen_eco_4b_grpo.sh alpaca_en_demo 1 5 20
  4. Analyze Results: Check ./output/zen-eco-4b-grpo/experiences.json

Last Updated: October 28, 2025
API Key: sk-82accfbadb484ea7ad986510f88d27f5
Status: Ready (needs credits)