Peak hours: 14:00–18:00 UTC+8. Off-peak: all other times. Monthly quota is equivalent to ~15–30× the subscription fee, converted at API pricing rates.
| Model | Quota (Peak: 14:00-18:00 UTC+8) | Quota (Off-Peak) | Temporary (thru June) |
|---|---|---|---|
| GLM-5.1 | 3× | 2× | 1× off-peak |
| GLM-5-Turbo | 3× | 2× | 1× off-peak |
| GLM-4.7 | 1× | 1× | 1× |
| GLM-4.5-Air | 1× | 1× | 1× |
Principle: use the flagship model (GLM-5.1) where reasoning quality directly impacts output, and the 1× models (GLM-4.7, GLM-4.5-Air) for routine or lightweight work to conserve quota.
| Agent | Model | Rationale | Quota |
|---|---|---|---|
| build | GLM-5.1 | Primary agent — flagship for complex coding tasks | 2-3× |
| plan | GLM-5.1 | Deepest reasoning needed for planning complex changes | 2-3× |
| general | GLM-4.7 | Subagent multi-step tasks — strong coding/reasoning at 1× | 1× |
| explore | GLM-4.7 | Code exploration — good understanding, 1× quota | 1× |
| compaction | GLM-5.1 | Critical for context quality — needs best reasoning | 2-3× |
| title | GLM-4.5-Air | Simple task, lightweight sufficient | 1× |
| summary | GLM-4.5-Air | Simple task, lightweight sufficient | 1× |
- Enabled (default) on build, plan, general, explore, compaction — these chain multiple tool calls and benefit from interleaved reasoning between steps.
- Disabled on title and summary — trivial tasks that don't need reasoning, disabling saves quota and latency.
- Z.AI Pricing — per-token API costs (not directly applicable to coding plans but shows relative model cost).
- GLM Coding Plan Overview — quota multipliers, plan tiers, and peak/off-peak rules.
- OpenCode Agents Docs — how to configure per-agent model overrides.