When one of our customers ran a 5M-message broadcast, we had a choice: dispatch every job into a single Redis Stream and let the agents fight, or build a real scheduler.
We built the scheduler.
The design has three layers. First, a per-tenant token bucket sized to the customer's plan. Second, a per-node bucket sized to the node's warmup day. Third, a global fair-queue that round-robins across active tenants.
The key insight is that no single layer is enough. Per-tenant limits alone still let one customer point its whole allowance at a single node. Per-node limits alone let one customer eat a node's entire budget and starve everyone else. The combination guarantees both per-tenant fairness and per-node deliverability protection.
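A rough sketch of how the layers might compose, under simplifying assumptions: plain integer budgets stand in for the token buckets (no refill), and `Job`, `Scheduler`, `Enqueue`, and `Next` are hypothetical names rather than our actual API. A job is dispatched only when both its tenant and its target node have budget, and tenants are visited round-robin.

```go
package main

// Job is a hypothetical unit of work: one message bound for one sending node.
type Job struct {
	Tenant string
	Node   string
}

// Scheduler sketches the three layers. Integer budgets are a stand-in for
// per-tenant and per-node token buckets.
type Scheduler struct {
	tenantTokens map[string]int   // layer 1: per-tenant budget
	nodeTokens   map[string]int   // layer 2: per-node budget
	queues       map[string][]Job // per-tenant FIFO queues
	order        []string         // layer 3: round-robin order of tenants
	next         int              // index in order where the next scan starts
}

func NewScheduler(tenantTokens, nodeTokens map[string]int) *Scheduler {
	return &Scheduler{
		tenantTokens: tenantTokens,
		nodeTokens:   nodeTokens,
		queues:       make(map[string][]Job),
	}
}

// Enqueue appends a job to its tenant's queue, registering the tenant in the
// round-robin order the first time it is seen.
func (s *Scheduler) Enqueue(j Job) {
	if _, ok := s.queues[j.Tenant]; !ok {
		s.order = append(s.order, j.Tenant)
	}
	s.queues[j.Tenant] = append(s.queues[j.Tenant], j)
}

// Next scans tenants round-robin and returns the first head-of-queue job
// whose tenant AND node both have budget. Tokens are taken only when both
// checks pass, so a blocked job never burns budget.
func (s *Scheduler) Next() (Job, bool) {
	for i := 0; i < len(s.order); i++ {
		idx := (s.next + i) % len(s.order)
		tenant := s.order[idx]
		q := s.queues[tenant]
		if len(q) == 0 {
			continue
		}
		j := q[0]
		if s.tenantTokens[tenant] < 1 || s.nodeTokens[j.Node] < 1 {
			continue // simplification: a blocked head job blocks its tenant
		}
		s.tenantTokens[tenant]--
		s.nodeTokens[j.Node]--
		s.queues[tenant] = q[1:]
		s.next = (idx + 1) % len(s.order)
		return j, true
	}
	return Job{}, false
}
```

If tenant `a` queues three jobs and tenant `b` queues one, all for a node with a budget of 3, the dispatch order is a, b, a and then nothing: `b` is not starved by the larger tenant, and the node never exceeds its budget. Only the head of each tenant's queue is inspected here, which is a deliberate simplification.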
The implementation is about 300 lines of Go. It runs as a goroutine inside the central process, and Redis Streams provide the durable layer.
