token starvation
definition: a condition in which long-lived token streams (llm responses) stall or slow down because background system activity competes with them for shared resources.
- symptoms: stalled streams, increased time-to-first-byte, slowdowns that coincide with heavy tool calls.
- detection: heuristics that correlate stream latency with mixed traffic patterns under high concurrency.
- mitigation: prioritizing stream traffic, rate-limiting background workers.
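
example: a minimal python sketch of one detection heuristic and one mitigation from the list above. it is an illustration under stated assumptions, not a reference implementation. the names `analyse_stream`, `StreamReport`, and `Dispatcher`, and all thresholds (`ttfb_limit`, `stall_factor`, `min_stall`, `background_per_tick`), are hypothetical and would need tuning for a real system.

```python
import heapq
from dataclasses import dataclass
from itertools import count
from statistics import median
from typing import List, Sequence, Tuple


@dataclass
class StreamReport:
    ttfb: float         # seconds from request start to first token
    median_gap: float   # typical inter-token gap in seconds
    stalls: List[int]   # indices of tokens that arrived after a stall
    starved: bool       # True if the heuristic flags the stream


def analyse_stream(
    request_start: float,
    token_times: Sequence[float],
    ttfb_limit: float = 1.0,    # assumed threshold, not from the source
    stall_factor: float = 5.0,  # gap > stall_factor * median counts as a stall
    min_stall: float = 0.25,    # floor so tiny gaps are never flagged
) -> StreamReport:
    """Detection: flag a stream with a slow first token or long gaps."""
    if not token_times:
        raise ValueError("token_times must contain at least one timestamp")
    ttfb = token_times[0] - request_start
    gaps = [b - a for a, b in zip(token_times, token_times[1:])]
    med = median(gaps) if gaps else 0.0
    threshold = max(stall_factor * med, min_stall)
    stalls = [i + 1 for i, g in enumerate(gaps) if g > threshold]
    return StreamReport(ttfb, med, stalls, ttfb > ttfb_limit or bool(stalls))


STREAM, BACKGROUND = 0, 1  # lower value is served first


class Dispatcher:
    """Mitigation: serve stream work first, cap background work per tick."""

    def __init__(self, background_per_tick: int = 1) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._seq = count()  # tie-breaker keeps FIFO order within a class
        self.background_per_tick = background_per_tick

    def submit(self, kind: int, item: str) -> None:
        heapq.heappush(self._heap, (kind, next(self._seq), item))

    def tick(self, capacity: int) -> List[str]:
        """Return up to `capacity` items for this scheduling round."""
        served: List[str] = []
        bg_used = 0
        while self._heap and len(served) < capacity:
            kind, seq, item = heapq.heappop(self._heap)
            if kind == BACKGROUND and bg_used >= self.background_per_tick:
                # Everything left in the heap is background work too,
                # so put this item back and stop for this round.
                heapq.heappush(self._heap, (kind, seq, item))
                break
            if kind == BACKGROUND:
                bg_used += 1
            served.append(item)
        return served
```

usage: pass per-stream token arrival timestamps to `analyse_stream` and inspect `starved` and `stalls`; submit work to `Dispatcher` with `STREAM` or `BACKGROUND` and call `tick` once per scheduling round. a strict priority order with a per-round background cap is only one possible policy. note that if stream work alone fills every round, background work never runs, so a weighted scheme may be preferable where background jobs must make progress.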