A Token Window, also called a context window, is the maximum number of tokens a language model can process in a single request, typically counting both the input (prompt, conversation history, retrieved content) and the output the model generates.
Voice AI platforms manage what goes into the Token Window so that important conversation history, retrieved knowledge, and user context are retained, while balancing inference cost, latency, and model performance.
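One common way to keep a conversation inside a Token Window is to drop the oldest turns first. The sketch below illustrates that idea only; it is not how any particular platform does it. The names `count_tokens`, `fit_to_window`, and `reserve_for_reply` are hypothetical, and the word-count tokenizer is a rough stand-in for a model's real tokenizer, which will give different counts.

```python
from typing import List


def count_tokens(text: str) -> int:
    """Assumption: approximate tokens as whitespace-separated words.
    A real system would use the model's own tokenizer instead."""
    return len(text.split())


def fit_to_window(
    system_prompt: str,
    turns: List[str],
    max_tokens: int,
    reserve_for_reply: int,
) -> List[str]:
    """Return the most recent turns that fit in the window alongside the
    system prompt, leaving room for the model's reply."""
    budget = max_tokens - reserve_for_reply - count_tokens(system_prompt)
    kept: List[str] = []
    # Walk backwards so the newest turns are considered first.
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if cost > budget:
            break  # this turn and everything older is dropped
        kept.append(turn)
        budget -= cost
    kept.reverse()  # restore chronological order
    return kept


if __name__ == "__main__":
    history = [
        "user: hi there",
        "agent: hello how can I help",
        "user: what is my balance",
    ]
    print(fit_to_window("You are a helpful voice agent.", history, 20, 3))
```

Dropping whole turns from the oldest end is the simplest policy. Summarizing older turns or pinning key facts are alternatives that trade extra processing for better recall.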