Build and evolve the multi-tenant inference ingress layer, including request normalization, tenant-aware adapter routing, token-aware admission control, and overload shedding decisions. Use when implementing or refactoring API handlers, rate-limit logic, capacity checks, queue handoff contracts, or gateway-level telemetry for LLM serving systems.
Implement gateway logic in this order:
1. Normalize each incoming request and extract its routing context (tenant_id, adapter_id, request_id, token budget).
2. Return 429 Too Many Requests for rate-limit or pressure rejections.
3. Return 503 Service Unavailable when the internal scheduler is unavailable.
4. Use scripts/token_estimator.py to sanity-check rough token estimates for prompt text.
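The admission flow above can be sketched as follows. This is a minimal illustration, not the repo's implementation: `RequestContext`, `AdmissionError`, `estimate_tokens`, and `admit` are hypothetical names, and the ~4-characters-per-token heuristic is a stand-in for the real `scripts/token_estimator.py` check.

```python
from dataclasses import dataclass


@dataclass
class RequestContext:
    """Routing context extracted during request normalization (illustrative)."""
    tenant_id: str
    adapter_id: str
    request_id: str
    token_budget: int


class AdmissionError(Exception):
    """Carries the HTTP status the gateway should return on rejection."""
    def __init__(self, status: int, reason: str):
        super().__init__(reason)
        self.status = status
        self.reason = reason


def estimate_tokens(prompt: str) -> int:
    # Rough heuristic (~4 chars/token); a real gateway would defer to
    # scripts/token_estimator.py for this sanity check.
    return max(1, len(prompt) // 4)


def admit(ctx: RequestContext, prompt: str,
          tenant_tokens_remaining: int, scheduler_up: bool) -> int:
    """Token-aware admission check; returns the estimated token cost
    or raises AdmissionError with the status the handler should emit."""
    if not scheduler_up:
        # Internal scheduler unavailable -> 503 Service Unavailable.
        raise AdmissionError(503, "scheduler unavailable")
    needed = estimate_tokens(prompt)
    if needed > ctx.token_budget or needed > tenant_tokens_remaining:
        # Rate-limit / pressure rejection -> 429 Too Many Requests.
        raise AdmissionError(429, "token budget exceeded")
    return needed
```

Keeping the rejection decision in one place makes the 429-vs-503 contract easy to test and keeps queue handoff and telemetry code from re-deriving status semantics.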