Abstract
The rapid adoption of large language models (LLMs) has shifted enterprise AI from model-centric experimentation to production-scale, policy-constrained deployments. Traditional "LLM-as-a-service" governance breaks down when LLMs are embedded in agentic execution loops that plan, act, observe, and adapt while calling external tools. This paper traces the architectural evolution from embedded wrappers and sidecar proxies to a multi-plane, dual-proxy gateway, in which lightweight edge proxies and heavyweight core proxies cooperate to provide low-latency guardrails, global policy enforcement, and verifiable attestation chains. We introduce a governance-as-code approach that compiles compliance workflows into WebAssembly (Wasm) modules. Edge proxies execute these modules, obtain cryptographic signatures from specialized microservices, and forward only fully attested prompts to the core proxy, which ultimately forwards them to the LLM. Micro-benchmarks show that Wasm-mediated validation adds ≤ 20 ns of overhead for CPU-bound tasks and ≈ 120 ns when serializing complex data types, which is negligible relative to LLM inference times. The design achieves auditable, decentralized governance with good performance, laying the groundwork for high-assurance, tool-using agentic AI in the enterprise.
| Field | Value |
|---|---|
| Original language | English (US) |
| Pages (from-to) | 122-130 |
| Number of pages | 9 |
| Journal | Procedia Computer Science |
| Volume | 272 |
| DOIs | |
| State | Published - 2025 |
| Event | 16th International Conference on Emerging Ubiquitous Systems and Pervasive Networks, EUSPN 2025 / 15th International Conference on Current and Future Trends of Information and Communication Technologies in Healthcare, ICTH 2025 - Istanbul, Turkey (Oct 28 2025 → Oct 30 2025) |
All Science Journal Classification (ASJC) codes
- General Computer Science