LLM Proxy Overview

OpenAI-compatible proxy backed by Geppetto profile runtime configuration.

Topicllm-proxyopenaigeppettollm-proxy-serverllm-proxy-server servelistenprofiles

llm-proxy-server exposes an OpenAI-compatible HTTP API and translates requests into Geppetto inference calls. Provider credentials and model routing live in Geppetto profile YAML; the proxy itself does not store API keys or provider routing tables.

The main runtime flow is:

Run the Glazed-backed serve command.
Load optional profile YAML from --profiles.
Build OpenAI-compatible model, completion, and chat-completion services from those profiles.
Start an HTTP server on --listen.
Serve /healthz, /v1/models, /v1/completions, and /v1/chat/completions.

Example:

llm-proxy-server serve --profiles examples/profiles.yaml --listen 127.0.0.1:8080

LLM Proxy Overview

OpenAI-compatible proxy backed by Geppetto profile runtime configuration.

Sections

LLM Proxy Overview