---
title: LLM Proxy Overview
description: OpenAI-compatible proxy backed by Geppetto profile runtime configuration.
doc_version: 1
last_updated: 2026-07-02
---


`llm-proxy-server` exposes an OpenAI-compatible HTTP API and translates requests into Geppetto inference calls. Provider credentials and model routing live in Geppetto profile YAML; the proxy itself does not store API keys or provider routing tables.

The main runtime flow is:

1. Run the Glazed-backed `serve` command.
2. Load optional profile YAML from `--profiles`.
3. Build OpenAI-compatible model, completion, and chat-completion services from those profiles.
4. Start an HTTP server on `--listen`.
5. Serve `/healthz`, `/v1/models`, `/v1/completions`, and `/v1/chat/completions`.

Example:

```bash
llm-proxy-server serve --profiles examples/profiles.yaml --listen 127.0.0.1:8080
```