Engine-only profile registries for resolving `InferenceSettings` in Geppetto.
Geppetto now treats profiles as engine profiles only.
An engine profile answers one question:
Which InferenceSettings should be used to build the engine?
It does not answer questions about system prompts, middleware, tools, or runtime metadata.
Those are application concerns and belong in Pinocchio, GEC-RAG, Temporal Relationships, or another host.
The engine profile domain lives in pkg/engineprofiles.
The key types are:
- EngineProfile
- EngineProfileRegistry
- ResolvedEngineProfile
- ResolveInput
- InferenceSettings

Conceptually:
engine profile registry
-> resolve one engine profile slug
-> expand stack layers
-> merge engine settings
-> produce final InferenceSettings
An engine profile contains:
- slug
- display_name
- description
- stack
- inference_settings
- metadata
- extensions

Minimal YAML:
slug: provider-openai
profiles:
  default:
    slug: default
    inference_settings:
      api:
        api_keys:
          openai-api-key: demo-openai-key
      chat:
        api_type: openai
        engine: gpt-4o-mini
Stacked profile:
slug: team-agent
profiles:
  assistant:
    slug: assistant
    stack:
      - registry_slug: provider-openai
        profile_slug: default
    inference_settings:
      chat:
        api_type: openai-responses
        engine: gpt-5-mini
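The stacked example above can be walked through with a small sketch. This is not Geppetto's implementation: `Settings`, `Profile`, `StackRef`, `Registries`, `merge`, and `Resolve` are hypothetical, simplified stand-ins that only illustrate the "expand stack layers, then merge engine settings" flow, under the assumption that stack layers are applied first and the profile's own settings last.

```go
package main

import "fmt"

// Settings is a hypothetical, simplified stand-in for InferenceSettings.
// Empty strings and nil maps mean "not set in this layer".
type Settings struct {
	APIType string
	Engine  string
	APIKeys map[string]string
}

// StackRef points at another profile by registry slug and profile slug.
type StackRef struct{ RegistrySlug, ProfileSlug string }

// Profile is a hypothetical, simplified engine profile.
type Profile struct {
	Slug     string
	Stack    []StackRef
	Settings Settings
}

// Registries maps registry slug -> profile slug -> profile.
type Registries map[string]map[string]Profile

// merge applies overlay on top of base: set fields win, unset fields fall back.
func merge(base, overlay Settings) Settings {
	out := Settings{APIType: base.APIType, Engine: base.Engine, APIKeys: map[string]string{}}
	for k, v := range base.APIKeys {
		out.APIKeys[k] = v
	}
	if overlay.APIType != "" {
		out.APIType = overlay.APIType
	}
	if overlay.Engine != "" {
		out.Engine = overlay.Engine
	}
	for k, v := range overlay.APIKeys {
		out.APIKeys[k] = v
	}
	return out
}

// Resolve expands the stack layers in order, then applies the profile's own
// settings last. Cycle detection is omitted to keep the sketch short.
func Resolve(regs Registries, registrySlug, profileSlug string) (Settings, error) {
	p, ok := regs[registrySlug][profileSlug]
	if !ok {
		return Settings{}, fmt.Errorf("profile %s/%s not found", registrySlug, profileSlug)
	}
	acc := Settings{}
	for _, ref := range p.Stack {
		layer, err := Resolve(regs, ref.RegistrySlug, ref.ProfileSlug)
		if err != nil {
			return Settings{}, err
		}
		acc = merge(acc, layer)
	}
	return merge(acc, p.Settings), nil
}

// exampleRegistries mirrors the two YAML registries shown above.
func exampleRegistries() Registries {
	return Registries{
		"provider-openai": {"default": {
			Slug: "default",
			Settings: Settings{
				APIType: "openai",
				Engine:  "gpt-4o-mini",
				APIKeys: map[string]string{"openai-api-key": "demo-openai-key"},
			},
		}},
		"team-agent": {"assistant": {
			Slug:     "assistant",
			Stack:    []StackRef{{RegistrySlug: "provider-openai", ProfileSlug: "default"}},
			Settings: Settings{APIType: "openai-responses", Engine: "gpt-5-mini"},
		}},
	}
}

func main() {
	s, err := Resolve(exampleRegistries(), "team-agent", "assistant")
	if err != nil {
		panic(err)
	}
	// The overlay wins for chat settings; the API key is inherited from the base layer.
	fmt.Println(s.APIType, s.Engine, s.APIKeys["openai-api-key"])
}
```

In this sketch the `assistant` profile ends up with the overlay's `api_type` and `engine`, while the API key comes from the stacked `provider-openai/default` layer.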
Engine profiles can optionally describe static model metadata under inference_settings.model_info.
model_info is profile/catalog data. It is not a per-request inference override. Use it for information that should travel with the selected model, such as model capabilities, context limits, and cost rates.
Example:
slug: provider-openai
profiles:
  default:
    slug: default
    inference_settings:
      chat:
        api_type: openai
        engine: gpt-4o-mini
      model_info:
        id: gpt-4o-mini
        name: GPT-4o Mini
        reasoning: false
        input:
          - text
          - image
        context_window: 128000
        quality_high_watermark: 128000
        max_output_tokens: 16384
        cost:
          input: 0.15
          output: 0.60
          cache_read: 0.075
          cache_write: 0.30
        metadata:
          family: gpt-4o
Field semantics:
- context_window is the hard model context limit.
- quality_high_watermark is the preferred planning limit when quality is known to degrade before the hard context limit. If omitted, consumers may treat context_window as both the quality and hard limit.
- cost values are USD per one million tokens. A missing cost means unknown; an all-zero cost means free/local.
- metadata is a JSON/YAML-compatible map[string]any for provider-specific fields.

Merge semantics follow normal profile stack rules: set fields in the overlay win, nil fields fall back to the base profile. metadata maps merge recursively. cost is replaced as a unit so partial overlays do not accidentally preserve stale base rates.
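The merge rules for model_info can be illustrated with a minimal sketch. The `ModelInfo` and `Cost` types, the `MergeModelInfo` and `mergeMaps` helpers, and the `limits`, `rpm`, and `tpm` metadata keys are all hypothetical; the real Geppetto types and field names may differ. The sketch only demonstrates the three stated behaviors: set overlay fields win, metadata maps merge recursively, and cost is replaced as a unit.

```go
package main

import "fmt"

// Cost holds USD rates per one million tokens (hypothetical simplified shape).
type Cost struct {
	Input, Output, CacheRead, CacheWrite float64
}

// ModelInfo is a hypothetical simplified model_info; nil pointers mean "not set".
type ModelInfo struct {
	ContextWindow *int
	Cost          *Cost
	Metadata      map[string]any
}

// mergeMaps merges overlay into a copy of base. When both sides hold a nested
// map under the same key, the nested maps are merged key by key.
func mergeMaps(base, overlay map[string]any) map[string]any {
	out := map[string]any{}
	for k, v := range base {
		out[k] = v
	}
	for k, v := range overlay {
		bm, bok := out[k].(map[string]any)
		om, ook := v.(map[string]any)
		if bok && ook {
			out[k] = mergeMaps(bm, om)
		} else {
			out[k] = v
		}
	}
	return out
}

// MergeModelInfo applies overlay on top of base.
func MergeModelInfo(base, overlay ModelInfo) ModelInfo {
	out := base
	if overlay.ContextWindow != nil {
		out.ContextWindow = overlay.ContextWindow
	}
	if overlay.Cost != nil {
		// Replace the whole struct: base rates never leak into the overlay's cost.
		c := *overlay.Cost
		out.Cost = &c
	}
	out.Metadata = mergeMaps(base.Metadata, overlay.Metadata)
	return out
}

func main() {
	window := 128000
	base := ModelInfo{
		ContextWindow: &window,
		Cost:          &Cost{Input: 0.15, Output: 0.60, CacheRead: 0.075, CacheWrite: 0.30},
		Metadata:      map[string]any{"family": "gpt-4o", "limits": map[string]any{"rpm": 500}},
	}
	overlay := ModelInfo{
		Cost:     &Cost{Input: 0.10, Output: 0.40}, // cache rates intentionally omitted
		Metadata: map[string]any{"limits": map[string]any{"tpm": 200000}},
	}
	m := MergeModelInfo(base, overlay)
	fmt.Println(*m.ContextWindow, m.Cost.Input, m.Cost.CacheRead, m.Metadata)
}
```

Here the merged cost keeps no cache rates from the base, because the overlay's cost replaced it entirely, while the nested `limits` map contains keys from both layers.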
Resolved metadata is available on ResolvedEngineProfile.InferenceSettings.ModelInfo, on JS resolved.modelInfo, and on JS engine objects built from resolved profiles.
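A consumer of the resolved metadata might apply the field semantics as follows. This is a sketch with hypothetical `ModelInfo`, `Cost`, `PlanningLimit`, and `EstimateUSD` names, not the actual Geppetto API. It assumes only what the semantics above state: quality_high_watermark falls back to context_window when omitted, cost rates are USD per one million tokens, and a missing cost means unknown.

```go
package main

import "fmt"

// Cost holds USD rates per one million tokens (hypothetical simplified shape).
type Cost struct {
	Input, Output float64
}

// ModelInfo is a hypothetical simplified view of resolved model_info.
type ModelInfo struct {
	ContextWindow        int
	QualityHighWatermark *int  // nil means "not set"
	Cost                 *Cost // nil means "unknown"
}

// PlanningLimit returns quality_high_watermark when it is set,
// otherwise it treats context_window as the planning limit too.
func PlanningLimit(m ModelInfo) int {
	if m.QualityHighWatermark != nil {
		return *m.QualityHighWatermark
	}
	return m.ContextWindow
}

// EstimateUSD returns the estimated cost and whether the cost is known.
// A nil Cost is reported as unknown; an all-zero Cost yields 0 and known=true.
func EstimateUSD(m ModelInfo, inputTokens, outputTokens int) (float64, bool) {
	if m.Cost == nil {
		return 0, false
	}
	const perMillion = 1_000_000.0
	usd := float64(inputTokens)/perMillion*m.Cost.Input +
		float64(outputTokens)/perMillion*m.Cost.Output
	return usd, true
}

func main() {
	m := ModelInfo{ContextWindow: 128000, Cost: &Cost{Input: 0.15, Output: 0.60}}
	fmt.Println("planning limit:", PlanningLimit(m))

	usd, known := EstimateUSD(m, 2_000_000, 1_000_000)
	fmt.Printf("estimated cost: %.2f USD (known=%v)\n", usd, known)
}
```

Returning a separate `known` flag keeps "unknown cost" distinct from "free/local", which matches the distinction the field semantics draw between a missing cost and an all-zero cost.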
ResolveEngineProfile(...) returns:
- registrySlug
- profileSlug
- inferenceSettings
- profile.registry, profile.slug, profile.version, and stack lineage

It no longer returns:

- effectiveRuntime
- runtimeKey
- runtimeFingerprint

Those were part of the older mixed runtime model and were removed in the hard cut.
The old model conflated two different domains: engine configuration and application runtime behavior.
That caused Geppetto to know too much about system prompts, middleware registries, tool allowlists, and app-level runtime metadata.
The current split is:
Geppetto
  engine profile registry
    -> InferenceSettings
    -> engine

App
  system prompt
  middlewares
  tools
  runtime metadata
    -> actual run behavior
Geppetto owns:

- engine profile registries
- InferenceSettings

Apps own:

- system prompt
- middlewares
- tools
- runtime metadata
Relevant files:
Go pseudocode:
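// Errors are discarded here for brevity; real code should check each one.
// Parse the raw registry source strings into structured source specs.
entries, _ := engineprofiles.ParseEngineProfileRegistrySourceEntries(rawSources)
specs, _ := engineprofiles.ParseRegistrySourceSpecs(entries)

// Build one chained registry over all sources and release it when done.
chain, _ := engineprofiles.NewChainedRegistryFromSourceSpecs(ctx, specs)
defer chain.Close()

// Resolve a single profile slug into its final InferenceSettings.
resolved, _ := chain.ResolveEngineProfile(ctx, engineprofiles.ResolveInput{
    EngineProfileSlug: engineprofiles.MustEngineProfileSlug("assistant"),
})

// Build the engine from the resolved settings.
eng, _ := enginefactory.NewEngineFromSettings(resolved.InferenceSettings)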
Then the app adds its own runtime behavior on top:
resolved engine profile -> engine
app runtime config -> prompt + middleware + tools
engine + app runtime -> session / runner
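The layering above can be sketched as a small composition. Everything here is hypothetical: `Engine`, `Middleware`, `AppRuntime`, `Runner`, and `NewRunner` are illustrative names, not Geppetto or Pinocchio types, and `echoEngine` merely stands in for an engine built from resolved InferenceSettings. The point is only that prompt, middleware, and tools are supplied by the app and wrapped around an engine that knows nothing about them.

```go
package main

import (
	"fmt"
	"strings"
)

// Engine is a hypothetical minimal engine interface.
type Engine interface {
	Complete(prompt string) string
}

// echoEngine stands in for an engine built from resolved InferenceSettings.
type echoEngine struct{ model string }

func (e echoEngine) Complete(prompt string) string {
	return fmt.Sprintf("[%s] %s", e.model, prompt)
}

// engineFunc adapts a plain function to the Engine interface.
type engineFunc func(string) string

func (f engineFunc) Complete(p string) string { return f(p) }

// Middleware wraps an Engine with extra, app-owned behavior.
type Middleware func(Engine) Engine

// AppRuntime holds everything the app owns; none of it comes from the profile.
type AppRuntime struct {
	SystemPrompt string
	Middlewares  []Middleware
	Tools        []string // tool names only; tool execution is out of scope here
}

// Runner combines an engine with the app runtime.
type Runner struct {
	engine Engine
	rt     AppRuntime
}

// NewRunner wraps the engine so that the first middleware is the outermost one.
func NewRunner(eng Engine, rt AppRuntime) Runner {
	for i := len(rt.Middlewares) - 1; i >= 0; i-- {
		eng = rt.Middlewares[i](eng)
	}
	return Runner{engine: eng, rt: rt}
}

// Run prepends the app-owned system prompt and calls the wrapped engine.
func (r Runner) Run(userInput string) string {
	return r.engine.Complete(r.rt.SystemPrompt + "\n" + userInput)
}

// upperCase is a toy middleware that upper-cases the engine output.
func upperCase(next Engine) Engine {
	return engineFunc(func(p string) string {
		return strings.ToUpper(next.Complete(p))
	})
}

func main() {
	eng := echoEngine{model: "gpt-5-mini"} // would come from the resolved profile
	rt := AppRuntime{
		SystemPrompt: "Be brief.",
		Middlewares:  []Middleware{upperCase},
		Tools:        []string{"search"},
	}
	fmt.Println(NewRunner(eng, rt).Run("hello"))
}
```

Swapping the engine profile changes only the `eng` value; the `AppRuntime` stays untouched, which is the separation the split is meant to enforce.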
This section explains the most important lifecycle distinction behind the profile system.
Geppetto profile resolution is not "the profile is the whole runtime." The profile is an overlay. The host application is expected to keep a baseline InferenceSettings object and merge the resolved profile on top of it.
Conceptually:
app-owned base InferenceSettings
+ resolved engine-profile InferenceSettings overlay
= final InferenceSettings used to build the engine
That is why profile docs should be read together with the bootstrap docs.
This distinction matters whenever a setting is cross-profile and should survive profile changes. Transport and operator settings such as ai-client.* belong in the app-owned baseline, not in engine profiles. Model-selection defaults such as chat.engine are much more natural as profile overlays.
Applications sometimes need a baseline even when a command does not visibly expose the whole Geppetto AI surface on its CLI.
The bootstrap helpers in geppetto/pkg/cli/bootstrap support that pattern by reconstructing a hidden base InferenceSettings from the shared Geppetto sections plus config, environment, and defaults. The important consequence is:
That is why the ownership boundary matters so much. If a field belongs in a shared section such as ai-client, it can participate naturally in the hidden base lifecycle.
Use this rule of thumb:
Examples that fit well in profiles:
- chat.engine
- chat.api_type
- model_info metadata such as model capabilities, context limits, and cost rates

Examples that fit better in the shared baseline:

- ai-client.timeout

See also:

- geppetto/pkg/doc/tutorials/09-migrating-cli-commands-to-glazed-bootstrap-profile-resolution.md for the bootstrap/base/final settings lifecycle
- pinocchio/pkg/doc/topics/pinocchio-profile-resolution-and-runtime-switching.md for the application-side runtime profile-switch pattern built on top of this Geppetto model