Step-by-step guide to migrate from the legacy inference lifecycle APIs to geppetto/pkg/inference/session.
Migrate code that previously relied on:

- `geppetto/pkg/inference/state` (`InferenceState`, `StartRun`/`FinishRun`, cancel plumbing),
- `geppetto/pkg/inference/core` (`Session`, `RunInferenceStarted`, lifecycle helpers),
- engine construction options (`engine.WithSink` / `engine.Option`),

to the new, unified API:

- `geppetto/pkg/inference/session.Session` (multi-turn state),
- `EngineBuilder` + `InferenceRunner` (blocking inference entrypoint),
- `ExecutionHandle` (cancel + wait for an in-flight inference),
- context-carried event sinks (`events.WithEventSinks`).

No backwards compatibility is assumed.

Old:

- sinks attached at engine construction (`engine.WithSink(sink)`)

New:

- sinks attached to the run context with `events.WithEventSinks(ctx, sinks...)`
- or configured on the builder via `enginebuilder.Builder.EventSinks`

Old:

- `InferenceState` mixed long-lived state (turn) with in-flight state (running + cancel)
- `core.Session` added additional lifecycle entrypoints (`RunInferenceStarted`) to support “start-but-not-yet-running” patterns

New:

- `Session` owns turn history and enforces “one active inference per session”
- `Session.StartInference(ctx)` starts the run and returns an `ExecutionHandle`
- `ExecutionHandle.Wait()` is the single “blocking join” point

New (only):

    eng, err := factory.NewEngineFromParsedValues(parsedValues)
    if err != nil {
        return err
    }
    runCtx := events.WithEventSinks(ctx, sink)
    _, err = eng.RunInference(runCtx, seed)

| Legacy concept | New concept |
|---|---|
| `engine.WithSink(sink)` | `events.WithEventSinks(ctx, sink)` (or `enginebuilder.Builder.EventSinks`) |
| `InferenceState` | `session.Session` |
| `StartRun`/`FinishRun` | `Session.IsRunning()` + `Session.StartInference()` |
| `SetCancel`/`HasCancel` | `ExecutionHandle.Cancel()` / `Session.CancelActive()` |
| `RunInferenceStarted` | not needed (start is immediate; wait is explicit) |

Search for:

- `engine.WithSink`
- `engine.Option`
- `InferenceState`
- `core.Session`
- `RunInferenceStarted`
- `StartRun` / `FinishRun`

Replace:

Previously you may have been passing “engine-config” options at construction time (e.g., to attach sinks). Those options are removed.
With:

    eng, err := factory.NewEngineFromParsedValues(parsedValues)
    if err != nil {
        return err
    }

Then attach sinks at runtime:

    runCtx := events.WithEventSinks(ctx, sink)
    _, err = eng.RunInference(runCtx, seed)

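Build a single runner that owns:

- the base engine
- middlewares (system prompt, logging, etc.)
- event sinks
- the tool registry and tool config (optional; the registry enables the tool loop)
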
Typical shape:

    base, err := factory.NewEngineFromParsedValues(parsedValues)
    if err != nil {
        return err
    }
    b := enginebuilder.New(
        enginebuilder.WithBase(base),
        enginebuilder.WithMiddlewares(/* system prompt, logging, etc */),
        enginebuilder.WithEventSinks(sink),
        enginebuilder.WithToolRegistry(registry), // optional: enables tool loop
        // enginebuilder.WithToolConfig(*toolCfg), // optional
    )

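The builder shape above follows Go's functional-options pattern. The sketch below shows how such options and a middleware chain can compose. It is illustrative only: `runFunc`, `middleware`, `builder`, `withBase`, `withMiddlewares`, and `build` are hypothetical names, and the ordering rule (first middleware listed is outermost) is an assumption rather than documented `enginebuilder` behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// runFunc stands in for a blocking inference call.
type runFunc func(prompt string) string

// middleware wraps a runFunc with extra behavior.
type middleware func(next runFunc) runFunc

// builder collects everything needed to produce a runner.
type builder struct {
	base        runFunc
	middlewares []middleware
}

// option mutates a builder during construction.
type option func(*builder)

func withBase(base runFunc) option {
	return func(b *builder) { b.base = base }
}

func withMiddlewares(mws ...middleware) option {
	return func(b *builder) { b.middlewares = append(b.middlewares, mws...) }
}

// newBuilder applies the options in the order given.
func newBuilder(opts ...option) *builder {
	b := &builder{}
	for _, opt := range opts {
		opt(b)
	}
	return b
}

// build wraps base so that the first middleware listed is the outermost one.
func (b *builder) build() runFunc {
	run := b.base
	for i := len(b.middlewares) - 1; i >= 0; i-- {
		run = b.middlewares[i](run)
	}
	return run
}

func main() {
	systemPrompt := func(next runFunc) runFunc {
		return func(p string) string { return next("[system] " + p) }
	}
	upper := func(next runFunc) runFunc {
		return func(p string) string { return strings.ToUpper(next(p)) }
	}
	run := newBuilder(
		withBase(func(p string) string { return "echo: " + p }),
		withMiddlewares(systemPrompt, upper),
	).build()
	fmt.Println(run("hi")) // ECHO: [SYSTEM] HI
}
```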
Old (conceptually):

- `InferenceState` held the conversation turn
- `InferenceState` also tracked the running flag and the cancel function

New:


    sess := &session.Session{
        SessionID: sessionID,
        Builder:   b,
    }
    sess.Append(seedTurn) // append the turn you want to run
    handle, err := sess.StartInference(ctx)
    if err != nil {
        return err
    }
    out, err := handle.Wait()

Notes:

- `Session.StartInference` runs asynchronously; `Wait()` blocks.
- Prefer `Session.AppendNewTurnFromUserPrompt(...)` instead of manually cloning `sess.Latest()`.

Replace “store cancel on state” with:


    _ = sess.CancelActive()
    // or if you hold the handle:
    handle.Cancel()

The chat.Backend should:

- hold a `session.Session` and start runs through it,
- return a `tea.Cmd` that blocks on `handle.Wait()` and emits `BackendFinishedMsg`.

Pseudo:

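
    func (b *EngineBackend) Start(ctx context.Context, prompt string) (tea.Cmd, error) {
        // Append the user prompt as a new turn on the session.
        _, err := b.sess.AppendNewTurnFromUserPrompt(prompt)
        if err != nil {
            return nil, err
        }
        // Start the run; this returns immediately with a handle.
        handle, err := b.sess.StartInference(ctx)
        if err != nil {
            return nil, err
        }
        // The returned command blocks until the run ends (result and error are ignored here).
        return func() tea.Msg {
            _, _ = handle.Wait()
            return chat.BackendFinishedMsg{}
        }, nil
    }
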
Replace StartRun()/FinishRun() with:

- check `sess.IsRunning()`
- append the prompt (`sess.AppendNewTurnFromUserPrompt(prompt)`)
- call `sess.StartInference(ctx)` and return immediately

After all callers are migrated:

- remove `geppetto/pkg/inference/core`
- remove `geppetto/pkg/inference/state`
- run `go test ./... -count=1` in `geppetto/` and downstream repos
- smoke-test the examples:
  - `geppetto/cmd/examples/streaming-inference`
  - `geppetto/cmd/examples/advanced/generic-tool-calling`
  - `geppetto/cmd/examples/advanced/openai-tools` (Responses thinking)