Step-by-step tutorial to build a Cobra command that streams model output and supports tool calling using Geppetto.
This tutorial explains how to build a Cobra command that performs streaming inference and supports tool calling using Geppetto. We follow the engine-first architecture: engines handle provider I/O and emit events, while toolloop.Loop orchestrates tool execution. The focus is on concepts, small runnable snippets, and the key APIs you will use, not a single large code dump. See the style guide for expectations around examples and structure: glaze help how-to-write-good-documentation-pages.
For foundational background, see:
Note: Provider engines learn about available tools from the tool registry attached to context.Context (see tools.WithRegistry). This tutorial shows that wiring in Step 6 (the tool loop also attaches the registry automatically).
factory.NewEngineFromParsedValues(parsed)
middleware.NewWatermillSink(publisher, topic)
events.WithEventSinks(ctx, sink) (attach sinks at runtime)
events.NewEventRouter()
events.StepPrinterFunc(prefix, w) or events.NewStructuredPrinter(w, options)
tools.NewInMemoryToolRegistry()
tools.NewToolFromFunc(name, description, func)
turns.NewSystemTextBlock(...) / turns.NewUserTextBlock(...)
toolloop.Loop.RunLoop(ctx, turn)

Create a command description with arguments and flags, including profile to load provider layers.
// inside NewStreamingCmd()
geppettoSections, err := geppettosections.CreateGeppettoSections()
if err != nil { return nil, err }

desc := cmds.NewCommandDescription(
	"stream-with-tools",
	cmds.WithShort("Streaming inference with tools"),
	cmds.WithArguments(
		fields.New("prompt", fields.TypeString, fields.WithHelp("Prompt")),
	),
	cmds.WithFlags(
		fields.New("profile", fields.TypeString, fields.WithDefault("4o-mini")),
		fields.New("output-format", fields.TypeString, fields.WithDefault("text")),
		fields.New("with-metadata", fields.TypeBool, fields.WithDefault(false)),
	),
	cmds.WithSections(geppettoSections...),
)
The event router moves tokens and tool events through Watermill. Attach a human-readable printer or a structured one.
router, err := events.NewEventRouter()
if err != nil { return err }
defer router.Close()

// Text printer for humans
router.AddHandler("chat", "chat", events.StepPrinterFunc("", os.Stdout))

// OR machine-readable output
// printer := events.NewStructuredPrinter(os.Stdout, events.PrinterOptions{
// 	Format: events.PrinterFormat("json"), IncludeMetadata: false,
// })
// router.AddHandler("chat", "chat", printer)

// The sink publishes engine events to the "chat" topic handled above
sink := middleware.NewWatermillSink(router.Publisher, "chat")
Important: events.NewEventRouter() defaults to Watermill’s in-memory gochannel with BlockPublishUntilSubscriberAck=true and no output buffering. For streaming UIs or high-rate handlers, prefer configuring the pub/sub explicitly as shown in glaze help geppetto-events-streaming-watermill (section “in-memory router defaults can block streaming”).
Why this matters: the sink ties your engine and helpers to the router so that tokens and tool activity can be streamed and printed as they happen.
Create the engine normally. Streaming events are emitted to the sinks attached to the runtime context.
eng, err := factory.NewEngineFromParsedValues(parsed)
if err != nil { return err }
Use the in-memory registry and define a simple tool. Providers that support built-in tool schemas can be configured with those definitions.
registry := tools.NewInMemoryToolRegistry()

// The tool's input and output shapes come from the function signature
getWeather, err := tools.NewToolFromFunc(
	"get_weather",
	"Get weather for a location",
	func(req struct{ Location, Units string }) struct{ Temperature float64 } {
		return struct{ Temperature float64 }{Temperature: 22.0}
	},
)
if err != nil { return err }
if err := registry.RegisterTool("get_weather", *getWeather); err != nil { return err }
Create a Turn with a system block, a user block, and tool config stored in Turn.Data.
seed := &turns.Turn{Data: map[turns.TurnDataKey]any{}}
seed.Data[turns.DataKeyToolConfig] = engine.ToolConfig{
	Enabled:          true,
	ToolChoice:       engine.ToolChoiceAuto,
	MaxParallelTools: 1,
}

turns.AppendBlock(seed, turns.NewSystemTextBlock(
	"You are a helpful assistant with access to tools.",
))
turns.AppendBlock(seed, turns.NewUserTextBlock(s.Prompt))
Run the router and the tool loop concurrently using errgroup. This pattern ensures proper coordination:
// Cancel the shared context once inference finishes; otherwise the router
// keeps running and eg.Wait() never returns on the success path
ctx, cancel := context.WithCancel(ctx)
defer cancel()
eg, groupCtx := errgroup.WithContext(ctx)

// Goroutine 1: Run the event router
// The router blocks until its context is cancelled
eg.Go(func() error { return router.Run(groupCtx) })

// Goroutine 2: Run inference after router is ready
eg.Go(func() error {
	defer cancel() // stop the router when inference is done

	// CRITICAL: Wait for router to be ready before publishing events
	<-router.Running()

	// Attach the sink to context so the engine can publish streaming events
	runCtx := events.WithEventSinks(groupCtx, sink)

	loop := toolloop.New(
		toolloop.WithEngine(eng),
		toolloop.WithRegistry(registry),
		toolloop.WithLoopConfig(toolloop.NewLoopConfig().WithMaxIterations(5)),
		toolloop.WithToolConfig(tools.DefaultToolConfig()),
	)

	// Run the tool-calling loop (inference → tool calls → tool execution → tool_use blocks → repeat)
	updated, err := loop.RunLoop(runCtx, seed)
	if err != nil { return err }

	// 'updated' now contains the full conversation:
	// [system] → [user] → [llm_text] → [tool_call] → [tool_use] → [llm_text (final)]
	_ = updated
	return nil
})

// Wait for both goroutines to complete
// If either fails, the other is cancelled via groupCtx
if err := eg.Wait(); err != nil { return err }
Why errgroup?
errgroup.WithContext creates a derived context that cancels when any goroutine fails
eg.Wait() returns the first error from any goroutine

Example output:
assistant: Thinking…
assistant: I will check the current temperature.
call: get_weather {"location":"Paris","units":"celsius"}
result: {"temperature":22}
assistant: It’s about 22°C in Paris right now.
eng.RunInference(ctx, seed) and skip the helper loop.
events.NewStructuredPrinter with json or yaml for machine-readable logs.
events.WithEventSinks(...) in the goroutine that runs inference.
"chat").
toolloop.NewLoopConfig().WithMaxIterations(n).

Working example programs:
geppetto/cmd/examples/streaming-inference/main.go
geppetto/cmd/examples/advanced/openai-tools/main.go
geppetto/cmd/examples/advanced/claude-tools/main.go
geppetto/cmd/examples/advanced/generic-tool-calling/main.go

If you need a full, copy-paste command, use the example apps above as a reference implementation and adapt the snippets here to your project structure.