Build a read-only scoped SQLite tool with `geppetto/pkg/inference/tools/scopeddb`, choose prebuilt or lazy registration, and keep queries safe.
This tutorial explains how to expose a scoped, read-only SQLite snapshot as a Geppetto tool using github.com/go-go-golems/geppetto/pkg/inference/tools/scopeddb.
The core idea is simple:
scopeddb wraps that SQLite database as a safe query toolThis pattern is useful when the model needs flexible read access to structured data, but you do not want to give it a live application database, write access, or unrestricted SQL.
Ordinary Geppetto tools are great when you already know the exact function shape: get_weather(location) or lookup_customer(id). They become awkward when the model needs exploratory access to a small, changing, structured dataset.
Examples:
You could build a separate tool for every query shape, but that scales badly. A scoped tool database gives the model one query tool backed by a temporary SQLite snapshot that only contains the data for the current scope.
That gives you:
SELECT and WITH queries are allowedThink of the system as a three-layer stack:
Application scope
account_id="northwind"
run_id="run-123"
transcript_id="tr-42"
|
v
Scoped snapshot builder
create SQLite schema
copy only rows for that scope
return DB handle + Meta
|
v
Geppetto tool
query_support_history(sql, params)
validates SQL
enforces readonly access
returns rows as JSON
Another way to say it:
scopeddb owns SQLite bootstrapping, query validation, and tool definitionThe main files live in pkg/inference/tools/scopeddb:
schema.go
DatasetSpecBuildInMemoryBuildFiletool.go
RegisterPrebuiltNewLazyRegistrarquery.go
QueryInputQueryOutputQueryOptionsQueryRunnerdescription.go
helpers.go
The package is intentionally small. Your application is expected to bring the domain-specific parts:
DatasetSpecDatasetSpec[Scope, Meta] is the central configuration object.
type DatasetSpec[Scope any, Meta any] struct {
InMemoryPrefix string
SchemaLabel string
SchemaSQL string
AllowedObjects []string
Tool ToolDefinitionSpec
DefaultQuery QueryOptions
Materialize func(ctx context.Context, dst *sql.DB, scope Scope) (Meta, error)
}
What each field means:
InMemoryPrefix
SchemaLabel
SchemaSQL
AllowedObjects
Tool
DefaultQuery
ORDER BY policyMaterialize
MetaMetaMeta exists for application-owned information produced while materializing the scoped database. scopeddb does not interpret it. Your code can use it for diagnostics, status messages, telemetry, or rendering.
Good Meta examples:
Bad Meta examples:
If the tool itself needs a table, put that information in SQLite, not in Meta.
The Tool field on DatasetSpec has this shape:
type ToolDefinitionSpec struct {
Name string
Description ToolDescription
Tags []string
Version string
}
type ToolDescription struct {
Summary string
StarterQueries []string
Notes []string
}
This is application-authored input used to build the final Geppetto tools.ToolDefinition.
Important separation:
ToolDescription is model-facing proseSchemaSQL is machine-consumed schemaQueryInput is the callable tool argument schemaDo not collapse these into one field. They serve different consumers.
The generated tool takes a small input schema:
type QueryInput struct {
SQL string `json:"sql"`
Params []string `json:"params,omitempty"`
}
And returns:
type QueryOutput struct {
Columns []string
Rows []map[string]any
Count int
Truncated bool
Error string
}
Why are Params strings instead of []any?
[]any shape[]string produces a provider-safe schema with params.items.type = "string"scopeddb is not "SQLite access for the model". It is a constrained query surface.
The runner in query.go enforces several rules:
SELECT or WITH queries are allowedAllowedObjectsRequireOrderBy forces deterministic result orderingThat means this is the intended SQL style:
SELECT ticket_id, opened_at, status
FROM tickets
WHERE account_id = ?
ORDER BY opened_at DESC
LIMIT 10
And these are intentionally rejected:
DELETE FROM ticketsSELECT * FROM tickets; SELECT * FROM commentsSELECT * FROM sqlite_masterThere are two main ways to expose a scoped tool.
Use RegisterPrebuilt when your application already built the scoped database and wants the tool to query that specific handle.
request starts
-> app resolves scope
-> app builds snapshot once
-> app registers tool against that DB
-> model queries it many times during this run
This is the best option when:
Meta before inference startsUse NewLazyRegistrar when you want the tool registration to be cheap and defer snapshot construction until the tool is actually called.
request starts
-> app registers lazy tool
-> model may or may not call it
-> on tool call:
resolve scope from context
build in-memory snapshot
run query
cleanup
This is the best option when:
Tradeoff:
NewLazyRegistrar rebuilds the in-memory DB for each tool callRegisterPrebuilt lets one snapshot support multiple SQL callsStart by defining a scope type and a metadata type.
type SupportScope struct {
AccountID string
}
type SupportMeta struct {
AccountID string
TicketCount int
CommentCount int
}
Next, define the dataset spec.
var supportHistorySpec = scopeddb.DatasetSpec[SupportScope, SupportMeta]{
InMemoryPrefix: "support_history",
SchemaLabel: "support history schema",
SchemaSQL: `
CREATE TABLE scope(
account_id TEXT PRIMARY KEY
);
CREATE TABLE tickets(
ticket_id TEXT PRIMARY KEY,
account_id TEXT NOT NULL,
subject TEXT NOT NULL,
status TEXT NOT NULL,
opened_at TEXT NOT NULL
);
CREATE TABLE comments(
comment_id TEXT PRIMARY KEY,
ticket_id TEXT NOT NULL,
author TEXT NOT NULL,
body TEXT NOT NULL,
created_at TEXT NOT NULL
);
CREATE VIEW latest_tickets AS
SELECT ticket_id, subject, status, opened_at
FROM tickets;
`,
AllowedObjects: []string{
"scope",
"tickets",
"comments",
"latest_tickets",
},
Tool: scopeddb.ToolDefinitionSpec{
Name: "query_support_history",
Description: scopeddb.ToolDescription{
Summary: "Query support ticket history for the currently selected account.",
StarterQueries: []string{
"SELECT ticket_id, subject, status FROM latest_tickets ORDER BY opened_at DESC LIMIT 10",
"SELECT author, body FROM comments WHERE ticket_id = ? ORDER BY created_at ASC",
},
Notes: []string{
"Use ? placeholders with params instead of inline literal values when filtering by ids.",
"Prefer ORDER BY for deterministic results.",
},
},
Tags: []string{"sqlite", "support", "scopeddb"},
Version: "v1",
},
DefaultQuery: scopeddb.QueryOptions{
MaxRows: 100,
MaxColumns: 32,
MaxCellChars: 1000,
Timeout: 5 * time.Second,
RequireOrderBy: true,
},
Materialize: func(ctx context.Context, dst *sql.DB, scope SupportScope) (SupportMeta, error) {
// 1. Copy only the rows for scope.AccountID
// 2. Insert them into the scoped tables
// 3. Return counts or other app-owned metadata
return SupportMeta{
AccountID: scope.AccountID,
TicketCount: 42,
CommentCount: 128,
}, nil
},
}
If you want one prebuilt scoped DB per request, use BuildInMemory:
scope := SupportScope{AccountID: "northwind"}
buildResult, err := scopeddb.BuildInMemory(ctx, supportHistorySpec, scope)
if err != nil {
return err
}
defer func() { _ = buildResult.Cleanup() }()
fmt.Printf("loaded %d tickets for %s\n", buildResult.Meta.TicketCount, buildResult.Meta.AccountID)
The return type is:
type BuildResult[Meta any] struct {
DB *sql.DB
Meta Meta
Cleanup func() error
}
That gives you three things:
DB
Meta
Cleanup
If you want to inspect the snapshot on disk during development or export it for debugging, use BuildFile instead:
buildResult, err := scopeddb.BuildFile(ctx, "/tmp/support-history.sqlite", supportHistorySpec, scope)
BuildFile is useful for:
It is usually not the default runtime path for request-scoped tools.
Once you have a built DB, register the tool into a Geppetto registry.
registry := tools.NewInMemoryToolRegistry()
if err := scopeddb.RegisterPrebuilt(
registry,
supportHistorySpec,
buildResult.DB,
supportHistorySpec.DefaultQuery,
); err != nil {
return err
}
What RegisterPrebuilt does:
QueryRunnerSummary, Notes, StarterQueries, and AllowedObjectsToolDefinitionspec.Tool.NameAfter that, the tool loop sees a normal Geppetto tool. scopeddb is an implementation detail.
If you want the DB to be constructed only when the tool is called, use NewLazyRegistrar.
type scopeKey struct{}
registrar := scopeddb.NewLazyRegistrar(
supportHistorySpec,
func(ctx context.Context) (SupportScope, error) {
scope, ok := ctx.Value(scopeKey{}).(SupportScope)
if !ok {
return SupportScope{}, fmt.Errorf("support scope missing from context")
}
return scope, nil
},
supportHistorySpec.DefaultQuery,
)
registry := tools.NewInMemoryToolRegistry()
if err := registrar(registry); err != nil {
return err
}
Important behavior:
QueryOutput.Error so the model sees a normal tool result payloadThis is the same basic pattern used for request-scoped runtime registrars in Geppetto-based applications.
From the model's point of view, a scoped DB tool is just another registered tool.
loop := toolloop.New(
toolloop.WithEngine(eng),
toolloop.WithRegistry(registry),
toolloop.WithLoopConfig(toolloop.NewLoopConfig().WithMaxIterations(5)),
toolloop.WithToolConfig(tools.DefaultToolConfig()),
)
updatedTurn, err := loop.RunLoop(ctx, seedTurn)
if err != nil {
return err
}
_ = updatedTurn
The flow looks like this:
user asks question
|
v
model emits tool_call(query_support_history, {sql, params})
|
v
scopeddb validates query and runs it
|
v
tool result returns rows/columns/count
|
v
model answers using the returned rows
For most applications, keep the responsibilities split like this:
Pseudocode:
// package supporthistory
var Spec = scopeddb.DatasetSpec[SupportScope, SupportMeta]{...}
func ResolveScope(ctx context.Context) (SupportScope, error) { ... }
func NewRegistrar() func(tools.ToolRegistry) error {
return scopeddb.NewLazyRegistrar(Spec, ResolveScope, Spec.DefaultQuery)
}
This keeps scopeddb generic and your domain logic local to your own package.
The final tool description shown to the model is not only the one-line summary. BuildDescription(...) composes:
SummaryNotesORDER BY hint when requiredStarterQueriesThat matters because provider tool descriptions are often the only place the model sees usage hints before generating SQL.
Good starter queries:
ORDER BY?Bad starter queries:
AllowedObjectsYou should test three layers separately.
Test that the schema opens and materializes correctly:
BuildInMemory(...) succeedsMeta is populated correctlyTest the runner directly:
SELECT workssqlite_master and disallowed tables are rejectedORDER BY enforcement works when enabledTest the registered tool definition:
The existing tests in pkg/inference/tools/scopeddb/query_test.go, pkg/inference/tools/scopeddb/schema_test.go, and pkg/inference/tools/scopeddb/tool_test.go are the best starting point.
func RegisterSupportHistoryTool(ctx context.Context, reg tools.ToolRegistry, accountID string) (scopeddb.BuildResult[SupportMeta], error) {
scope := SupportScope{AccountID: accountID}
buildResult, err := scopeddb.BuildInMemory(ctx, supportHistorySpec, scope)
if err != nil {
return scopeddb.BuildResult[SupportMeta]{}, err
}
if err := scopeddb.RegisterPrebuilt(reg, supportHistorySpec, buildResult.DB, supportHistorySpec.DefaultQuery); err != nil {
_ = buildResult.Cleanup()
return scopeddb.BuildResult[SupportMeta]{}, err
}
return *buildResult, nil
}
That is often the cleanest first implementation:
AllowedObjectsMeta when the model actually needs it in a tableCleanupORDER BY while also expecting stable row ordering| Symptom | Likely Cause | Fix |
|---|---|---|
array schema items is not an object from provider schema validation | Tool input used an unsupported array item schema | Use the current QueryInput shape with Params []string |
query references disallowed table/view | Missing entry in AllowedObjects or wrong table name in SQL | Add the table/view to AllowedObjects or fix the query/examples |
only SELECT queries are allowed | Model produced DELETE, INSERT, or another non-read query | Improve tool description and starter queries; keep validator strict |
| Tool keeps rebuilding the database | Using NewLazyRegistrar | Switch to RegisterPrebuilt if one request should share one snapshot |
| Model gives unstable answers for "latest" rows | No deterministic ordering | Set RequireOrderBy: true and include ordered starter queries |
| Large text fields make results noisy | Cell truncation too loose | Lower MaxCellChars in QueryOptions |
If you want a full runnable demo, see the Pinocchio example at pinocchio/cmd/examples/scopeddb-tui-demo.
That example shows:
BuildInMemory(...)Meta usage in the UIRegisterPrebuilt(...)