Token Count Modes

Choosing between local estimates and provider-native token counting


Overview

pinocchio tokens count now supports three counting modes through --count-mode:

estimate: local tokenizer estimate using the existing tiktoken-based path
api: provider-native token counting through Geppetto
auto: try provider-native counting first and fall back to a local estimate

Basic Examples

Local estimate:

pinocchio tokens count --count-mode estimate --model gpt-4o-mini prompt.txt

Profile-first provider count:

pinocchio tokens count \
  --count-mode api \
  --profile gpt-4o-mini \
  --profile-registries ~/.config/pinocchio/profiles.yaml \
  prompt.txt

OpenAI Responses API count with explicit flags:

pinocchio tokens count \
  --count-mode api \
  --model gpt-4o-mini \
  --ai-api-type openai-responses \
  --openai-api-key "$OPENAI_API_KEY" \
  prompt.txt

Anthropic count with explicit flags:

pinocchio tokens count \
  --count-mode api \
  --model claude-sonnet-4-20250514 \
  --ai-api-type claude \
  --claude-api-key "$ANTHROPIC_API_KEY" \
  prompt.txt

Automatic fallback:

pinocchio tokens count --count-mode auto --model gpt-4o-mini prompt.txt

How To Choose

Use estimate when you want a fast local answer and do not need provider-exact counts.
Use api when exact provider accounting matters. Prefer --profile plus --profile-registries for normal operator workflows.
Use auto when you prefer provider-exact counts but still want the command to succeed when credentials are missing or provider-native counting is unavailable.
Reserve the explicit provider flags (--ai-api-type, --openai-api-key, --claude-api-key) for ad-hoc debugging and CLI testing.
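
The guidance above can be condensed into a small decision helper. This is only a sketch: choose_count_mode and its two yes/no arguments are hypothetical names introduced for illustration and are not part of pinocchio. It only prints one of the three documented --count-mode values.

```shell
#!/bin/sh
# Hypothetical helper (not part of pinocchio): pick a --count-mode value.
#   $1 = "yes" if exact provider accounting matters
#   $2 = "yes" if the run must still succeed without credentials
choose_count_mode() {
  need_exact=$1
  must_succeed=$2
  if [ "$need_exact" != "yes" ]; then
    # A fast local answer is enough.
    echo estimate
  elif [ "$must_succeed" = "yes" ]; then
    # Prefer provider counts, but allow the local fallback.
    echo auto
  else
    # Provider-native counting only.
    echo api
  fi
}

# Example use, assuming pinocchio is on PATH and prompt.txt exists:
# pinocchio tokens count --count-mode "$(choose_count_mode yes yes)" --model gpt-4o-mini prompt.txt
```

When the helper prints api, the command still needs a profile or the explicit provider flags shown in Basic Examples.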

Output Shape

The command prints the requested mode and the actual count source so fallback behavior is explicit.

Estimate output includes the tokenizer codec used.
API output includes the provider and endpoint used.
Auto fallback output includes the provider error that triggered the local estimate.
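
One way to see the reported count source in practice is to run the same prompt through estimate and auto and read the two outputs one after the other. The sketch below makes several assumptions: compare_count_modes is a hypothetical wrapper, and PINOCCHIO is a hypothetical shell variable used only by this wrapper to substitute the binary for dry runs; pinocchio itself does not read it. The exact output format is not reproduced here.

```shell
#!/bin/sh
# Hypothetical wrapper: run one prompt through two count modes so the
# requested mode and the actual count source can be compared by eye.
#   $1 = prompt file
#   $2 = model name (defaults to gpt-4o-mini, as in the examples above)
# PINOCCHIO is an assumed override for the binary name, e.g. PINOCCHIO=echo
# to print the commands instead of running them.
compare_count_modes() {
  prompt_file=$1
  model=${2:-gpt-4o-mini}
  for mode in estimate auto; do
    echo "== --count-mode $mode =="
    "${PINOCCHIO:-pinocchio}" tokens count --count-mode "$mode" --model "$model" "$prompt_file"
  done
}

# Example use, assuming pinocchio is on PATH and prompt.txt exists:
# compare_count_modes prompt.txt
```

If auto falls back, its section should show the provider error that triggered the local estimate, as described above.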
