Choosing between local estimates and provider-native token counting
pinocchio tokens count now supports three counting modes through --count-mode:
estimate: local tokenizer estimate using the existing tiktoken-based path
api: provider-native token counting through Geppetto
auto: try provider-native counting first and fall back to a local estimate
Local estimate:
pinocchio tokens count --count-mode estimate --model gpt-4o-mini prompt.txt
Profile-first provider count:
pinocchio tokens count \
  --count-mode api \
  --profile gpt-4o-mini \
  --profile-registries ~/.config/pinocchio/profiles.yaml \
  prompt.txt
OpenAI Responses API count with explicit flags:
pinocchio tokens count \
  --count-mode api \
  --model gpt-4o-mini \
  --ai-api-type openai-responses \
  --openai-api-key "$OPENAI_API_KEY" \
  prompt.txt
Anthropic count with explicit flags:
pinocchio tokens count \
  --count-mode api \
  --model claude-sonnet-4-20250514 \
  --ai-api-type claude \
  --claude-api-key "$ANTHROPIC_API_KEY" \
  prompt.txt
Automatic fallback:
pinocchio tokens count --count-mode auto --model gpt-4o-mini prompt.txt
Use estimate when you want a fast local answer and do not need provider-exact counts.
Use api when exact provider accounting matters. Prefer --profile plus --profile-registries for normal operator workflows.
Use auto when you prefer provider-exact counts but still want the command to work when credentials are missing or provider-native counting is unavailable.
The command prints the requested mode and the actual count source, so fallback behavior is explicit.
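For scripted use, the choice between the three modes can be captured in a small wrapper. The following is only a sketch: pick_count_mode and its none/preferred/required argument are hypothetical names invented for illustration, not part of pinocchio. It simply maps how much you care about provider-exact counts onto one of the three --count-mode values described above.

```shell
# Hypothetical helper (not shipped with pinocchio): map how strongly
# provider-exact counts are needed onto a --count-mode value.
#   required  -> api       (exact provider accounting matters)
#   preferred -> auto      (exact if possible, local estimate otherwise)
#   anything else -> estimate (fast local answer is enough)
pick_count_mode() {
  case "${1:-}" in
    required)  echo api ;;
    preferred) echo auto ;;
    *)         echo estimate ;;
  esac
}
```

A caller could then pass the result straight to the flag, for example:
pinocchio tokens count --count-mode "$(pick_count_mode preferred)" --model gpt-4o-mini prompt.txt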