Performance Benchmarks

Comparing TOON against JSON, JSON Compact, YAML, CSV, and XML across major LLMs

Efficiency Ranking (Accuracy per 1K Tokens)

Overall performance balancing accuracy against token cost
  Rank  Format        Accuracy  Tokens  Efficiency Score
  1st   TOON          73.9%     2,744   26.9
  2nd   JSON Compact  70.7%     3,081   22.9
  3rd   YAML          69.0%     3,719   18.6
  4th   JSON          69.7%     4,545   15.3
  5th   XML           67.1%     5,167   13.0

Key Findings:

  • TOON achieves 73.9% accuracy vs. JSON's 69.7%
  • TOON uses 39.6% fewer tokens than JSON (2,744 vs. 4,545)
  • CSV is excluded from the ranking: it handles only flat, tabular data, so it supports just 109 of the 209 questions
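The headline token saving follows directly from the totals in the ranking above; a quick arithmetic sketch:

```python
# Reported total tokens across all test cases (from the ranking above).
toon_tokens = 2_744
json_tokens = 4_545

# Relative token saving of TOON vs. standard JSON.
saving = (json_tokens - toon_tokens) / json_tokens
print(f"TOON uses {saving:.1%} fewer tokens than JSON")  # 39.6%
```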

Per-Model Accuracy

Accuracy across 4 LLMs on 209 data retrieval questions (top three formats shown per model)

Claude Haiku 4.5
  TOON          59.8%  (125/209)
  JSON          57.4%  (120/209)
  YAML          56.0%  (117/209)

Gemini 2.5 Flash
  TOON          87.6%  (183/209)
  JSON Compact  82.3%  (172/209)
  YAML          79.4%  (166/209)

GPT-5 Nano
  TOON          90.9%  (190/209)
  JSON Compact  90.9%  (190/209)
  JSON          89.0%  (186/209)

Grok 4 Fast (Non-Reasoning)
  TOON          57.4%  (120/209)
  JSON          55.5%  (116/209)
  JSON Compact  54.5%  (114/209)

Methodology

Test Setup

  • 209 data retrieval questions across varied datasets
  • Tested on 4 major LLMs: Claude Haiku 4.5, Gemini 2.5 Flash, GPT-5 Nano, Grok 4 Fast
  • Formats compared: TOON, JSON, JSON Compact, YAML, XML, CSV
  • Tokens counted with a standard GPT tokenizer

Datasets

Tests included uniform arrays (user lists, product catalogs), nested objects (company hierarchies), mixed data types, and various levels of nesting depth.
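For intuition on where the token savings come from, here is a small uniform array rendered both ways. The TOON rendering below is an illustrative sketch of the format's tabular array style (one header declaring length and fields, then one comma-separated row per item), not verified output from the reference encoder:

```
JSON:
{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}

TOON (illustrative):
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
```

Because the field names appear once in the header rather than being repeated in every object, uniform arrays like the user lists and product catalogs in these tests compress the most.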

Metrics

  • Accuracy: Percentage of correctly answered questions
  • Token Count: Total tokens across all test cases
  • Efficiency Score: Accuracy per 1K tokens (higher is better)
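The efficiency scores in the ranking can be reproduced from the reported accuracy and token totals; a minimal sketch using the figures from the table above:

```python
# (format, accuracy %, total tokens) as reported in the efficiency ranking.
results = [
    ("TOON",         73.9, 2_744),
    ("JSON Compact", 70.7, 3_081),
    ("YAML",         69.0, 3_719),
    ("JSON",         69.7, 4_545),
    ("XML",          67.1, 5_167),
]

def efficiency(accuracy_pct: float, tokens: int) -> float:
    """Accuracy per 1K tokens (higher is better)."""
    return accuracy_pct / (tokens / 1000)

# Print formats in descending order of efficiency.
for fmt, acc, tok in sorted(results, key=lambda r: -efficiency(r[1], r[2])):
    print(f"{fmt:<12}  {efficiency(acc, tok):.1f}")
```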