Performance Benchmarks
Comparing TOON against JSON, YAML, CSV, and XML across major LLMs
Efficiency Ranking (Accuracy per 1K Tokens)
Overall performance balancing accuracy against token cost
Rank | Format       | Accuracy | Tokens | Efficiency
-----|--------------|----------|--------|-----------
1st  | TOON         | 73.9%    | 2,744  | 26.9
2nd  | JSON Compact | 70.7%    | 3,081  | 22.9
3rd  | YAML         | 69.0%    | 3,719  | 18.6
4th  | JSON         | 69.7%    | 4,545  | 15.3
5th  | XML          | 67.1%    | 5,167  | 13.0
Key Findings:
- TOON achieves 73.9% accuracy vs JSON's 69.7%
- TOON uses 39.6% fewer tokens than JSON (2,744 vs 4,545)
- CSV excluded from the ranking: it supports only 109 of the 209 questions, since CSV can represent flat data only
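The key findings above follow directly from the aggregate numbers in the ranking; a quick arithmetic check:

```python
# Sanity-check the key findings against the aggregate numbers in the ranking.
toon_tokens, json_tokens = 2744, 4545
savings = (1 - toon_tokens / json_tokens) * 100
print(round(savings, 1))  # -> 39.6 (% fewer tokens than JSON)

toon_acc, json_acc = 73.9, 69.7
print(round(toon_acc - json_acc, 1))  # -> 4.2 (percentage-point accuracy gain)
```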
Per-Model Accuracy
Accuracy across 4 LLMs on 209 data retrieval questions (three formats shown per model)
Claude Haiku 4.5
- TOON: 59.8% (125/209)
- JSON: 57.4% (120/209)
- YAML: 56.0% (117/209)

Gemini 2.5 Flash
- TOON: 87.6% (183/209)
- JSON Compact: 82.3% (172/209)
- YAML: 79.4% (166/209)

GPT-5 Nano
- TOON: 90.9% (190/209)
- JSON Compact: 90.9% (190/209)
- JSON: 89.0% (186/209)

Grok 4 Fast (Non-Reasoning)
- TOON: 57.4% (120/209)
- JSON: 55.5% (116/209)
- JSON Compact: 54.5% (114/209)
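The per-model TOON counts above pool into the headline 73.9% accuracy; since every model answered the same 209 questions, the pooled accuracy equals the per-model average:

```python
# Per-model TOON correct-answer counts from the results above (209 questions each).
toon_correct = {"Claude Haiku 4.5": 125, "Gemini 2.5 Flash": 183,
                "GPT-5 Nano": 190, "Grok 4 Fast": 120}
total = sum(toon_correct.values())   # 618 correct answers across all runs
overall = total / (4 * 209) * 100    # pooled accuracy
print(round(overall, 1))             # -> 73.9, matching the headline number
```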
Methodology
Test Setup
- 209 data retrieval questions across varied datasets
- Tested on 4 major LLMs: Claude, Gemini, GPT-5, Grok
- Formats compared: TOON, JSON, JSON Compact, YAML, XML, CSV
- Token counts measured with a standard GPT tokenizer
Datasets
Tests included uniform arrays (user lists, product catalogs), nested objects (company hierarchies), mixed data types, and various levels of nesting depth.
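To illustrate the distinction between the JSON and JSON Compact variants tested, here is a minimal sketch using made-up sample data (not one of the benchmark datasets), shaped like the uniform "user lists" described above:

```python
import json

# Hypothetical uniform array, similar in shape to the "user lists" datasets.
users = [{"id": 1, "name": "Alice", "role": "admin"},
         {"id": 2, "name": "Bob", "role": "viewer"}]

pretty = json.dumps(users, indent=2)                # "JSON": indented, whitespace-heavy
compact = json.dumps(users, separators=(",", ":"))  # "JSON Compact": no extra whitespace

# The compact serialization is strictly shorter, which is what drives
# its lower token count in the ranking.
assert len(compact) < len(pretty)
```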
Metrics
- Accuracy: Percentage of correctly answered questions
- Token Count: Total tokens across all test cases
- Efficiency Score: Accuracy per 1K tokens (higher is better)
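The efficiency scores in the ranking above can be reproduced from the accuracy and token figures:

```python
# Reproduce the efficiency scores (accuracy per 1K tokens) from the ranking table.
results = {
    "TOON":         (73.9, 2744),
    "JSON Compact": (70.7, 3081),
    "YAML":         (69.0, 3719),
    "JSON":         (69.7, 4545),
    "XML":          (67.1, 5167),
}
for fmt, (accuracy, tokens) in results.items():
    score = accuracy / tokens * 1000  # accuracy percentage points per 1K tokens
    print(f"{fmt}: {score:.1f}")      # -> 26.9, 22.9, 18.6, 15.3, 13.0
```

Note that YAML outranks plain JSON despite slightly lower accuracy (69.0% vs 69.7%) because it uses markedly fewer tokens.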