Performance Benchmarks

Comparing TOON against JSON, JSON Compact, YAML, CSV, and XML across major LLMs

Efficiency Ranking (Accuracy per 1K Tokens)

Overall performance balancing accuracy against token cost
  Rank  Format        Accuracy  Tokens  Efficiency Score
  1st   TOON          73.9%     2,744   26.9
  2nd   JSON Compact  70.7%     3,081   22.9
  3rd   YAML          69.0%     3,719   18.6
  4th   JSON          69.7%     4,545   15.3
  5th   XML           67.1%     5,167   13.0

Key Findings:

  • TOON achieves 73.9% accuracy vs. JSON's 69.7%
  • TOON uses 39.6% fewer tokens than JSON (2,744 vs. 4,545)
  • CSV is excluded from the ranking: it handles only flat, tabular data, so it supports just 109 of the 209 questions
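The headline token saving follows directly from the totals in the ranking above; a quick arithmetic sketch:

```python
# Reported total tokens across all test cases (from the ranking above).
toon_tokens = 2_744
json_tokens = 4_545

# Relative token saving of TOON vs. standard JSON.
saving = (json_tokens - toon_tokens) / json_tokens
print(f"TOON uses {saving:.1%} fewer tokens than JSON")  # 39.6%
```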

Per-Model Accuracy

Accuracy across 4 LLMs on 209 data retrieval questions (top three formats shown per model)

Claude Haiku 4.5
  TOON          59.8%  (125/209)
  JSON          57.4%  (120/209)
  YAML          56.0%  (117/209)

Gemini 2.5 Flash
  TOON          87.6%  (183/209)
  JSON Compact  82.3%  (172/209)
  YAML          79.4%  (166/209)

GPT-5 Nano
  TOON          90.9%  (190/209)
  JSON Compact  90.9%  (190/209)
  JSON          89.0%  (186/209)

Grok 4 Fast (Non-Reasoning)
  TOON          57.4%  (120/209)
  JSON          55.5%  (116/209)
  JSON Compact  54.5%  (114/209)

Methodology

Test Setup

  • 209 data retrieval questions across varied datasets
  • Tested on 4 major LLMs: Claude Haiku 4.5, Gemini 2.5 Flash, GPT-5 Nano, Grok 4 Fast
  • Formats compared: TOON, JSON, JSON Compact, YAML, XML, CSV
  • Tokens counted with a standard GPT tokenizer

Datasets

Tests included uniform arrays (user lists, product catalogs), nested objects (company hierarchies), mixed data types, and various levels of nesting depth.
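For intuition on where the token savings come from, here is a small uniform array rendered both ways. The TOON rendering below is an illustrative sketch of the format's tabular array style (one header declaring length and fields, then one comma-separated row per item), not verified output from the reference encoder:

```
JSON:
{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}

TOON (illustrative):
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
```

Because the field names appear once in the header rather than being repeated in every object, uniform arrays like the user lists and product catalogs in these tests compress the most.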

Metrics

  • Accuracy: Percentage of correctly answered questions
  • Token Count: Total tokens across all test cases
  • Efficiency Score: Accuracy per 1K tokens (higher is better)
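The efficiency scores in the ranking can be reproduced from the reported accuracy and token totals; a minimal sketch using the figures from the table above:

```python
# (format, accuracy %, total tokens) as reported in the efficiency ranking.
results = [
    ("TOON",         73.9, 2_744),
    ("JSON Compact", 70.7, 3_081),
    ("YAML",         69.0, 3_719),
    ("JSON",         69.7, 4_545),
    ("XML",          67.1, 5_167),
]

def efficiency(accuracy_pct: float, tokens: int) -> float:
    """Accuracy per 1K tokens (higher is better)."""
    return accuracy_pct / (tokens / 1000)

# Print formats in descending order of efficiency.
for fmt, acc, tok in sorted(results, key=lambda r: -efficiency(r[1], r[2])):
    print(f"{fmt:<12}  {efficiency(acc, tok):.1f}")
```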