JSON Format & Standards: Complete Guide

RFC 8259 Compliance, Syntax Rules & Production-Grade Fixing

What is JSON Format?

JSON (JavaScript Object Notation) is a lightweight data-interchange format defined by RFC 8259. It's human-readable, language-independent, and widely used for APIs, configuration files, and data storage. Understanding JSON format and standards is crucial for developers working with modern web applications.

This guide covers the JSON format, its syntax rules, common violations, and how to fix malformed JSON. It is aimed at developers who build APIs, work with data pipelines, or debug JSON parsing errors.

1. What "JSON Fixing" Actually Means

⚠️ Important Distinction

JSON Fixing is NOT schema validation or data correction. It fixes format, not business logic.

A JSON Fixer is a tool that:

  1. Detects syntactic violations of the JSON standard (RFC 8259)
  2. Detects structural inconsistencies like unclosed braces or brackets
  3. Applies minimal, deterministic corrections that preserve the intended meaning
  4. Outputs valid JSON without changing the data's semantic intent

Golden Rule of JSON Fixing

"Fix structure, preserve intent, never invent data"

2. Canonical JSON Rules (RFC 8259 - The Ground Truth)

A JSON fixer must enforce all of these rules to be RFC 8259 compliant. These are the absolute requirements that define valid JSON format.

2.1 Top-Level Structure Rules

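✅ A valid JSON text is exactly one JSON value:

  • Object: { ... }
  • Array: [ ... ]
  • String, number, true, false, or null (allowed by RFC 8259; the older RFC 4627 required an object or array at the top level, and some legacy consumers still expect one)

❌ Invalid (several top-level values in one document):

"hello"
123
true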

Fix Strategy:

  • If multiple roots → wrap in array: ["hello", 123, true]
  • If plain text → quote it: "hello"
  • If key:value without braces → wrap in object: {"key": "value"}
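
The "multiple roots" strategy can be sketched in Python. This is a minimal sketch, not a definitive implementation: it assumes every top-level value is already valid on its own, and wrap_multiple_roots is a hypothetical helper name. It reads one value at a time with the standard library's json.JSONDecoder.raw_decode.

```python
import json


def wrap_multiple_roots(text: str) -> str:
    """Return text unchanged if it holds one JSON value; otherwise
    wrap the sequence of top-level values in a single array."""
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        if text[pos].isspace():
            pos += 1
            continue
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    if not values:
        raise ValueError("input contains no JSON value")
    if len(values) == 1:
        return text
    # Several roots: re-serialize them as one array, in original order.
    return json.dumps(values)
```

Note that the multi-root branch re-serializes the values, so their original whitespace is not preserved.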

2.2 Object Rules

Valid Object Example:

{
  "key": "value"
}

Rules (all required):

  • Keys must be strings
  • Keys must be quoted with "
  • Colon between key and value
  • Comma between pairs
  • No trailing comma

1. Unquoted Keys

{ name: "John" }

✅ Fix:

{ "name": "John" }

2. Trailing Comma

{ "a": 1, }

✅ Fix: remove last comma

{ "a": 1 }

3. Missing Colon

{ "a" 1 }

✅ Fix: insert :

{ "a": 1 }

4. Duplicate Keys

{ "a": 1, "a": 2 }

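RFC 8259 says object names SHOULD be unique. Duplicates still parse, but parsers disagree on which value wins.

✅ Fix strategy (must choose one):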

  • Keep last value (most parsers): { "a": 2 }
  • OR flag as non-fixable ambiguity
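
Both options can be combined: keep the last value, but record every repeated key so it can be reported. The sketch below uses the object_pairs_hook parameter of Python's json.loads; load_with_duplicate_report is a hypothetical name chosen for illustration.

```python
import json


def load_with_duplicate_report(text: str):
    """Parse JSON, keeping the last value for repeated keys and
    collecting the repeated key names as warnings."""
    duplicates = []

    def keep_last(pairs):
        # pairs is the list of (key, value) tuples of one object,
        # in document order, including any repeated keys.
        obj = {}
        for key, value in pairs:
            if key in obj:
                duplicates.append(key)
            obj[key] = value
        return obj

    data = json.loads(text, object_pairs_hook=keep_last)
    return data, duplicates
```

A caller that prefers the "non-fixable" policy can simply reject the input whenever the returned list is not empty.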

2.3 Array Rules

Valid Array Example:

[1, 2, 3]

Trailing Comma

[1, 2, 3,]

✅ Fix: remove comma

[1, 2, 3]

Missing Comma

[1 2 3]

✅ Fix: infer comma between values

[1, 2, 3]
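
Trailing-comma removal has to ignore commas that sit inside strings. A minimal sketch, assuming the strings in the input are already well-formed (remove_trailing_commas is a hypothetical helper, and it works for objects as well as arrays):

```python
def remove_trailing_commas(text: str) -> str:
    """Drop a comma that directly precedes ] or }, outside strings."""
    out = []
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in "]}":
            # Look back past whitespace for a dangling comma.
            i = len(out) - 1
            while i >= 0 and out[i].isspace():
                i -= 1
            if i >= 0 and out[i] == ",":
                del out[i]
        out.append(ch)
    return "".join(out)
```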

2.4 String Rules (Most Error-Prone)

Valid String Example:

"Hello\nWorld"

Rules (all required):

  • Must use double quotes
  • Escape internal "
  • No raw newlines
  • Valid escape sequences only
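
The "no raw newlines" rule is one of the easier ones to repair mechanically. The following sketch escapes raw control characters that appear inside strings and leaves whitespace between tokens alone. It assumes quotes are otherwise balanced, and escape_raw_control_chars is a hypothetical name.

```python
def escape_raw_control_chars(text: str) -> str:
    """Escape raw control characters (U+0000 to U+001F) found inside
    JSON strings; text outside strings is copied unchanged."""
    shortcuts = {"\n": "\\n", "\r": "\\r", "\t": "\\t"}
    out = []
    in_string = False
    escaped = False
    for ch in text:
        if not in_string:
            if ch == '"':
                in_string = True
            out.append(ch)
        elif escaped:
            # Character after a backslash: copy as-is.
            escaped = False
            out.append(ch)
        elif ch == "\\":
            escaped = True
            out.append(ch)
        elif ch == '"':
            in_string = False
            out.append(ch)
        elif ord(ch) < 0x20:
            # Use the short escape when one exists, else \uXXXX.
            out.append(shortcuts.get(ch, f"\\u{ord(ch):04x}"))
        else:
            out.append(ch)
    return "".join(out)
```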

Escape Sequences Allowed:

\" \\ \/ \b \f \n \r \t \uXXXX

Single Quotes

'hello'

✅ Fix:

"hello"

Unescaped Quotes

"She said "hi""

✅ Fix:

"She said \"hi\""

Raw Newline

"hello
world"

✅ Fix:

"hello\nworld"

2.5 Number Rules

Valid Numbers:

-12
3.14
1e5

| Invalid | Fix |
| --- | --- |
| 01 (leading zero) | 1 |
| 1. (trailing dot) | 1.0 |
| NaN, Infinity | null |
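
The number grammar in RFC 8259 is small enough to express as one regular expression. The sketch below checks a token against it and applies the three fixes listed above. The helper names are hypothetical, and anything outside those three cases is reported rather than guessed.

```python
import re

# RFC 8259 number: optional minus, integer part without leading
# zeros, optional fraction, optional exponent.
JSON_NUMBER = re.compile(
    r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?"
)


def is_json_number(token: str) -> bool:
    return JSON_NUMBER.fullmatch(token) is not None


def fix_number(token: str) -> str:
    """Repair leading zeros, a trailing dot, or NaN/Infinity."""
    if is_json_number(token):
        return token
    if token in ("NaN", "Infinity", "-Infinity"):
        return "null"
    sign, body = ("-", token[1:]) if token.startswith("-") else ("", token)
    if body.endswith("."):
        body += "0"                 # 1.  -> 1.0
    stripped = body.lstrip("0")     # 01  -> 1
    if not stripped or not stripped[0].isdigit():
        stripped = "0" + stripped   # keep one digit before the dot
    candidate = sign + stripped
    if not is_json_number(candidate):
        raise ValueError(f"cannot repair number token: {token!r}")
    return candidate
```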

2.6 Boolean & Null Rules

✅ Only allowed (case-sensitive):

true
false
null

❌ Invalid:

True
FALSE
None

✅ Fix:

true
false
null
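
Literal normalization must not touch text inside strings. One way to sketch this in Python is a single regular expression that matches either a complete string or a bare word, and rewrites only the bare words. normalize_literals is a hypothetical name, and the sketch assumes strings are already properly quoted.

```python
import re

# Match a whole JSON string first so its contents are never rewritten,
# otherwise match a bare literal in any letter case.
_STRING_OR_LITERAL = re.compile(
    r'"(?:\\.|[^"\\])*"|\b(?:true|false|null|none)\b',
    re.IGNORECASE,
)
_CANONICAL = {"true": "true", "false": "false", "null": "null", "none": "null"}


def normalize_literals(text: str) -> str:
    def replace(match):
        token = match.group(0)
        if token.startswith('"'):
            return token  # a string: leave untouched
        return _CANONICAL[token.lower()]

    return _STRING_OR_LITERAL.sub(replace, text)
```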

3. Structural Integrity Rules

3.1 Balanced Tokens

Every opening token must have a corresponding closing token:

  • { must have }
  • [ must have ]
  • " must be closed

Fix Strategy:

  • Track stack of openings
  • Auto-insert missing closers at logical boundary
  • Prefer closing at end of structure
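
The stack-based strategy can be sketched as follows. This minimal version only appends missing closers at the end of the input and raises on a mismatched closer instead of guessing. close_open_structures is a hypothetical name.

```python
PAIRS = {"{": "}", "[": "]"}


def close_open_structures(text: str) -> str:
    """Append the closers for any structures still open at end of input."""
    stack = []          # closers we still expect, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in PAIRS:
            stack.append(PAIRS[ch])
        elif ch in "}]":
            if not stack or stack.pop() != ch:
                raise ValueError(f"mismatched closer {ch!r}")
    suffix = '"' if in_string else ""   # close an unterminated string
    return text + suffix + "".join(reversed(stack))
```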

3.2 Ordering Rules

JSON does not require any ordering of object members (array elements are ordered), but fixers should:

  • ✅ Preserve original order
  • ✅ Never reorder unless explicitly configured

4. Parsing-Based Detection Logic (Core Algorithm)

Step 1: Tokenize

Recognize all JSON tokens:

{ } [ ] , : " string number true false null
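
A tolerant tokenizer accepts malformed input and labels it instead of failing. The sketch below is one possible shape, not a reference implementation: the Token type and the kind names (PUNCT, STRING, NUMBER, WORD, UNKNOWN) are assumptions made for illustration.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str
    position: int  # character offset in the input


TOKEN_RE = re.compile(
    r"""
    (?P<WS>\s+)
  | (?P<PUNCT>[{}\[\],:])
  | (?P<STRING>"(?:\\.|[^"\\])*")
  | (?P<NUMBER>-?[0-9]+(?:\.[0-9]*)?(?:[eE][+-]?[0-9]+)?)
  | (?P<WORD>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<UNKNOWN>.)
    """,
    re.VERBOSE | re.DOTALL,
)


def lex(text: str) -> Iterator[Token]:
    """Yield tokens, never raising: bare words (true, None, unquoted
    keys) come out as WORD, anything unrecognized as UNKNOWN."""
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup
        if kind != "WS":
            yield Token(kind, match.group(), match.start())
```

The NUMBER pattern is deliberately looser than RFC 8259 (it accepts 01 and 1.) so that invalid numbers reach the parser as a single token and can be classified there.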

Step 2: Stateful Parse

Maintain parsing state:

  • Stack of objects/arrays
  • Current expected token
  • String escape state

Step 3: Error Classification

Each error must be classified as one of:

  • Missing token
  • Extra token
  • Invalid token
  • Ambiguous intent

5. Logical Fixing Rules (Decision Tree)

5.1 Safe Auto-Fix (Always Fix)

These fixes are deterministic and safe to apply automatically:

  • ✅ Trailing commas
  • ✅ Unquoted keys
  • ✅ Single quotes
  • ✅ Missing closing braces
  • ✅ Invalid booleans/null
  • ✅ Escaping strings

5.2 Heuristic Fix (Fix with Assumptions)

These fixes require assumptions and should be applied carefully:

  • ⚠️ Missing commas
  • ⚠️ Missing colons
  • ⚠️ Root-level fragments

Rule:

Apply minimal insertion that restores validity

5.3 Non-Fixable (Must Report)

These errors cannot be automatically fixed:

Conflicting structures:

{ "a": [1, 2 }

Semantic ambiguity:

{ "a" "b" "c" }

Fixer must stop and explain, not guess.

6. Error Classification Model

Every issue must map to one category. Here's a production-grade error classification system:

| Code | Category | Description |
| --- | --- | --- |
| E001 | Unclosed structure | Missing closing brace or bracket |
| E002 | Trailing comma | Comma before closing bracket/brace |
| E003 | Missing comma | No comma between values |
| E004 | Missing colon | No colon between key and value |
| E005 | Invalid string | Unescaped quotes, raw newlines |
| E006 | Invalid number | NaN, Infinity, leading zeros |
| E007 | Invalid literal | True, FALSE, None instead of true, false, null |
| E008 | Unquoted key | Object key without quotes |
| E009 | Extra token | Unexpected token in context |
| E010 | Root violation | Invalid top-level structure |

Error Object Structure:

{
  "code": "E003",
  "position": 124,
  "expected": ",",
  "found": "STRING",
  "context": "ARRAY"
}
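
In Python, the error object and the code table could be modeled like this. It is a sketch under assumptions: FixerError is a hypothetical class name, and treating position as a character offset is a guess, since the structure above does not say what it measures.

```python
import json
from dataclasses import asdict, dataclass

# Codes and categories copied from the classification table.
CATEGORIES = {
    "E001": "Unclosed structure",
    "E002": "Trailing comma",
    "E003": "Missing comma",
    "E004": "Missing colon",
    "E005": "Invalid string",
    "E006": "Invalid number",
    "E007": "Invalid literal",
    "E008": "Unquoted key",
    "E009": "Extra token",
    "E010": "Root violation",
}


@dataclass(frozen=True)
class FixerError:
    code: str
    position: int   # assumed: character offset into the input
    expected: str
    found: str
    context: str

    def __post_init__(self):
        # Every issue must map to exactly one known category.
        if self.code not in CATEGORIES:
            raise ValueError(f"unknown error code: {self.code}")

    @property
    def category(self) -> str:
        return CATEGORIES[self.code]

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)
```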

7. Production-Grade Algorithm Design

High-Level Architecture

┌─────────────┐
│  Raw Input  │
└──────┬──────┘
       ▼
┌─────────────┐
│    Lexer    │ ← tolerant tokenizer
└──────┬──────┘
       ▼
┌─────────────┐
│ Recovering  │ ← state machine + stack
│   Parser    │
└──────┬──────┘
       ▼
┌─────────────┐
│ Error Model │ ← classified issues
└──────┬──────┘
       ▼
┌─────────────┐
│ Fix Engine  │ ← rule-based mutations
└──────┬──────┘
       ▼
┌─────────────┐
│  Validator  │ ← strict RFC 8259
└──────┬──────┘
       ▼
┌─────────────┐
│ Output JSON │
└─────────────┘

Key Principle:

The fixer never edits blindly. All fixes come from parser-detected expectations.

Fix Engine Execution Order

Fixes must be applied in this specific order:

  1. String normalization
  2. Literal normalization
  3. Structural closure
  4. Comma/colon insertion
  5. Trailing comma removal
  6. Root correction

Why this order?

Early fixes change token boundaries; late fixes assume stable structure.

Core Pseudocode

tokens = lex(input)
parser = new RecoveringParser()
errors = []

for token in tokens:
    expected = parser.expected()
    
    if token violates expected:
        error = classify(token, expected)
        errors.append(error)
        
        if fixable(error):
            applyFix(token, error)
        else:
            abort(error)
    
    parser.consume(token)

if parser.stack not empty:
    closeStructures(parser.stack)

output = serialize(parser.ast)

assert strictParse(output)
return output

8. Hard Edge Cases (Handled Explicitly)

Case 1: Mixed Structures

{ "a": [1, 2 }

❌ Abort — ambiguous closure

The fixer cannot tell whether a ] is missing before the } or the } was typed in place of a ]. This requires manual intervention.

Case 2: Duplicate Keys

{ "a": 1, "a": 2 }

✔ Keep last, emit warning

Most JSON parsers keep the last value for duplicate keys. The fixer should do the same but warn the user.

Case 3: Fragmented Root

"a": 1,
"b": 2

✔ Wrap in { }

When properties exist without an object wrapper, automatically wrap them in an object.
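
A conservative way to sketch this wrap in Python is to try it only when the input fails to parse, and to accept it only if the wrapped text parses. wrap_fragment_in_object is a hypothetical name, and the sketch assumes the fragment has no other syntax errors.

```python
import json


def wrap_fragment_in_object(text: str) -> str:
    """Wrap bare "key": value pairs in braces when that makes the
    input valid; return already-valid input unchanged."""
    try:
        json.loads(text)
        return text
    except json.JSONDecodeError:
        pass
    # Drop surrounding whitespace and one dangling comma, then wrap.
    candidate = "{" + text.strip().rstrip(",") + "}"
    json.loads(candidate)  # raises JSONDecodeError if wrapping did not help
    return candidate
```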

9. JSON Best Practices

Fixing Policy (Non-Negotiable Rules)

  • ✅ Never reorder keys
  • ✅ Never invent keys or values
  • ✅ Never change numeric magnitude
  • ✅ Never coerce types unless invalid
  • ✅ Never "fix" schema violations

Validation After Fix

After applying fixes, always:

  1. Re-parse entire output
  2. Ensure zero syntax errors
  3. Ensure no fixer-introduced invalid JSON
  4. Optional: pretty-print or minify
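
The re-parse step can be sketched with Python's standard json module. One caveat: json.loads accepts NaN, Infinity, and -Infinity by default, which RFC 8259 does not allow, so the sketch rejects them through the parse_constant hook. is_strictly_valid is a hypothetical name.

```python
import json


def _reject_constant(name: str):
    # Called by json.loads for NaN, Infinity and -Infinity.
    raise ValueError(f"{name} is not valid JSON")


def is_strictly_valid(text: str) -> bool:
    """Return True if the fixer output re-parses without errors."""
    try:
        json.loads(text, parse_constant=_reject_constant)
    except ValueError:  # JSONDecodeError is a subclass of ValueError
        return False
    return True
```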

Try Our JSON Fixer Tool

Use our free online JSON Fixer to automatically detect and fix JSON syntax errors. It follows RFC 8259 standards and provides detailed error reports.
