Hidden JSON Errors That Silently Break Your App — Duplicate Keys, BOM, Precision Loss, and More
Most JSON bugs announce themselves: a red SyntaxError, a failed parse, a crash you cannot miss. But some JSON errors are far more dangerous — they succeed silently, corrupt your data without a single exception, and manifest as mysterious production bugs weeks later. This guide covers the six hidden JSON error categories that slip past your parser, your tests, and your code review, and gives you the patterns to detect and prevent them.
Silent
duplicate key loss — later value wins, no error thrown
3 bytes
UTF-8 BOM — an invisible character that breaks most parsers
2^53-1
max safe integer — beyond this, JSON numbers silently round
0 errors
thrown for most of these bugs — duplicate keys, precision loss, and floating-point drift all pass JSON.parse() without complaint
Duplicate Keys — Silent Data Loss
The JSON specification says keys in an object should be unique (RFC 8259 §4 uses SHOULD, not MUST). Duplicate keys are therefore technically allowed at the format level but strongly discouraged. Different parsers handle duplicates differently — and almost none of them warn you.
Every parser behaves differently — and none of them warn you
JavaScript's JSON.parse() silently keeps the last value for a duplicate key. Python's json.loads() also keeps the last value. Some Go and Java parsers keep the first. RFC 8259 calls the resulting behaviour unpredictable. In practice, a duplicate key means silent data loss in virtually every standard parser.
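You can confirm the JavaScript behaviour in one line in any console:
JSON.parse('{"a": 1, "a": 2}'); // → { a: 2 } (the first value is gone, no warning)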
Duplicate keys — last value wins, first value lost silently
// ❌ Duplicate keys — silent data loss in every standard parser
{
"status": "active",
"role": "admin",
"status": "inactive",
"email": "user@example.com",
"role": "user"
}
// After JSON.parse() in JavaScript:
// { status: "inactive", role: "user", email: "user@example.com" }
// The first "status": "active" and first "role": "admin" are gone — silently
Unique keys + pre-parse duplicate detection
// ✅ Unique keys — no ambiguity
{
"status": "inactive",
"role": "user",
"email": "user@example.com"
}
// To detect duplicate keys before parsing:
function findDuplicateKeys(json) {
const keyCount = {};
const keyRe = /"((?:[^"\\]|\\.)*)"\s*:/g; // note: naive; counts same-named keys across all nesting levels
let match;
while ((match = keyRe.exec(json)) !== null) {
const key = match[1];
keyCount[key] = (keyCount[key] || 0) + 1;
}
return Object.entries(keyCount)
.filter(([, count]) => count > 1)
.map(([key]) => key);
}
findDuplicateKeys(brokenJson);
// → ["status", "role"] — caught before any silent loss
// Full duplicate key validator — works before parsing
function validateNoDuplicateKeys(jsonText) {
const errors = [];
const keyRe = /"((?:[^"\\]|\\.)*)"\s*:/g;
const occurrences = new Map();
let match;
while ((match = keyRe.exec(jsonText)) !== null) {
const key = match[1];
const line = (jsonText.slice(0, match.index).match(/\n/g) || []).length + 1;
if (!occurrences.has(key)) {
occurrences.set(key, []);
}
occurrences.get(key).push(line);
}
for (const [key, lines] of occurrences) {
if (lines.length > 1) {
errors.push({
key,
lines,
message: `Key "${key}" appears ${lines.length}× on lines: ${lines.join(', ')}. All values except the last will be silently discarded.`,
});
}
}
return errors;
}
// Usage:
const dupes = validateNoDuplicateKeys(jsonString);
if (dupes.length > 0) {
console.warn('Duplicate keys detected:', dupes);
// Handle: throw, warn, or show to the user
}
# Python alternative — detect with a custom object_pairs_hook:
import json
def detect_duplicates(pairs):
keys = [k for k, _ in pairs]
dupes = {k for k in keys if keys.count(k) > 1}
if dupes:
raise ValueError(f"Duplicate JSON keys found: {dupes}")
return dict(pairs)
json.loads(json_string, object_pairs_hook=detect_duplicates)
# → raises ValueError if any key appears more than once
UTF-8 BOM — Three Invisible Bytes That Break Everything
The UTF-8 Byte Order Mark (BOM) is an invisible character (U+FEFF) placed at the very start of some text files to signal that they are UTF-8 encoded. It was inherited from UTF-16, where byte order genuinely matters. In UTF-8 it is meaningless — but it is still written by some editors, particularly on Windows.
The BOM is invisible in most editors and terminals
You cannot see the BOM by looking at the file content. It renders as nothing in most text editors and terminals. But JSON parsers see it as a stray first character before the opening {, which immediately causes a SyntaxError (Unexpected token). RFC 8259 §8.1 forbids writing a BOM at the beginning of a JSON document; parsers may choose to tolerate one, but many do not.
JSON with invisible BOM — crashes every parser
// ❌ File starts with a BOM (U+FEFF — invisible, 3 bytes: EF BB BF)
{"name": "Alice", "age": 30}
// JSON.parse() throws:
// SyntaxError: Unexpected token in JSON at position 0 (the token is the invisible BOM)
// How to detect a BOM:
const hasBom = text.charCodeAt(0) === 0xFEFF;
console.log(hasBom); // → true
Strip BOM before parsing
// ✅ Strip BOM before parsing
function parseJsonSafe(text) {
// Remove BOM if present
const clean = text.startsWith('\uFEFF') ? text.slice(1) : text;
return JSON.parse(clean);
}
// Or using charCodeAt:
function stripBom(text) {
return text.charCodeAt(0) === 0xFEFF ? text.slice(1) : text;
}
// Node.js — reading files
const fs = require('fs');
const raw = fs.readFileSync('data.json', 'utf8');
const data = JSON.parse(stripBom(raw));
# Python — strip BOM from file
with open('data.json', encoding='utf-8-sig') as f: # utf-8-sig strips BOM automatically
data = json.load(f)
// Editor fix: In VS Code, bottom-right click "UTF-8 with BOM" →
// "Save with Encoding" → "UTF-8" to save without BOM
// Robust JSON file loader — handles BOM, encoding, empty files
async function loadJsonFile(filePath) {
const { readFile } = await import('fs/promises');
let raw;
try {
raw = await readFile(filePath, 'utf8');
} catch (e) {
throw new Error(`Cannot read file ${filePath}: ${e.message}`);
}
if (!raw.trim()) {
throw new Error(`File ${filePath} is empty`);
}
// Detect and strip BOM
const hasBom = raw.charCodeAt(0) === 0xFEFF;
if (hasBom) {
console.warn(`Warning: ${filePath} has a UTF-8 BOM — stripping before parse`);
raw = raw.slice(1);
}
try {
return JSON.parse(raw);
} catch (e) {
throw new Error(`JSON parse error in ${filePath}: ${e.message}`);
}
}
// Detecting BOM in HTTP response:
async function fetchJsonRobust(url) {
const response = await fetch(url);
if (!response.ok) throw new Error(`HTTP ${response.status}`);
const text = await response.text();
const clean = text.charCodeAt(0) === 0xFEFF ? text.slice(1) : text;
return JSON.parse(clean);
}
Number Precision Loss — When Large IDs Silently Change
JSON numbers have no size limit in the specification — a JSON number can be arbitrarily large. But parsers do have limits. JavaScript uses IEEE 754 double-precision floating point for all numbers, which can represent integers exactly only up to Number.MAX_SAFE_INTEGER = 9007199254740991 (2⁵³ − 1). Numbers beyond this silently round to the nearest representable value.
This is the #1 cause of mysterious ID mismatches in production
Database systems like PostgreSQL, MySQL, and MongoDB routinely use 64-bit integers for primary keys. Twitter (now X), for example, uses Snowflake IDs that exceed Number.MAX_SAFE_INTEGER. When these IDs are serialized as JSON numbers and parsed by JavaScript, they silently round to a different value. The parsed ID no longer matches the database record. No error is thrown.
Large integer as JSON number — silently rounds
// ❌ Large integer ID loses precision in JSON parsing
// Server returns:
{ "id": 9007199254740993, "name": "Tweet" }
// After JSON.parse() in JavaScript:
const tweet = JSON.parse('{"id": 9007199254740993, "name": "Tweet"}');
console.log(tweet.id); // → 9007199254740992 ← WRONG! Lost precision silently
console.log(String(tweet.id) === "9007199254740993"); // → false — the digits changed
// (beware: tweet.id === 9007199254740993 evaluates to true, because the literal rounds the same way!)
// This causes:
// - API calls to the wrong resource
// - Database lookups that fail silently
// - Incorrect equality checks
// - Log entries with wrong IDs
Large integer as JSON string — exact precision
// ✅ Store large integers as strings in JSON
// Server returns:
{ "id": "9007199254740993", "name": "Tweet" }
// Now precision is preserved:
const tweet = JSON.parse('{"id": "9007199254740993", "name": "Tweet"}');
console.log(tweet.id); // → "9007199254740993" ✅ exact
// When you need to do math:
const id = BigInt(tweet.id); // → 9007199254740993n (exact)
// Or use a JSON reviver to auto-convert:
function parseWithBigIntIds(json) {
return JSON.parse(json, (key, value) => {
// Convert any string that looks like a large integer to BigInt
if (typeof value === 'string' && /^\d{16,}$/.test(value)) {
return BigInt(value);
}
return value;
});
}
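// Caveat (sketch, using only standard JSON.stringify behaviour): JSON.stringify
// throws a TypeError on BigInt values, so convert them back to strings with a
// replacer before re-serializing:
function stringifyWithBigInt(obj) {
  return JSON.stringify(obj, (key, value) =>
    typeof value === 'bigint' ? value.toString() : value
  );
}
// stringifyWithBigInt({ id: 9007199254740993n }) → '{"id":"9007199254740993"}'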
// Backend: always return large IDs as strings
// Express/Node.js:
res.json({ id: String(user.id), name: user.name });
// Detect if a JSON number string will lose precision after parsing
function detectPrecisionLoss(jsonText) {
const numberRe = /:\s*(-?\d{16,}(?:\.\d+)?)/g;
const warnings = [];
let match;
while ((match = numberRe.exec(jsonText)) !== null) {
const numStr = match[1];
const parsed = parseFloat(numStr);
const reparsed = String(parsed);
if (reparsed !== numStr && !numStr.includes('.')) {
const line = (jsonText.slice(0, match.index).match(/\n/g) || []).length + 1;
warnings.push({
original: numStr,
parsed: reparsed,
line,
message: `Number ${numStr} will parse as ${reparsed} — precision loss!`,
});
}
}
return warnings;
}
// Check API responses before using them:
const json = await response.text();
const precisionWarnings = detectPrecisionLoss(json);
if (precisionWarnings.length > 0) {
console.warn('Precision loss detected in JSON:', precisionWarnings);
}
// Node.js: use a streaming parser that supports BigInt
// npm install json-bigint
import JSONBig from 'json-bigint';
const JSONBigNative = JSONBig({ useNativeBigInt: true }); // default mode returns BigNumber objects
const parsed = JSONBigNative.parse(json); // large integers become native BigInt
Control Characters — Invisible Corruption in Strings
JSON strings cannot contain unescaped control characters — characters with Unicode code points U+0000 through U+001F, including the null byte (\u0000), bell (\u0007), and backspace (\u0008). Their escaped forms (\u0000 through \u001F) are perfectly valid JSON, however: parsers accept them silently and include the control characters in the parsed string — leading to data that looks normal when printed but contains invisible characters.
Where they come from
User input pasted from rich-text editors, clipboard data from productivity apps, form fields in mobile browsers, copy-pasted content from PDFs or Office documents. The characters are invisible in UIs but present in the underlying string.
Why they cause problems
Databases may reject strings with null bytes (PostgreSQL throws "invalid byte sequence for encoding UTF8: 0x00"). ElasticSearch indexing fails. CSV exports include them. String comparisons fail. The text looks identical when displayed but is not equal.
RFC 8259 §7 says
All characters from U+0000 to U+001F MUST be escaped, so a well-formed JSON string can never contain a raw control character — it must be written as \u0000 through \u001F. Strict parsers enforce this and reject raw control characters; some lenient parsers accept them anyway, violating the spec. Escaped control characters pass every parser and land in your data, as the short demo after this list shows.
The null byte is the worst
The null byte (\u0000) is treated as a string terminator in C, C++, and many C-based libraries. A JSON string containing a null byte will be silently truncated when passed to native code, security systems, or databases written in C.
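A minimal console demo of the distinction: escaped control characters pass and survive into your data, while raw ones are rejected by strict parsers.
// Escaped control characters are valid JSON and survive into the parsed string:
JSON.parse('{"note": "line1\\u0000line2"}').note.length; // → 11 (includes the null byte)
// Raw (unescaped) control characters are rejected by strict parsers:
JSON.parse('{"note": "line1\x00line2"}');
// SyntaxError: Bad control character in string literal in JSON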
// Detect control characters in JSON string values
function findControlChars(jsonText) {
const controlRe = /[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/g;
const issues = [];
let match;
while ((match = controlRe.exec(jsonText)) !== null) {
const charCode = match[0].charCodeAt(0);
const line = (jsonText.slice(0, match.index).match(/\n/g) || []).length + 1;
issues.push({
char: `\\u${charCode.toString(16).padStart(4, '0')}`,
code: charCode,
line,
context: jsonText.slice(Math.max(0, match.index - 20), match.index + 20),
});
}
return issues;
}
// Sanitize string values after parsing
function sanitizeControlChars(value) {
if (typeof value === 'string') {
// Remove null bytes (most dangerous), escape other control chars
return value
.replace(/\x00/g, '') // strip null bytes
.replace(/[\x01-\x08\x0B\x0C\x0E-\x1F\x7F]/g, ''); // strip other controls
}
if (Array.isArray(value)) return value.map(sanitizeControlChars);
if (typeof value === 'object' && value !== null) {
return Object.fromEntries(
Object.entries(value).map(([k, v]) => [k, sanitizeControlChars(v)])
);
}
return value;
}
// Full safe parse with control character sanitization:
function safeParseJson(jsonText) {
const controls = findControlChars(jsonText);
if (controls.length > 0) {
console.warn('Control characters in JSON:', controls);
}
const parsed = JSON.parse(jsonText); // throws here if raw control chars remain; escaped ones pass through
return sanitizeControlChars(parsed); // then sanitize
}
// For user input — sanitize BEFORE stringifying:
function sanitizeForJson(userInput) {
if (typeof userInput !== 'string') return userInput;
return userInput
.replace(/\x00/g, '')
.replace(/[\x01-\x1F\x7F]/g, ' '); // replace with space instead of removing
}
const safePayload = JSON.stringify({
name: sanitizeForJson(req.body.name),
bio: sanitizeForJson(req.body.bio),
});
Deep Nesting — Stack Overflows That Look Like JSON Errors
The JSON specification places no limit on nesting depth. But every parser implementation has a practical recursion limit. JavaScript engines typically allow several hundred to a few thousand levels of nesting before triggering a stack overflow — which manifests as a RangeError: Maximum call stack size exceeded, not a SyntaxError. Deeply nested JSON from third-party APIs can trigger this unexpectedly.
Used as a Denial-of-Service vector — billion laughs for JSON
Maliciously crafted deeply nested JSON — sometimes called a “JSON bomb” — can exhaust a parser's stack or memory. A structure just 10 levels deep where each level fans out into 10 elements already contains 10 billion leaf nodes. Always limit nesting depth (and payload size) for user-submitted JSON.
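A two-line stress test makes the failure mode visible (illustrative; the exact depth at which an engine gives up varies with its stack size):
// Deep nesting exhausts the parser's call stack, not its grammar:
const bomb = '['.repeat(100000) + ']'.repeat(100000);
JSON.parse(bomb); // RangeError: Maximum call stack size exceeded (V8/Node.js)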
// Measure maximum nesting depth before parsing
function measureNestingDepth(jsonText) {
let depth = 0, maxDepth = 0;
let inString = false;
for (let i = 0; i < jsonText.length; i++) {
const char = jsonText[i];
if (char === '\\' && inString) { i++; continue; }
if (char === '"') { inString = !inString; continue; }
if (inString) continue;
if (char === '{' || char === '[') { depth++; maxDepth = Math.max(maxDepth, depth); }
else if (char === '}' || char === ']') depth--;
}
return maxDepth;
}
// Safe parse with depth limit
const MAX_NESTING_DEPTH = 50; // reasonable limit for API data
function safeParseWithDepthLimit(jsonText, maxDepth = MAX_NESTING_DEPTH) {
const depth = measureNestingDepth(jsonText);
if (depth > maxDepth) {
throw new Error(`JSON nesting depth ${depth} exceeds limit of ${maxDepth}. Possible JSON bomb.`);
}
return JSON.parse(jsonText);
}
// In an Express API — validate user-submitted JSON before it is parsed,
// using express.json's verify hook to inspect the raw request body:
app.use(express.json({
  limit: '10mb', // size limit
  verify: (req, res, buf) => {
    if (measureNestingDepth(buf.toString('utf8')) > MAX_NESTING_DEPTH) {
      throw new Error('JSON nesting too deep'); // Express responds with 403 by default
    }
  },
}));
Floating Point Representation — When 0.1 + 0.2 Is Stored in JSON
JSON numbers are parsed as IEEE 754 double-precision floats by most standard parsers (Python keeps integers exact but still parses decimals as floats). This means the classic JavaScript puzzle — 0.1 + 0.2 === 0.30000000000000004 — is not just a JavaScript problem. If you generate JSON from floating-point arithmetic and then parse it, you may get slightly different values than expected. Monetary amounts, scientific measurements, and geographic coordinates are all at risk.
Raw floating point arithmetic stored in JSON
// ❌ Storing floating point arithmetic results directly in JSON
const tax = 0.1;
const price = 2.2;
const total = price + tax; // → 2.3000000000000003
const json = JSON.stringify({ price, tax, total });
// → '{"price":2.2,"tax":0.1,"total":2.3000000000000003}'
// The total stored in JSON is NOT 2.3 — it will always be slightly off
Round, use integer cents, or store as strings
// ✅ Option 1: Round to meaningful precision before stringify
const total = parseFloat((price + tax).toFixed(2)); // → 2.3
JSON.stringify({ price: 2.2, tax: 0.1, total }); // → '{"price":2.2,"tax":0.1,"total":2.3}'
// ✅ Option 2: Store monetary values as integers (cents)
// Never use floats for money — store as integer cents
const priceCents = 220; // $2.20
const taxCents = 10; // $0.10
const totalCents = 230; // $2.30
JSON.stringify({ priceCents, taxCents, totalCents });
// All integers — no precision loss possible
// ✅ Option 3: Use Decimal.js for financial calculations
import Decimal from 'decimal.js';
const total = new Decimal('2.2').plus('0.1'); // exact: 2.3
JSON.stringify({ total: total.toString() }); // store as string: "2.3"
// ✅ Option 4: Round coordinates to meaningful precision
const lat = 37.7749295; // 7 decimal places = ~1cm precision (more than enough)
JSON.stringify({ lat: parseFloat(lat.toFixed(7)) });
Character Encoding — Beyond ASCII
JSON must be Unicode
RFC 8259 §8.1 requires JSON exchanged between systems to be encoded in UTF-8; UTF-16 and UTF-32 are permitted only within closed ecosystems and are rarely used. If a JSON file is saved in Latin-1, Windows-1252, or another non-UTF-8 encoding, the result is a corrupted string when decoded as UTF-8 — not an error, just wrong characters.
The replacement character U+FFFD
When a decoder encounters invalid UTF-8 byte sequences, it typically substitutes the Unicode replacement character (U+FFFD, which displays as "�"). Your parse succeeds but the data contains garbage characters — the demo after this list shows one appearing. Always validate encoding before parsing.
HTML entities in JSON strings
"name": "Alice &amp; Bob" — HTML entities like &amp;, &lt;, and &gt; are not JSON escapes. Decode them before storing in JSON, or store the raw character (&) and encode when rendering to HTML. Storing HTML entities in JSON creates double-encoding bugs.
Emoji and supplementary characters
Emoji above U+FFFF (🎉 U+1F389) are stored as UTF-16 surrogate pairs in JavaScript strings. JSON.stringify() handles them correctly. But older or non-standard parsers may mishandle surrogate pairs. Always test with emoji if your API accepts user-generated content.
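Two quick demos of these failure modes, assuming Node.js (Buffer is Node-only):
// 1) Bytes saved as Latin-1 but decoded as UTF-8 still parse, with garbage inside:
const bytes = Buffer.from('{"name": "Café"}', 'latin1');
console.log(JSON.parse(bytes.toString('utf8')).name); // → "Caf�" (no error thrown)
// 2) Emoji survive a stringify/parse round-trip, but .length counts UTF-16 code units:
const { msg } = JSON.parse(JSON.stringify({ msg: '🎉' }));
console.log(msg.length, [...msg].length); // → 2 1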
// Detect encoding issues after parse — look for replacement characters
function validateEncoding(parsedObject) {
const issues = [];
function scan(value, path) {
if (typeof value === 'string') {
if (value.includes('\uFFFD')) {
issues.push({ path, value: value.slice(0, 100), issue: 'Contains replacement character (encoding error)' });
}
// Check for HTML entities that should have been decoded
if (/&(?:amp|lt|gt|quot|apos);/.test(value)) {
issues.push({ path, issue: 'Contains HTML entities — may be double-encoded' });
}
} else if (Array.isArray(value)) {
value.forEach((item, i) => scan(item, `${path}[${i}]`));
} else if (typeof value === 'object' && value !== null) {
Object.entries(value).forEach(([k, v]) => scan(v, path ? `${path}.${k}` : k));
}
}
scan(parsedObject, '');
return issues;
}
// Node.js — force UTF-8 when reading files
import { readFileSync } from 'fs';
const content = readFileSync('data.json', { encoding: 'utf8' });
// If the file might be in a different encoding, use iconv-lite:
import iconv from 'iconv-lite';
const raw = readFileSync('legacy-data.json'); // Buffer
const utf8 = iconv.decode(raw, 'win1252'); // convert from Windows-1252 to UTF-8
const data = JSON.parse(utf8);
The Complete Hidden Error Checklist
Scan for duplicate keys before parsing
Run a duplicate key scan on any JSON from external sources, AI models, or merged configurations. Use the findDuplicateKeys() function from this guide, or our AI JSON Error Explainer at unblockdevs.com/json-error-explainer, which detects duplicates automatically.
Strip BOM from file reads and HTTP responses
Wrap all JSON reads with a stripBom() function. In Python, use encoding="utf-8-sig" when opening files. In Node.js, check text.charCodeAt(0) === 0xFEFF before parsing. This eliminates a common Windows file editing issue.
Store large integers as strings
Any integer ID or value that could exceed 2^53-1 (9007199254740991) must be stored as a JSON string, not a JSON number. Coordinate this between your backend and frontend. Use json-bigint on the Node.js side for automatic handling.
Sanitize control characters in user input
Before storing user-submitted strings in JSON, strip null bytes (\u0000) and other control characters (\u0001-\u001F). These pass the JSON parser but corrupt databases, break CSV exports, and cause security issues in downstream consumers.
Round floating point values to meaningful precision
Never store raw floating point arithmetic results in JSON for monetary or precision-sensitive data. Use integer cents for money, round coordinates to 7 decimal places, and use Decimal.js for financial calculations.
Validate encoding when reading external JSON files
Files from external sources may use Windows-1252, Latin-1, or other encodings. Use iconv-lite (Node.js) or chardet (Python) to detect and convert encoding before parsing. Check for \uFFFD replacement characters in parsed strings.
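To tie the checklist together, here is a sketch of a pre-flight check assembled from the helpers defined earlier in this guide; it assumes findDuplicateKeys, detectPrecisionLoss, findControlChars, and measureNestingDepth are in scope as written above.
// Pre-flight report for untrusted JSON text, built from this guide's helpers
function preflightJson(text, maxDepth = 50) {
  const clean = text.charCodeAt(0) === 0xFEFF ? text.slice(1) : text; // strip BOM
  return {
    hadBom: clean !== text,
    duplicateKeys: findDuplicateKeys(clean),        // silent data loss
    precisionWarnings: detectPrecisionLoss(clean),  // 2^53 rounding
    controlChars: findControlChars(clean),          // invisible corruption
    tooDeep: measureNestingDepth(clean) > maxDepth, // JSON bombs
  };
}
// Usage:
// const report = preflightJson(rawText);
// if (report.duplicateKeys.length || report.tooDeep) { /* reject or warn */ }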
Use our AI JSON Error Explainer for comprehensive analysis
🔍 AI JSON Error Explainer — Free Tool
Paste any JSON and instantly detect duplicate keys, BOM characters, control characters, trailing commas, Python literals, and 10 more error types — with plain-English explanations, RFC spec references, and one-click auto-fix. No upload, no account, 100% browser-based.
Check My JSON Now →