MySQL JSON Complete Guide — Storing, Querying, and Indexing JSON Data
MySQL 5.7+ has native JSON support with dedicated storage, validation, and a rich set of functions for accessing and modifying JSON data. MySQL 8.0 added multi-valued indexes for JSON arrays, making searches performant. This complete guide covers creating JSON columns, CRUD operations, path expressions, searching arrays, using generated columns for indexing, and when to use JSON versus normalized tables.
MySQL 5.7+
native JSON column type with automatic validation
JSON_EXTRACT
(or -> operator) to access deeply nested values
MySQL 8.0
multi-valued indexes for JSON arrays — enables indexed searches
Generated columns
index specific JSON fields for maximum query performance
JSON Column Basics — Creating and Inserting
MySQL's JSON column type stores JSON in an optimized binary format rather than plain text. This allows faster access to individual values without parsing the full string on every query. MySQL validates JSON on insert and rejects malformed JSON with an error — preventing data corruption.
Binary JSON storage
MySQL's JSON column type stores JSON in an optimized binary format, not plain text. This allows faster access to individual values without parsing the full JSON on every read. MySQL validates JSON on insert and rejects invalid JSON with an error — you cannot store malformed JSON accidentally. JSON columns also auto-normalize whitespace (pretty-printing is stripped on storage).
-- Create table with JSON column
CREATE TABLE products (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(255) NOT NULL,
metadata JSON, -- flexible product attributes
tags JSON, -- array of tag strings
pricing JSON -- nested pricing object with currency keys
);
-- Insert JSON data — both string literals and JSON_OBJECT/JSON_ARRAY functions work
INSERT INTO products (name, metadata, tags, pricing) VALUES
('Laptop Pro',
'{"brand": "Dell", "ram": 16, "storage": 512, "color": "black"}',
'["electronics", "computers", "sale"]',
'{"USD": 1299.99, "EUR": 1199.99, "tiers": {"wholesale": 999, "retail": 1299.99}}'
);
-- Using JSON_OBJECT and JSON_ARRAY functions for type safety
INSERT INTO products (name, metadata, tags) VALUES (
'Wireless Mouse',
JSON_OBJECT('brand', 'Logitech', 'wireless', true, 'battery_life_days', 90),
JSON_ARRAY('electronics', 'accessories', 'peripherals')
);
-- MySQL validates JSON on insert — this raises an error:
-- INSERT INTO products (metadata) VALUES ('invalid json {]');
-- ERROR 3140: Invalid JSON text: "Invalid value." at position 14 in value for 'metadata'Reading JSON Values — Extraction Operators
-- JSON_EXTRACT: access nested values using path expressions
SELECT
name,
JSON_EXTRACT(metadata, '$.brand') AS brand, -- returns: "Dell" (with quotes)
metadata->>'$.brand' AS brand_clean, -- ->> removes quotes: Dell
metadata->'$.ram' AS ram_gb, -- returns: 16 (number)
pricing->>'$.USD' AS price_usd, -- returns: 1299.99
pricing->>'$.tiers.wholesale' AS wholesale_price -- nested object access
FROM products;
-- Access array elements by index (0-based)
SELECT
name,
tags->'$[0]' AS first_tag, -- first element
tags->'$[last]' AS last_tag, -- last element (MySQL 8.0+)
tags->'$[1 to 2]' AS middle_tags, -- range (MySQL 8.0+)
JSON_LENGTH(tags) AS tag_count -- number of elements
FROM products;
-- Filter rows by JSON field values
SELECT * FROM products WHERE metadata->>'$.brand' = 'Dell';
-- Compare numeric JSON values (cast required for comparisons)
SELECT * FROM products WHERE CAST(metadata->>'$.ram' AS UNSIGNED) >= 16;
SELECT * FROM products WHERE JSON_EXTRACT(pricing, '$.USD') > 1000;
-- Extract multiple values at once with JSON_EXTRACT and comma-separated paths
SELECT name, JSON_EXTRACT(metadata, '$.brand', '$.ram') AS brand_and_ram
FROM products;
-- Returns: ["Dell", 16]Modifying JSON Data
-- JSON_SET: insert OR update a field (most commonly used)
UPDATE products
SET metadata = JSON_SET(metadata, '$.color', 'silver')
WHERE id = 1;
-- Set multiple fields at once
UPDATE products
SET metadata = JSON_SET(
metadata,
'$.color', 'silver', -- update existing field
'$.weight_kg', 1.8, -- add new field
'$.updated_at', NOW() -- add timestamp
)
WHERE id = 1;
-- JSON_INSERT: only adds if the key doesn't exist (won't overwrite)
UPDATE products
SET metadata = JSON_INSERT(metadata, '$.discount_pct', 10)
WHERE id = 1;
-- JSON_REPLACE: only updates existing keys (won't add new ones)
UPDATE products
SET metadata = JSON_REPLACE(metadata, '$.color', 'blue')
WHERE id = 1;
-- Remove a JSON field
UPDATE products
SET metadata = JSON_REMOVE(metadata, '$.color')
WHERE id = 1;
-- Append to JSON array
UPDATE products
SET tags = JSON_ARRAY_APPEND(tags, '$', 'featured')
WHERE id = 1;
-- Prepend: JSON_ARRAY_INSERT(tags, '$[0]', 'new-tag')
-- Increment a numeric JSON field safely
UPDATE products
SET metadata = JSON_SET(
metadata,
'$.views',
COALESCE(CAST(metadata->>'$.views' AS UNSIGNED), 0) + 1
)
WHERE id = 1;Searching JSON Arrays
-- Find products with a specific tag (string must be quoted in JSON)
SELECT * FROM products WHERE JSON_CONTAINS(tags, '"electronics"');
-- Find products with ANY of these tags (MySQL 8.0.17+)
SELECT * FROM products WHERE JSON_OVERLAPS(tags, '["sale", "clearance", "featured"]');
-- Find products that contain ALL specified tags
SELECT * FROM products
WHERE JSON_CONTAINS(tags, '"electronics"')
AND JSON_CONTAINS(tags, '"sale"');
-- Count elements
SELECT * FROM products WHERE JSON_LENGTH(tags) >= 3;
-- Search within nested object — match partial object
SELECT * FROM products
WHERE JSON_CONTAINS(metadata, '{"brand": "Dell"}');
-- JSON_SEARCH: find the path to a value within JSON
SELECT
name,
JSON_SEARCH(tags, 'one', 'electronics') AS path_of_tag
FROM products;
-- Returns: "$[0]" if 'electronics' is first, NULL if not found
-- 'one' = return first match; 'all' = return all matches as array
-- Multi-valued index for JSON array searches (MySQL 8.0+)
-- Allows JSON_CONTAINS to use an index instead of full table scan
ALTER TABLE products
ADD INDEX idx_tags ((CAST(tags AS CHAR(100) ARRAY)));
-- Verify the index is being used
EXPLAIN SELECT * FROM products WHERE JSON_CONTAINS(tags, '"electronics"');
-- Should show: key: idx_tagsGenerated Columns — Index Specific JSON Fields
Generated columns extract specific JSON values into regular columns, enabling standard B-tree indexes on those values. This is the best pattern for JSON fields you query frequently.
-- Add generated column for a JSON field you frequently filter/sort by
ALTER TABLE products
ADD COLUMN brand VARCHAR(100)
GENERATED ALWAYS AS (metadata->>'$.brand') STORED;
-- Now create a regular index on the generated column
CREATE INDEX idx_brand ON products(brand);
-- Queries using brand now use the index (not a full table scan)
EXPLAIN SELECT * FROM products WHERE brand = 'Dell';
-- type: ref, key: idx_brand — O(log n) lookup
-- Generated column for numeric JSON field with type casting
ALTER TABLE products
ADD COLUMN ram_gb INT UNSIGNED
GENERATED ALWAYS AS (CAST(metadata->>'$.ram' AS UNSIGNED)) STORED;
CREATE INDEX idx_ram ON products(ram_gb);
SELECT * FROM products WHERE ram_gb >= 16 ORDER BY ram_gb;
-- Uses idx_ram — fast sorted access
-- Composite index for multiple JSON-derived fields
ALTER TABLE products
ADD COLUMN price_usd DECIMAL(10,2)
GENERATED ALWAYS AS (CAST(pricing->>'$.USD' AS DECIMAL(10,2))) STORED;
CREATE INDEX idx_brand_price ON products(brand, price_usd);
-- Supports: WHERE brand = 'Dell' AND price_usd < 1500Use generated columns for frequently queried JSON fields
VIRTUAL vs STORED generated columns
VIRTUAL columns are computed on read — no storage overhead but cannot be indexed directly. STORED columns are computed on write and physically stored — use more disk space but can be indexed with standard B-tree indexes. Use STORED for any generated column you want to index.
JSON column performance tips
Avoid SELECT * with large JSON columns — transfer cost is high. Select only the JSON fields you need: SELECT name, metadata->>'$.brand' AS brand. For writes, update specific JSON paths with JSON_SET rather than replacing the entire JSON object to avoid invalidating caches.
JSON vs TEXT/VARCHAR for JSON storage
Always use the JSON type, not TEXT or VARCHAR for JSON. The JSON type: validates syntax on insert, stores in optimized binary format, supports path operators (->, ->>), enables multi-valued indexes. TEXT/VARCHAR stores raw string, requires parsing on every access, and provides no query functions.
NULL vs JSON null
SQL NULL (column has no value) is different from JSON null (the literal null value in JSON). metadata IS NULL checks SQL null. JSON_EXTRACT(metadata, '$.status') = CAST('null' AS JSON) checks JSON null. Be explicit about which you mean in queries and application code.