JSON Schema Generator Tutorial — Validate Any JSON Structure
JSON Schema lets you define the exact structure, types, and constraints for your JSON data. Once defined, you can validate any JSON against it automatically — catching missing required fields, wrong types, invalid formats, and out-of-range values before they cause bugs. This tutorial covers writing schemas from scratch, all key keywords with examples, validating with Ajv in JavaScript, auto-generating schemas from sample data in Python, and common patterns for API contracts, form validation, and OpenAPI integration.
Draft 2020-12: the latest JSON Schema specification
Ajv: a widely used JSON Schema validator for JavaScript
Auto-generate: tools that create a schema from sample JSON
OpenAPI 3.1: fully compatible with JSON Schema Draft 2020-12
JSON Schema Basics — What It Is and Why It Matters
What JSON Schema does
A JSON Schema is a JSON document that describes the structure of another JSON document. It defines what types are allowed, which fields are required, what values are valid, and how nested objects and arrays are structured. Validation engines check JSON against the schema and report detailed errors when data doesn't match — with the exact field path and what was wrong.
Without JSON Schema, you write manual validation logic such as if (!data.email) throw Error() and if (data.age < 0) throw Error(), scattered throughout your codebase. With JSON Schema, you define all constraints once in a single declarative document, and any compatible validator enforces them automatically, whether in JavaScript, Python, Java, Go, or another language.
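The contrast between scattered checks and one declarative document can be sketched in a few lines of Python. This is a toy illustration using only the standard library: the `check` helper and `TYPE_MAP` table are hypothetical names, and the sketch understands only "required" and a single "type" per property. A real validator covers the full specification.

```python
# Toy illustration only: supports just "required" and a flat "type" per property.
# `check` and `TYPE_MAP` are hypothetical names, not part of any library.
TYPE_MAP = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def check(data, schema):
    """Return a list of error strings; an empty list means the data passed."""
    errors = []
    for field in schema.get("required", []):
        if field not in data:
            errors.append(f"/{field}: missing required property")
    for field, rules in schema.get("properties", {}).items():
        if field not in data:
            continue  # optional field that was not supplied
        value = data[field]
        expected = TYPE_MAP[rules["type"]]
        # bool is a subclass of int in Python, so reject it for numeric types
        bool_as_number = isinstance(value, bool) and rules["type"] in ("integer", "number")
        if not isinstance(value, expected) or bool_as_number:
            errors.append(f"/{field}: must be {rules['type']}")
    return errors


# All constraints live in one declarative document instead of scattered ifs.
user_schema = {
    "required": ["id", "email"],
    "properties": {
        "id": {"type": "integer"},
        "email": {"type": "string"},
        "age": {"type": "integer"},
    },
}

print(check({"id": 1, "email": "alice@example.com"}, user_schema))
# []
print(check({"id": "1", "age": True}, user_schema))
# ['/email: missing required property', '/id: must be integer', '/age: must be integer']
```

Adding a new constraint means editing `user_schema`, not hunting for every place the data is checked, which is the same property a full JSON Schema validator gives you across languages.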
Complete Schema Example — User Object
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/schemas/user.json",
  "title": "User",
  "description": "A user object in our system",
  "type": "object",
  "required": ["id", "email", "name"],
  "properties": {
    "id": {
      "type": "integer",
      "minimum": 1,
      "description": "Unique user identifier"
    },
    "email": {
      "type": "string",
      "format": "email",
      "description": "User email address"
    },
    "name": {
      "type": "string",
      "minLength": 1,
      "maxLength": 100
    },
    "age": {
      "type": "integer",
      "minimum": 0,
      "maximum": 150
    },
    "role": {
      "type": "string",
      "enum": ["admin", "user", "moderator"]
    },
    "tags": {
      "type": "array",
      "items": { "type": "string" },
      "uniqueItems": true,
      "maxItems": 20
    },
    "address": {
      "type": "object",
      "required": ["city", "country"],
      "properties": {
        "street": { "type": "string" },
        "city": { "type": "string" },
        "country": { "type": "string", "minLength": 2, "maxLength": 2 }
      },
      "additionalProperties": false
    },
    "createdAt": {
      "type": "string",
      "format": "date-time"
    },
    "score": {
      "type": ["number", "null"],
      "minimum": 0,
      "maximum": 100
    }
  },
  "additionalProperties": false
}
Validating JSON with Ajv (JavaScript)
import Ajv from 'ajv';
import addFormats from 'ajv-formats'; // adds email, date-time, uri, etc.
// The default Ajv export targets draft-07. For schemas that declare
// Draft 2020-12 in "$schema", import Ajv2020 from 'ajv/dist/2020' instead.
const ajv = new Ajv({ allErrors: true }); // report ALL errors (not just the first)
addFormats(ajv);
const userSchema = {
  type: 'object',
  required: ['id', 'email', 'name'],
  properties: {
    id: { type: 'integer', minimum: 1 },
    email: { type: 'string', format: 'email' },
    name: { type: 'string', minLength: 1, maxLength: 100 },
    role: { type: 'string', enum: ['admin', 'user', 'moderator'] },
    age: { type: 'integer', minimum: 0 },
  },
  additionalProperties: false,
};
// Compile schema once, validate many times (compile is expensive)
const validate = ajv.compile(userSchema);
// ✅ Valid data
const validUser = { id: 1, email: 'alice@example.com', name: 'Alice', role: 'admin' };
console.log(validate(validUser)); // true
// ❌ Invalid data: multiple errors
const invalidUser = { id: 0, email: 'not-an-email', name: '', unknownField: true };
console.log(validate(invalidUser)); // false
console.log(validate.errors);
// Errors, abbreviated (each also carries keyword, schemaPath, and params; order may differ):
// [
//   { instancePath: '/id', message: 'must be >= 1' },
//   { instancePath: '/email', message: 'must match format "email"' },
//   { instancePath: '/name', message: 'must NOT have fewer than 1 characters' },
//   { instancePath: '', keyword: 'additionalProperties', params: { additionalProperty: 'unknownField' } }
// ]
// Utility function for clean error messages:
function validateUser(data) {
  const valid = validate(data);
  if (!valid) {
    const errors = validate.errors.map(e => `${e.instancePath || 'root'}: ${e.message}`);
    throw new Error('Validation failed:\n' + errors.join('\n'));
  }
  return data;
}
// Express.js middleware using Ajv.
// Assumes `app` is an Express app with JSON body parsing (express.json()) enabled
// and `createUserHandler` is your own route handler.
function validateBody(schema) {
  const validate = ajv.compile(schema);
  return (req, res, next) => {
    if (validate(req.body)) {
      next();
    } else {
      res.status(400).json({ errors: validate.errors });
    }
  };
}
app.post('/users', validateBody(userSchema), createUserHandler);
Auto-Generate Schema from Sample JSON
# pip install genson
import json
from genson import SchemaBuilder
# Sample JSON data representing a typical response
sample_data = {
    "id": 123,
    "name": "Alice Johnson",
    "email": "alice@example.com",
    "scores": [95, 87, 92],
    "active": True,
    "address": {
        "city": "Boston",
        "country": "USA"
    }
}
# Build schema from sample
builder = SchemaBuilder()
builder.add_object(sample_data)
schema = builder.to_schema()
print(json.dumps(schema, indent=2))
# Expected output (key and list order may differ):
# {
#   "$schema": "http://json-schema.org/schema#",
#   "type": "object",
#   "properties": {
#     "id": {"type": "integer"},
#     "name": {"type": "string"},
#     "email": {"type": "string"},
#     "scores": {"type": "array", "items": {"type": "integer"}},
#     "active": {"type": "boolean"},
#     "address": {
#       "type": "object",
#       "properties": {
#         "city": {"type": "string"},
#         "country": {"type": "string"}
#       },
#       "required": ["city", "country"]
#     }
#   },
#   "required": ["id", "name", "email", "scores", "active", "address"]
# }
# Add multiple samples to handle union types and optional fields
builder2 = SchemaBuilder()
builder2.add_object({"id": 123, "name": "Alice", "role": "admin"})
builder2.add_object({"id": "abc", "name": "Bob"})  # id can be a string too
builder2.add_object({"id": 456, "name": "Carol", "age": 30})  # age is optional
schema2 = builder2.to_schema()
# "id" becomes {"type": ["integer", "string"]}
# "role" and "age" are absent from "required" (each appears in only 1 of 3 samples)
print(json.dumps(schema2, indent=2))
Key JSON Schema Keywords Reference
type
"string", "number", "integer", "boolean", "array", "object", "null". Array for nullable: ["string", "null"] allows the field to be a string or null. "integer" only allows whole numbers; "number" allows decimals too.
required
Array of property names that MUST be present in the object. Missing required property = validation error. Optional fields are simply omitted from "required" — no separate keyword needed.
enum
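Restricts to a specific set of values: "enum": ["active", "inactive", "pending"]. Works for any type, including null and mixed types. If "type" is also specified, a value must satisfy both keywords, so enum entries that do not match the type can never validate. Case-sensitive for strings.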
pattern
Regular expression for string validation. "pattern": "^[0-9]{3}-[0-9]{4}$" accepts strings such as 555-1234. Uses ECMAScript regex syntax. Patterns are not anchored implicitly, so include ^ and $ to match the full string.
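The effect of the anchors is easy to see in a short sketch. Python's re module stands in for an ECMAScript engine here, which is an assumption that holds for this simple pattern but not for every regex feature; the `matches` helper is a hypothetical name.

```python
import re

ANCHORED = r"^[0-9]{3}-[0-9]{4}$"
UNANCHORED = r"[0-9]{3}-[0-9]{4}"


def matches(pattern, value):
    # "pattern" succeeds if the regex matches anywhere in the string,
    # which corresponds to re.search rather than re.fullmatch.
    return re.search(pattern, value) is not None


print(matches(ANCHORED, "555-1234"))             # True
print(matches(ANCHORED, "call 555-1234 now"))    # False
print(matches(UNANCHORED, "call 555-1234 now"))  # True: matches a substring
```

Without the anchors, any string that merely contains a phone-like substring passes validation.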
format
Semantic string validation: "email", "date-time", "date", "uri", "uuid", "ipv4", "hostname". Requires ajv-formats plugin in Ajv. Draft 2020-12 treats format as annotation only unless configured otherwise.
additionalProperties
Set to false to reject properties not listed in "properties" (or matched by "patternProperties"). Useful for strict API contracts, because unknown fields are rejected instead of being silently accepted. Can also be a schema that any additional properties must satisfy.
$ref and $defs
Reuse schema definitions. "$defs": {"Address": {...}} defines a reusable schema. "$ref": "#/$defs/Address" references it. Eliminates copy-paste between related schemas. Supports recursive schemas too.
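How a same-document reference is looked up can be sketched as a JSON Pointer walk. The `resolve_local_ref` helper below is a hypothetical name, and the sketch handles only references that start with "#"; it ignores "$id", "$anchor", remote references, and percent-encoding, all of which a real validator handles.

```python
def resolve_local_ref(root_schema, ref):
    """Resolve a same-document reference such as '#/$defs/Address'."""
    if not ref.startswith("#"):
        raise ValueError("this sketch only handles same-document references")
    pointer = ref[1:]  # drop the leading '#'
    if not pointer:
        return root_schema  # "#" refers to the whole document
    node = root_schema
    for token in pointer.split("/")[1:]:
        # JSON Pointer escapes: ~1 means '/', ~0 means '~' (decode ~1 first)
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


schema = {
    "$defs": {
        "Address": {
            "type": "object",
            "required": ["city", "country"],
            "properties": {
                "city": {"type": "string"},
                "country": {"type": "string"},
            },
        }
    },
    "type": "object",
    "properties": {
        "billing": {"$ref": "#/$defs/Address"},
        "shipping": {"$ref": "#/$defs/Address"},
    },
}

billing = resolve_local_ref(schema, schema["properties"]["billing"]["$ref"])
shipping = resolve_local_ref(schema, schema["properties"]["shipping"]["$ref"])
print(billing["required"])  # ['city', 'country']
print(billing is shipping)  # True: both properties share one definition
```

Because both properties point at the same definition, a change to "Address" applies everywhere it is referenced.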
oneOf / anyOf / allOf
"anyOf": must match at least one schema. "oneOf": must match exactly one schema. "allOf": must match every schema (each subschema is checked independently; nothing is merged). Used for union types and schema composition.
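The difference between the three composition keywords comes down to counting how many subschemas accept a value. The sketch below models each subschema as a plain Python predicate, which is a simplification for illustration; the function names are hypothetical and do not come from any validator library.

```python
def any_of(checks, value):
    """At least one subschema accepts the value."""
    return sum(1 for check in checks if check(value)) >= 1


def one_of(checks, value):
    """Exactly one subschema accepts the value."""
    return sum(1 for check in checks if check(value)) == 1


def all_of(checks, value):
    """Every subschema accepts the value."""
    return all(check(value) for check in checks)


# Predicates standing in for two subschemas (bool is excluded because
# it is a subclass of int in Python but a separate type in JSON).
def is_integer(value):
    return isinstance(value, int) and not isinstance(value, bool)


def is_non_negative_number(value):
    return (
        isinstance(value, (int, float))
        and not isinstance(value, bool)
        and value >= 0
    )


checks = [is_integer, is_non_negative_number]

for value in (5, -3, 2.5, "x"):
    print(repr(value), any_of(checks, value), one_of(checks, value), all_of(checks, value))
```

The value 5 satisfies both predicates, so it passes the any-of and all-of checks but fails the one-of check. That is a common surprise with "oneOf": overlapping subschemas reject data that looks valid, which is why discriminated unions pin a "const" field in each branch.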
Common Validation Patterns
Nullable fields, optional fields, and union types
Missing null type, additionalProperties surprises
// ❌ Trying to use null without declaring it in type
{
  "properties": {
    "middleName": { "type": "string" }  // fails if middleName is null
  }
}
// null → fails: 'must be string'
// ❌ Forgetting that additionalProperties: false blocks extra fields
{
  "properties": { "id": {"type": "integer"} },
  "additionalProperties": false
}
// {id: 1, extra: "data"} → fails: 'must NOT have additional properties'
Explicit nullable types, optional vs required, discriminated unions
// ✅ Nullable field: type as array
{
  "properties": {
    "middleName": { "type": ["string", "null"] },
    "score": { "type": ["number", "null"], "minimum": 0 }
  }
}
// ✅ Optional field: just omit from required
{
  "required": ["id", "email"],  // name not here = optional
  "properties": {
    "id": {"type": "integer"},
    "email": {"type": "string"},
    "name": {"type": "string"}  // present in properties but not required
  }
}
// ✅ Union type with different shapes (discriminated union)
{
  "oneOf": [
    {
      "type": "object",
      "required": ["type", "email"],
      "properties": {
        "type": {"const": "email_user"},
        "email": {"type": "string", "format": "email"}
      }
    },
    {
      "type": "object",
      "required": ["type", "phone"],
      "properties": {
        "type": {"const": "phone_user"},
        "phone": {"type": "string", "pattern": "^\\+[0-9]{10,15}$"}
      }
    }
  ]
}
JSON Schema vs TypeScript vs Zod
| Aspect | JSON Schema | TypeScript / Zod |
|---|---|---|
| When validation runs | Runtime — validates actual data values | TypeScript: compile-time only (erased at runtime). Zod: runtime. |
| Language support | Language-agnostic — works in JS, Python, Java, Go, etc. | TypeScript: JS/TS only. Zod: JS/TS only. |
| Schema format | JSON document — human-readable, shareable across services | TypeScript code / Zod chain calls — tightly coupled to language |
| OpenAPI integration | ✅ Native — OpenAPI uses JSON Schema directly | ❌ Requires conversion tools (zod-to-openapi, etc.) |
| Error messages | Standard error objects with instancePath, keyword, message | Zod: excellent custom error messages. TypeScript: compile errors only. |
| Best use case | API contracts, cross-service data validation, config files | Zod: within a TypeScript codebase, form validation with type inference |