JSON Schema Validation for Tokens

Part of Token Scaling, Validation & CI Pipelines. This page covers the sub-problem of treating token files as typed contracts enforced at commit time — defining canonical JSON schemas, running them inside CI gates, and integrating the results with CSS linting so structural anomalies never reach downstream consumers.

Without schema enforcement, design systems degrade into unstructured data lakes where deprecated aliases, malformed hex values, and missing semantic metadata propagate silently into production builds. Schema validation acts as the first line of defense, catching structural anomalies at the source rather than during stylesheet compilation or component rendering. This is especially critical when a schema also validates client override files in white-label token workflows — a strict additionalProperties: false boundary is the only reliable guarantee that locked tokens remain locked.

[Figure: Schema Validation Gating a Token PR. A left-to-right flow showing a token pull request entering the CI pipeline, passing through JSON Schema validation with AJV, then either being rejected with an error report or proceeding to merge and downstream CSS compilation.]
Schema validation gates a token pull request. Failed validation blocks the PR immediately with a structured error report; passing tokens proceed to merge and downstream CSS compilation.

Problem Framing

The failure mode is not hypothetical: a Figma export script introduces numeric coercion on spacing values, converting "16px" to the integer 16. The token file is syntactically valid JSON. It passes a surface-level diff review. It lands in main. Three hours later, the CSS compiler emits calc(16 * 1px) instead of 16px, and every component that uses that token renders incorrectly on Safari because of a unit-stripping quirk. No test caught it because the test was asserting on component output, not on token source.

Schema validation with Ajv catches this at the commit boundary: the spacing schema requires a unit-bearing string, and Ajv does not coerce types unless coerceTypes is enabled, so the integer 16 fails the type check. The same mechanism catches missing metadata fields that break documentation generation, naming-convention violations that misalign with the CSS variable naming system, and incomplete alias resolution that silently produces undefined references in the compiled stylesheet. These are all cheap to prevent at commit time and expensive to debug after deployment.
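As a rough illustration of why the coerced value is rejected, the sketch below reimplements the spacing rule (a string that matches the dimension pattern) in plain JavaScript. It is not Ajv, and the helper name checkSpacingValue is hypothetical; it only mirrors the two checks the schema on this page applies to a spacing value.

```javascript
// Same dimension pattern the spacing schema uses for "value".
const DIMENSION_PATTERN = /^\d+(\.\d+)?(px|rem|em)$/;

// Hypothetical helper: returns a list of problems for one spacing value.
function checkSpacingValue(value) {
  const problems = [];
  if (typeof value !== 'string') {
    // The branch the export bug hits: 16 arrives instead of "16px".
    problems.push(`must be string, got ${typeof value}`);
  } else if (!DIMENSION_PATTERN.test(value)) {
    problems.push(`must match pattern ${DIMENSION_PATTERN.source}`);
  }
  return problems;
}

console.log(checkSpacingValue('16px')); // no problems
console.log(checkSpacingValue(16));     // type problem
console.log(checkSpacingValue('16'));   // pattern problem (unit missing)
```

The point of the sketch is that the check runs on the token source, so the failure shows up before any CSS is compiled.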

Schema Architecture & Type Definitions

A production-ready token schema uses JSON Schema Draft 2020-12 to enforce strict typing, pattern matching, and conditional validation rules. Core definitions cover color values, spacing scales, typography ramps, and semantic aliases. Structuring schemas hierarchically via $ref lets teams isolate validation concerns and enable modular token consumption across web, mobile, and native platforms.

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "additionalProperties": false,
  "properties": {
    "color": { "$ref": "#/$defs/colorToken" },
    "spacing": { "$ref": "#/$defs/spacingToken" }
  },
  "$defs": {
    "colorToken": {
      "type": "object",
      "patternProperties": {
        "^[a-z]+(?:-[a-z0-9]+)*$": {
          "type": "object",
          "properties": {
            "value": {
              "type": "string",
              "pattern": "^#([0-9a-fA-F]{3}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})$"
            },
            "metadata": {
              "type": "object",
              "properties": {
                "description": { "type": "string" },
                "category": { "enum": ["background", "text", "border"] },
                "platform": { "enum": ["web", "ios", "android"] }
              },
              "required": ["description", "category", "platform"]
            }
          },
          "required": ["value", "metadata"]
        }
      },
      "additionalProperties": false
    },
    "spacingToken": {
      "type": "object",
      "patternProperties": {
        "^[a-z]+(?:-[a-z0-9]+)*$": {
          "type": "object",
          "properties": {
            "value": { "type": "string", "pattern": "^\\d+(\\.\\d+)?(px|rem|em)$" },
            "type": { "const": "dimension" }
          },
          "required": ["value", "type"]
        }
      },
      "additionalProperties": false
    }
  }
}

Hierarchical composition via $ref enables isolated schema updates without invalidating the entire token graph. The architectural trade-off lies in strictness versus developer velocity: enforcing additionalProperties: false guarantees zero schema drift but requires explicit migration paths when introducing new token categories. Regex patterns for naming conventions standardize kebab-case across platforms, while enum constraints on metadata fields prevent ambiguous categorization.
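To make the colorToken rules concrete, here is a simplified plain-JavaScript sketch of what they check for one color group: kebab-case names, a hex value, and the three required metadata fields. The helper name checkColorGroup is hypothetical, and enum checks are left out. In JSON Schema, a key that matches no patternProperties entry is only rejected when additionalProperties: false is also set on that object; the sketch assumes that stricter behaviour.

```javascript
const KEBAB_CASE = /^[a-z]+(?:-[a-z0-9]+)*$/;
const HEX_COLOR = /^#([0-9a-fA-F]{3}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})$/;

// Hypothetical helper mirroring the colorToken definition for one group.
function checkColorGroup(group) {
  const errors = [];
  for (const [name, token] of Object.entries(group)) {
    if (!KEBAB_CASE.test(name)) {
      errors.push(`/${name}: name is not kebab-case`);
      continue;
    }
    if (token === null || typeof token !== 'object') {
      errors.push(`/${name}: must be object`);
      continue;
    }
    if (typeof token.value !== 'string' || !HEX_COLOR.test(token.value)) {
      errors.push(`/${name}/value: not a 3-, 6-, or 8-digit hex color`);
    }
    for (const field of ['description', 'category', 'platform']) {
      if (!token.metadata || !(field in token.metadata)) {
        errors.push(`/${name}/metadata: missing "${field}"`);
      }
    }
  }
  return errors;
}

const errors = checkColorGroup({
  'action-primary': {
    value: '#2563eb',
    metadata: { description: 'Primary action', category: 'background', platform: 'web' },
  },
  actionHover: { value: '#1d4ed8' }, // camelCase name
  'surface-default': {
    value: 'f8fafc', // missing "#"
    metadata: { description: 'Surface', category: 'background' }, // no platform
  },
});
console.log(errors);
```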

Architectural Trade-Offs

These trade-offs recur on every token team that graduates beyond simple CI lint checks. The right call is context-dependent — record your decisions in an ADR.

  • Schema strictness vs authoring speed. additionalProperties: false blocks any undocumented token key, which is excellent for stability but creates friction when designers add experimental tokens mid-sprint. Counter with a designated experimental/ namespace that relaxes this constraint only for tokens prefixed with exp-.

  • Single-schema vs per-category schemas. One monolithic schema is simpler to maintain but becomes the merge conflict epicenter for large teams. Splitting into per-category schemas (color.schema.json, spacing.schema.json) allows parallel ownership but requires an aggregation step that validates cross-category $ref links.

  • Compile-time validation vs runtime fallback. Strict compile-time gates prevent malformed tokens from ever reaching production but break the pipeline for every contributor, including those on unrelated tasks. Runtime fallback chains (CSS var(--token, fallback)) add resilience but hide schema violations until browser rendering. Choose one as canonical; using both simultaneously creates ambiguity about which is authoritative.

  • Regex naming patterns vs explicit allowlists. patternProperties regex enforces naming conventions structurally, but a mismatch is error-prone to debug. An explicit enum allowlist for token names is more readable and produces clearer CI errors, at the cost of requiring a schema update for every new token.

  • AJV strict mode vs lenient mode. Strict mode (strict: true) throws at schema compile time on unknown keywords, ignored keywords, and other ambiguous schema constructs. It finds more bugs, but it also rejects schemas that use draft-07 features in a draft 2020-12 validator. Enable strict mode from day one; retrofitting it onto an existing schema is a multi-hour migration.

  • Pre-compiled vs on-demand schema compilation. Compiling the AJV schema to a JavaScript module at build time reduces per-run validation from ~3s to under 500ms for repositories exceeding 10,000 tokens, but adds a schema pre-compilation step to the release process. Worthwhile above ~5,000 tokens; unnecessary below.

Validation Pipeline Implementation

Automated validation executes as a pre-merge gate in CI. The pipeline ingests raw token JSON, validates it against the master schema using Ajv, and generates structured error reports. Failed validations block pull requests and surface precise diagnostics, each carrying the JSON Pointer path of the offending value. This workflow integrates with design-to-code sync automation to prevent schema drift between Figma exports and repository state.

name: Token Validation Pipeline
on:
  pull_request:
    paths: ['tokens/**/*.json']
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '22', cache: 'npm' }
      - run: npm ci
      # --spec=draft2020 is needed because the schema declares Draft 2020-12
      - run: npx ajv validate --spec=draft2020 -s schema/tokens.schema.json -d "tokens/**/*.json" --strict=true --all-errors

The pipeline stages follow a deterministic sequence: token extraction and normalization, schema compilation and type checking, cross-reference validation against Stylelint rules, error aggregation and report generation, and finally PR gating with merge approval routing. Ajv does not coerce types by default (the coerceTypes option is off), so a number where the schema expects a string fails validation instead of being silently converted; strict mode additionally rejects unknown or ignored keywords in the schema itself. The primary architectural trade-off involves compilation overhead versus validation speed; pre-compiling the schema into a standalone JavaScript module reduces CI execution time from ~3s to under 500ms for repositories exceeding 10,000 tokens.
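The error aggregation and report generation stage could look roughly like the sketch below. It assumes each file's errors arrive as objects with the instancePath and message fields that Ajv v8 puts on its error objects; the buildReport name and the report layout are hypothetical.

```javascript
// Hypothetical aggregator: groups Ajv-style error objects by file.
function buildReport(results) {
  const lines = [];
  let failed = 0;
  for (const { file, errors } of results) {
    if (errors.length === 0) continue;
    failed += 1;
    lines.push(`[FAIL] ${file}`);
    for (const err of errors) {
      // An empty instancePath means the error is at the document root.
      lines.push(`  ${err.instancePath || '/'} ${err.message}`);
    }
  }
  lines.push(`${failed} of ${results.length} file(s) failed validation`);
  return { ok: failed === 0, text: lines.join('\n') };
}

const report = buildReport([
  { file: 'tokens/color.json', errors: [] },
  {
    file: 'tokens/spacing.json',
    errors: [{ instancePath: '/spacing/md/value', message: 'must be string' }],
  },
]);
console.log(report.text);
// A CI wrapper would set process.exitCode = 1 when report.ok is false.
```

Returning both a boolean and the text keeps the gating decision separate from how the report is printed or uploaded.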

For a deeper treatment of the step-by-step implementation — including alias resolution, Ajv version pinning, and SARIF output for GitHub Advanced Security — see validating design tokens against JSON Schema in CI.

Tool table

| Tool | Purpose | Integration point |
| --- | --- | --- |
| AJV v8+ (ajv/dist/2020) | JSON Schema Draft 2020-12 validation with full error paths | Pre-merge CI step, pre-commit hook |
| ajv-cli | CLI wrapper for AJV enabling shell-invokable schema checks | YAML run: step in GitHub Actions |
| ajv-formats | URI, date, and regex format validation | AJV plugin; required for format: "uri" on asset tokens |
| Stylelint + stylelint-value-no-unknown-custom-properties | CSS-side enforcement of validated token names | Post-schema lint step |
| lint-staged | Pre-commit hook runner scoping validation to changed files only | .lintstagedrc.json |

Linting & Style Enforcement Integration

Beyond structural validation, token schemas must align with CSS-specific linting rules. Teams should configure schema-aware linters that cross-reference token values against Stylelint plugin configuration to catch unit mismatches, deprecated aliases, and accessibility violations. Combining JSON schema checks with Stylelint creates a dual-layer enforcement mechanism that guarantees both data integrity and CSS compliance.

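// stylelint.config.js
module.exports = {
  plugins: ['stylelint-value-no-unknown-custom-properties'],
  rules: {
    // Stylelint tests this pattern against the name without the leading "--".
    'custom-property-pattern': '^[a-z][a-z0-9-]+$',
    // Plugin rules are namespaced; this plugin registers its rule under "csstools/".
    'csstools/value-no-unknown-custom-properties': [true, {
      ignoreProperties: ['/^--legacy-/']
    }]
  }
};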

Framework-agnostic token generation pipelines must bridge JSON validation with CSS variable output. A common pattern involves generating a .stylelintrc.json dynamically from validated token metadata, mapping semantic categories to custom-property-pattern rules. For example, spacing tokens validated as px or rem can be cross-checked against a baseline scale defined in the schema. The architectural trade-off here centers on coupling: tightly binding schema validation to CSS linting accelerates feedback but increases pipeline complexity. Decoupling them allows independent iteration but risks temporary misalignment between token definitions and stylesheet consumption.
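One way to bridge the two layers is to derive the list of known custom property names from the validated token tree. The sketch below assumes a node counts as a token when it has a "value" key, and that names are built by joining the path with hyphens under the --ds- prefix seen in this page's compiled CSS. The helper name toCustomPropertyNames is hypothetical.

```javascript
// Hypothetical helper: walks a token tree and returns CSS custom property names.
function toCustomPropertyNames(tree, prefix = 'ds', path = []) {
  const names = [];
  for (const [key, node] of Object.entries(tree)) {
    if (node === null || typeof node !== 'object') continue;
    if ('value' in node) {
      // Leaf token: join prefix, group path, and token name with hyphens.
      names.push(`--${[prefix, ...path, key].join('-')}`);
    } else {
      // Group: descend one level, remembering the path so far.
      names.push(...toCustomPropertyNames(node, prefix, [...path, key]));
    }
  }
  return names;
}

const names = toCustomPropertyNames({
  color: { 'action-primary': { value: '#2563eb' } },
  spacing: { base: { value: '4px' }, md: { value: '16px' } },
});
console.log(names);
```

The resulting list could then be written into a generated lint config or a CSS file that the lint step reads, depending on how tightly the two stages are meant to be coupled.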

Advanced CI Integration & Reporting

For enterprise-scale token repositories, validation pipelines require parallel execution, caching strategies, and machine-readable output formats.

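jobs:
  validate-parallel:
    strategy:
      matrix:
        category: [color, spacing, typography, motion]
    runs-on: ubuntu-latest
    steps:
      # Each matrix job starts on a clean runner, so it needs its own checkout and install.
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '22', cache: 'npm' }
      - run: npm ci
      - uses: actions/cache@v4
        with:
          path: .ajv-cache
          key: ${{ runner.os }}-ajv-schema-${{ hashFiles('schema/*.json') }}
      # --spec=draft2020 assumes the per-category schemas also declare Draft 2020-12
      - run: npx ajv validate --spec=draft2020 -s schema/${{ matrix.category }}.schema.json -d "tokens/${{ matrix.category }}/**/*.json"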

Parallel matrix execution isolates validation domains, reducing total pipeline duration by ~60%. Caching compiled Ajv schemas via CI artifacts prevents redundant AST parsing across workflow runs. Output formats like SARIF integrate natively with GitHub Advanced Security, while JUnit XML feeds into legacy test aggregators. The performance target of under 5 seconds for 10,000 tokens is achievable through schema pre-compilation and strict memory limits. However, aggressive parallelization risks missing broken cross-token references (e.g., semantic aliases pointing to base values) when each category is validated in isolation. Mitigation requires a final sequential aggregation step that resolves cross-category references before merging reports.
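The final aggregation step can be sketched as a pass over the merged token tree that reports every alias whose target does not exist. The sketch assumes whole-value aliases in curly-brace form such as {color.base.blue}, nested groups, and tokens identified by a "value" key; the function names are hypothetical.

```javascript
// Matches a whole-value alias such as "{color.base.blue}".
const ALIAS = /^\{([^{}]+)\}$/;

// Looks up a dotted path in the merged token tree.
function lookup(tree, dottedPath) {
  return dottedPath.split('.').reduce(
    (node, key) => (node !== null && typeof node === 'object' ? node[key] : undefined),
    tree,
  );
}

// Hypothetical aggregation check: returns every alias whose target is missing.
function findUnresolvedAliases(root) {
  const missing = [];
  const walk = (node, path) => {
    for (const [key, child] of Object.entries(node)) {
      if (child === null || typeof child !== 'object') continue;
      const here = [...path, key];
      if (!('value' in child)) {
        walk(child, here); // a group, keep descending
        continue;
      }
      const match = typeof child.value === 'string' ? ALIAS.exec(child.value) : null;
      if (!match) continue; // literal value, nothing to resolve
      const target = lookup(root, match[1]);
      const found = target !== null && typeof target === 'object' && 'value' in target;
      if (!found) missing.push({ token: here.join('.'), alias: match[1] });
    }
  };
  walk(root, []);
  return missing;
}

const merged = {
  color: {
    base: { blue: { value: '#2563eb' } },
    action: {
      primary: { value: '{color.base.blue}' },
      danger: { value: '{color.base.red}' }, // target does not exist
    },
  },
};
console.log(findUnresolvedAliases(merged));
```

Because the check needs the whole tree, it runs once after the matrix jobs rather than inside them.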

Cross-Cluster Dependency Mapping

Schema validation does not operate in isolation — it sits at the confluence of design export, CI, and multi-brand compilation. The table below maps the concrete integration points.

| Section | Sibling topic | Integration point | Validation strategy |
| --- | --- | --- | --- |
| Token Scaling, Validation & CI | Design-to-Code Sync Workflows | Figma export produces the raw JSON that schema validation ingests; export conflicts surface as schema errors | Schema runs immediately after export step; mis-keyed Figma tokens fail the naming patternProperties regex |
| Token Scaling, Validation & CI | Stylelint Plugin Configuration | Validated token names are the allowlist for stylelint-value-no-unknown-custom-properties | Generated .stylelintrc from schema pass; CSS lint runs in the same CI job after token validation |
| Token Scaling, Validation & CI | Versioning & Semantic Release for Tokens | Schema changes trigger a semver bump; breaking schema changes (removing required fields) force a major version | Semantic release checks for schema diff on every commit; a $defs removal is treated as a breaking change |
| Multi-Brand & White-Label | White-Label Token Overrides | The same AJV schema validates client override files; additionalProperties: false is the locked-token guard | Override CI runs the token schema before the contrast audit; any locked-token key causes immediate exit 1 |

/* @depends: /token-scaling-validation-ci-pipelines/design-to-code-sync-workflows/ */
/* @depends: /multi-brand-theming-white-label-token-architecture/white-label-token-overrides/ */

/* tokens/compiled/semantic.css — output of schema-validated token build */
:root {
  /* color tokens: validated against the ^#([0-9a-fA-F]{3}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})$ pattern */
  --ds-color-action-primary: #2563eb;
  --ds-color-surface-default: #f8fafc;

  /* spacing tokens: validated against the ^\d+(\.\d+)?(px|rem|em)$ pattern */
  --ds-spacing-base: 4px;
  --ds-spacing-md: 16px;
}

Production Code Reference

Pre-compiled AJV validator for performance

For repositories with more than 5,000 tokens, pre-compiling the schema to a standalone module eliminates repeated schema parsing on every CI run.

// scripts/compile-schema.js: run once during repo setup and after every schema change
const Ajv = require('ajv/dist/2020');
const standalone = require('ajv/dist/standalone');
const fs = require('fs');

// code.source must be enabled so Ajv keeps the generated validator source.
const ajv = new Ajv({ code: { source: true }, allErrors: true });
const schema = JSON.parse(fs.readFileSync('./schema/tokens.schema.json', 'utf8'));
const validate = ajv.compile(schema);
const moduleCode = standalone(ajv, validate);
fs.writeFileSync('./schema/validate-tokens.js', moduleCode);
console.log('Schema pre-compiled to schema/validate-tokens.js');

// scripts/validate-tokens-fast.js: separate file, used in CI instead of ajv-cli
const validate = require('../schema/validate-tokens.js');
const fs = require('fs');
const { glob } = require('glob');

const tokenFiles = glob.sync('tokens/**/*.json');
let hasErrors = false;

for (const file of tokenFiles) {
  const data = JSON.parse(fs.readFileSync(file, 'utf8'));
  if (!validate(data)) {
    console.error(`\n[FAIL] ${file}`);
    for (const err of validate.errors) {
      // instancePath is a JSON Pointer to the failing value; empty means the root.
      console.error(`  ${err.instancePath || '/'} ${err.message}`);
    }
    hasErrors = true;
  }
}

// A non-zero exit code fails the CI step and blocks the PR.
process.exit(hasErrors ? 1 : 0);

Why this works: the pre-compiled module is a plain JavaScript function with no schema parsing at runtime. Startup cost drops from ~800ms to ~40ms for large schemas, and the validator can be require()d directly in other scripts (e.g., the contrast audit) without re-compiling.

Diagnostic Matrix

Each entry gives the diagnostic step first, then the execution detail.

  • Token PR blocked with no error detail. Add the --all-errors flag to the AJV CLI call; without it, AJV stops at the first error. Run npx ajv validate --spec=draft2020 -s schema/tokens.schema.json -d tokens/color.json --strict=true --all-errors locally.

  • Validation passes locally but fails in CI. Check Node.js version parity. AJV v8 requires Node 16+. Pin node-version: '22' in actions/setup-node and match your local .nvmrc. Also verify that npm ci is used (not npm install) so the lockfile is respected.

  • additionalProperties error on a token you just added. The schema’s additionalProperties: false blocks every key not declared in properties or patternProperties. Add the new category to the schema’s properties map and create a matching $defs entry before committing the token.

  • Hex color fails pattern check despite looking correct. Likely a leading or trailing space, or an 8-digit hex (#RRGGBBAA) checked against a 6-digit-only pattern. Uppercase digits such as #FF0000 already match, so case is not the cause. Update the regex to ^#([0-9a-fA-F]{3}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})$ and re-run.

  • Alias token fails validation because it references {color.base.blue}. The schema validates literal values, not alias syntax from the W3C Design Tokens format. Run an alias-resolution pre-processor (e.g., Style Dictionary’s resolveReferences) before passing the file to AJV, or extend the schema to accept the alias pattern via oneOf.

  • Schema compilation step fails with unevaluatedProperties error. You are using a draft 2020-12 keyword in a schema loaded with new Ajv() (draft-07 default). Import from ajv/dist/2020 instead of ajv for draft 2020-12 schemas.

  • CI takes 30+ seconds for 2,000 tokens. The schema is being compiled from JSON on every run. Pre-compile it to a standalone JS module as shown in the production code reference above. Target: under 5s for 10,000 tokens.

Common root causes and resolutions

Numeric coercion from Figma export scripts. Spacing and line-height values arrive as JSON numbers (16) rather than strings ("16px"). The schema’s string type + regex catches this immediately. Fix the export script to always serialize dimension values with units as quoted strings. Never accept number type for CSS dimension tokens.

Missing required metadata fields on new tokens. A developer adds a token manually without the description, category, and platform fields. The schema’s required array on the token object catches this. The resolution is to add a Yeoman or plop generator that scaffolds new tokens with all required fields pre-filled.

patternProperties regex does not match camelCase tokens imported from a third-party library. The pattern ^[a-z]+(?:-[a-z0-9]+)*$ is kebab-case only. Either convert imported tokens to kebab-case in the normalization step, or add a secondary patternProperties entry for camelCase and gate it to a specific namespace (e.g., vendor/).

Cross-category $ref fails when schemas are split per category. A spacing token references a color alias that lives in color.schema.json. AJV cannot resolve $ref across files unless you explicitly load all schemas with ajv.addSchema(). Load all category schemas before compiling the root schema.

Frequently Asked Questions

Should the JSON Schema live in the same repository as the tokens?

Yes, for the common case. Co-locating schema and tokens makes version control atomic: a token change and its schema update land in the same PR, reviewed together. The only exception is a multi-team monorepo where a platform team owns the schema and product teams own tokens — in that case, publish the schema as a versioned npm package and pin it in each token repository’s package.json.

How do you handle the W3C Design Tokens Community Group format (.json with $value / $type)?

The W3C format uses $value as the value key and $type for the token category. The schema above uses value and type. These are two different conventions. If your toolchain targets the W3C format, update the schema’s $defs to match: replace "value" with "$value" in required and properties, and add "$type" as an enum. Do not mix conventions in the same file — the naming inconsistency causes silent merge bugs in alias resolution.
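If existing files need to move from one convention to the other, a one-off migration along the following lines may help. It is a sketch that assumes a node is a token when it has a "value" key, and the function name toDtcgKeys is hypothetical. It renames only value and type on token nodes and leaves everything else (including metadata) untouched.

```javascript
// Hypothetical migration: renames "value"/"type" to "$value"/"$type" on token nodes.
function toDtcgKeys(node) {
  if (Array.isArray(node)) return node.map(toDtcgKeys);
  if (node === null || typeof node !== 'object') return node;
  const isToken = 'value' in node;
  const out = {};
  for (const [key, child] of Object.entries(node)) {
    if (isToken && key === 'value') out.$value = child;
    else if (isToken && key === 'type') out.$type = child;
    else out[key] = toDtcgKeys(child); // groups and metadata are copied recursively
  }
  return out;
}

const converted = toDtcgKeys({
  spacing: { md: { value: '16px', type: 'dimension' } },
});
console.log(JSON.stringify(converted));
```

Running the conversion once and committing the result, together with the matching schema change, avoids having both conventions in the repository at the same time.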

Can the same schema validate both base tokens and client override files?

Partially. The base token schema and the client override schema share structural patterns (hex color regex, kebab-case naming) but diverge in coverage: the base schema allows every token; the override schema allows only the open subset. Extract the shared $defs (color pattern, dimension pattern) into a common schema file and $ref them from both schemas. Do not use the base schema directly to validate overrides — it would accept locked tokens.
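The open-subset rule can be pictured with a small imperative check. An override schema would normally express it declaratively, by listing only the open tokens and setting additionalProperties: false; the sketch below shows the same rule in plain JavaScript. The helper name findLockedOverrides and the "category.name" identifier format are assumptions for illustration.

```javascript
// Hypothetical guard: lists override keys that are not in the open subset.
function findLockedOverrides(overrides, openTokens) {
  const open = new Set(openTokens);
  const locked = [];
  for (const [category, group] of Object.entries(overrides)) {
    for (const name of Object.keys(group)) {
      const id = `${category}.${name}`;
      if (!open.has(id)) locked.push(id);
    }
  }
  return locked;
}

const locked = findLockedOverrides(
  {
    color: {
      'action-primary': { value: '#ff0000' },  // open: clients may change it
      'surface-default': { value: '#000000' }, // locked: not in the open list
    },
  },
  ['color.action-primary'],
);
console.log(locked);
```

A non-empty result is the condition under which the override pipeline would stop.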