Enterprise Design System, Governed at Scale

Executive Summary

The problem in one line

Four years of parallel product growth had produced six products, four visual languages, three icon sets, and no single component that looked the same in every context. Every sprint started with designers re-solving problems that had already been solved — differently — three months earlier.

The answer in one line

A token-first design system with a governance model that treated every product team as both a consumer and a potential contributor — not a dependency to manage but a co-owner with a stake in the system's quality.

Three phases, one thread

Phase 1 · Months 1–6

Audit + Foundation

Token architecture, core component library, Figma + code in sync

Phase 2 · Months 7–18

Adoption + Governance

Team onboarding, contribution model, versioning policy, roadmap

Phase 3 · Month 19+

Scale + Autonomy

11 contributing teams, automated token sync, 94% adoption

The Problem

Four years of parallel growth, zero shared language

The company had scaled from one product to six without a unified design foundation. Each product team hired its own designers, made its own component decisions, and built its own Figma library — or didn't. By the time I was brought in to lead design systems, the situation was concrete and measurable.

The audit findings · before

Distinct button variants in production

34

Unique colour values in CSS

212

Separate Figma component libraries

9

Icon sets, partially overlapping

3

Avg. hrs/sprint re-solving solved problems

11h

UI bug tickets per month (baseline)

84

Design-to-handoff time for a new screen

4.8d

Products that shared even 1 component

0

The real cost

212 colour values means no designer can hold the palette in their head. 34 button variants means no engineer trusts the design file. 9 separate libraries means every new designer onboards into a different system. The cost wasn't aesthetic — it was velocity, trust, and the ability to ship consistently at scale.

The problem wasn't that the teams built badly. It was that they built in isolation, so every good decision stayed local.

Stakeholder Map

Influence × Interest · design system programme

Keep satisfied

C-Suite / VP Eng

Velocity + cost-reduction ROI

Brand / Marketing

Visual consistency, brand compliance

Manage closely

Head of Product

Roadmap + adoption mandate

Lead Engineers (×6)

Token integration, component build

Design Leads (×6 products)

Primary consumers + contributors

Monitor

QA / Accessibility

WCAG compliance in component states

Procurement / Ops

Tooling licences, Figma seats

Keep informed

Individual Designers

Day-to-day library consumers

Front-end Engineers

Token + component implementors

New Hires

Primary documentation audience

System Scope

Deciding what goes in — and what doesn't

A design system that tries to contain everything becomes a bureaucracy. The scope decision was the first governance act: what gets centralised, what stays local, and how do you tell the difference?

In the system

Design tokens (colour, spacing, type, motion)

Primitive components (Button, Input, Badge, Tooltip)

Icon library (single set, 380 icons)

Typography scale + font loading

Accessibility patterns (focus, ARIA, contrast)

Motion / animation tokens

Shared but optional

Composite patterns (Card, Table, Nav shell)

Data visualisation primitives

Form layout patterns

Loading + skeleton states

Toast / notification patterns

Stays local

Product-specific domain patterns

Bespoke dashboard layouts

One-off illustration / marketing UI

A/B test variants

Experimental / unproven patterns

The decision rule

A pattern enters the system when it appears in 3+ products, when it is stable enough to version, and when a team is willing to own its maintenance. One appearance in one product is a local pattern. Three appearances across three products is a system candidate.
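The decision rule is mechanical enough to write down as a predicate. The sketch below is illustrative only: the `PatternCandidate` shape and its field names are assumptions, not part of the system's actual tooling.

```typescript
// Hypothetical record for a pattern under review; field names are illustrative.
interface PatternCandidate {
  name: string;
  productsUsing: string[];        // products where the pattern appears today
  stableEnoughToVersion: boolean; // judged by the reviewing team
  maintainerTeam: string | null;  // team willing to own maintenance, if any
}

type Verdict = "system candidate" | "local pattern";

// Applies the three conditions from the decision rule:
// 3+ products, stable enough to version, and a willing owner.
function assessPattern(p: PatternCandidate): Verdict {
  const distinctProducts = new Set(p.productsUsing).size;
  const meetsRule =
    distinctProducts >= 3 &&
    p.stableEnoughToVersion &&
    p.maintainerTeam !== null;
  return meetsRule ? "system candidate" : "local pattern";
}
```

A pattern used in three products with no team willing to maintain it still comes back as a local pattern, which matches the rule's third condition.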

Token Architecture

Tokens first. Components second. Everything else follows.

The system's stability comes from its foundation. Before a single component was built, the token architecture was agreed, named, and documented. Tokens are the contract between design and engineering: the one place where a change propagates everywhere without anyone having to edit individual component files.

Three-tier token architecture

Tier 1

Primitive tokens

Raw values with no semantic meaning. Never used directly in components.

/* colour */

--blue-500: #4d7cff;

--blue-600: #3a63e0;

/* spacing */

--space-4: 4px;

--space-8: 8px;

--space-16: 16px;

--space-24: 24px;

Tier 2

Semantic tokens

Intent-based aliases. These are what designers and engineers actually use.

/* maps to primitives */

--color-action: var(--blue-500);

--color-action-hover: var(--blue-600);

/* spacing semantic */

--spacing-sm: var(--space-8);

--spacing-md: var(--space-16);

--spacing-lg: var(--space-24);

Tier 3

Component tokens

Component-scoped tokens. Enable local overrides without breaking the chain.

/* button-scoped */

--btn-bg: var(--color-action);

--btn-bg-hover: var(--color-action-hover);

--btn-radius: var(--radius-md);

--btn-padding: var(--spacing-sm) var(--spacing-md);

Why this matters

When product A needs a dark-mode variant or product B needs a brand-colour override, they change Tier 3 tokens — not source CSS. The primitive and semantic layers stay untouched. One Figma variable collection, one JSON token file, one npm package. Change once, propagate everywhere.
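To make the alias chain concrete, here is a minimal sketch of how a build step might resolve a Tier 3 token down to its raw value. The token names and values are copied from the tier examples in this section; the `resolveToken` helper and the flat `TokenMap` shape are assumptions for illustration, not the system's actual build pipeline.

```typescript
type TokenMap = Record<string, string>;

// A subset of the tokens shown in the three tiers.
const tierTokens: TokenMap = {
  // Tier 1: primitives
  "--blue-500": "#4d7cff",
  "--blue-600": "#3a63e0",
  "--space-8": "8px",
  "--space-16": "16px",
  // Tier 2: semantic aliases
  "--color-action": "var(--blue-500)",
  "--color-action-hover": "var(--blue-600)",
  "--spacing-sm": "var(--space-8)",
  "--spacing-md": "var(--space-16)",
  // Tier 3: component tokens
  "--btn-bg": "var(--color-action)",
  "--btn-bg-hover": "var(--color-action-hover)",
  "--btn-padding": "var(--spacing-sm) var(--spacing-md)",
};

// Follows var(--x) references until only raw values remain.
// Throws on an undefined token or a circular alias, so a broken chain
// is caught when tokens are built rather than in the browser.
function resolveToken(name: string, map: TokenMap, seen: string[] = []): string {
  if (seen.indexOf(name) !== -1) {
    throw new Error("Circular token reference: " + seen.concat(name).join(" -> "));
  }
  const raw = map[name];
  if (raw === undefined) {
    throw new Error("Undefined token: " + name);
  }
  const varRef = /var\((--[a-z0-9-]+)\)/g;
  return raw.replace(varRef, (_match: string, ref: string) =>
    resolveToken(ref, map, seen.concat(name))
  );
}
```

With this map, `resolveToken("--btn-bg", tierTokens)` walks `--btn-bg` to `--color-action` to `--blue-500` and returns the hex value. A product-level override only needs to replace the Tier 3 entry in its own copy of the map; the lower tiers are never edited.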

Semantic colour tokens · core set

| Token | Value | Usage |
|---|---|---|
| --color-action | #4D7CFF | CTAs, links, active states |
| --color-success | #3FE08A | Completion, positive feedback |
| --color-warning | #FF8A3D | Caution, non-blocking alerts |
| --color-error | #F24F4F | Destructive actions, validation errors |
| --color-text-primary | #F3F5FB | Primary body text |
| --color-text-secondary | #BCC3D8 | Supporting labels, captions |
| --color-surface | #0C1020 | Card, modal, panel backgrounds |

Component Library

Built to be consumed, not admired

Every component in the library was built against a checklist before it shipped: documented in Storybook, accessible to WCAG 2.1 AA, covered by visual regression tests, and available in both Figma and npm. A component that only exists in one place is a promise waiting to be broken.

Component anatomy · Button

Variants: Primary · Secondary · Ghost · Destructive · Disabled

Token anatomy

Background → --btn-bg

Label → --btn-color, --type-label-md

Border → --btn-border

Padding → --btn-padding

Radius → --btn-radius

Focus ring → --focus-ring, --focus-offset

Ship checklist · every component

✓ All states documented (default, hover, focus, disabled, error)

✓ WCAG 2.1 AA contrast verified in all states

✓ Keyboard navigation tested and documented

✓ ARIA roles, labels, and announcements specified

✓ Figma variants match React component API

✓ Visual regression tests in CI pipeline

✓ Usage examples in Storybook

✓ "When to use / when not to use" in docs
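The contrast item on the checklist can be verified in code rather than by eye. The sketch below implements the WCAG 2.1 relative-luminance and contrast-ratio formulas and checks against the 4.5:1 AA threshold for normal-size text. The function names are illustrative, and how such a check would be wired into the CI pipeline is left open.

```typescript
// Converts one sRGB channel ("00".."ff") to its linear value, per WCAG 2.1.
function linearChannel(hexPair: string): number {
  const c = parseInt(hexPair, 16) / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a colour written as "#rrggbb".
function relativeLuminance(hex: string): number {
  if (!/^#[0-9a-fA-F]{6}$/.test(hex)) {
    throw new Error("Expected #rrggbb, got " + hex);
  }
  const r = linearChannel(hex.slice(1, 3));
  const g = linearChannel(hex.slice(3, 5));
  const b = linearChannel(hex.slice(5, 7));
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colours, from 1 (identical) to 21 (black on white).
function contrastRatio(a: string, b: string): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  const lighter = Math.max(la, lb);
  const darker = Math.min(la, lb);
  return (lighter + 0.05) / (darker + 0.05);
}

const AA_NORMAL_TEXT = 4.5;

function passesAA(foreground: string, background: string): boolean {
  return contrastRatio(foreground, background) >= AA_NORMAL_TEXT;
}
```

For example, `passesAA("#F3F5FB", "#0C1020")` checks the primary text token against the surface token from the core colour set. Running the same check for every state of a component (hover, focus, disabled, error) is what turns the checklist line into a repeatable test.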

Component inventory · 42 shipped

Primitives · 18

Button · Input · Select · Checkbox · Radio · Toggle · Slider · Badge · Tag · Avatar · Spinner · Icon · Tooltip · Popover · Divider · Link · Label · Fieldset

Composite · 12

Card · Modal · Drawer · Toast · Alert · Banner · Tabs · Accordion · Breadcrumb · Pagination · Table · Form

Navigation · 6

Sidebar · Top nav · Mega menu · Command palette · Search · Stepper

Data display · 6

Stat card · Progress bar · Skeleton · Empty state · Error state · Data table

38 stable (v1+)

3 beta (v0.x)

1 deprecated (migration guide published)

Governance Model

The governance is the product

Most design systems die not because the components were bad, but because no one decided who owns what when something needs to change. Governance is the answer to that question — made explicit, before the conflict arrives.

The Design System Team (DST)

Owns the source of truth. Makes the hard calls.

3 people: design system lead (me), 1 senior engineer, 1 designer. Not a committee — a team with a roadmap, sprint rituals, and an on-call rotation for breaking-change questions. The DST makes final decisions on what goes into the system and when. It does not make those decisions in isolation.

Responsibilities: token spec · component API · versioning · documentation · deprecation

The System Working Group (SWG)

One rep per product team. The feedback loop that prevents ivory-tower design.

Monthly sync: each team's design lead brings blockers, proposals, and adoption friction. The SWG surfaces real use cases that the DST can't see from the centre. It's advisory, not a voting body — but the DST is accountable to act on SWG feedback or explain in writing why it didn't.

Responsibilities: surface gaps · propose contributions · flag breaking changes · drive adoption

Decision matrix · who decides what

| Decision type | DST | SWG | Product team |
|---|---|---|---|
| Add a new component to the system | ✓ Owns | Proposes | - |
| Change a token value (non-breaking) | ✓ Owns | Informed | - |
| Breaking change to component API | ✓ Owns | ✓ Must consult | Informed 60d prior |
| Override a token locally for one product | - | Notified | ✓ Owns |
| Build a local pattern not yet in the system | Logs it | - | ✓ Owns |
| Deprecate a component | ✓ Owns | ✓ Must consult | Informed 90d prior |

A governance model without enforcement is a suggestion. The DST published decisions in writing. The SWG could challenge them. That accountability loop is what made the system trustworthy.

Component Library Architecture

Three-tier hierarchy: Primitives → Composites → Domain patterns

The component library is organised by composition complexity, not by use case. This structure forces clear boundaries and prevents "god components" that try to do everything. Every component owns exactly one responsibility.

This three-tier model is intentional. It forces the question "Is this a primitive or a composite?", because the answer determines whether the DST builds it or the product team proposes it. It also defines when a pattern graduates from local to system: only patterns that appear in 3+ products and prove stable over 2+ quarters enter the composite or domain-pattern tier.

Contribution Workflow

How teams propose, build, and ship components back to the system

The contribution model is how a product team becomes a co-owner rather than a passive consumer. It takes 6–10 weeks from proposal to ship, and every stage is gated to prevent low-quality contributions that would create maintenance debt for the DST.

Decision Matrix

Who owns what: explicit responsibility for every decision type

The decision matrix is the governance model made operational. It removes ambiguity: every decision type has a clear owner, and every owner is accountable. The matrix is published, discussed, and updated annually with the SWG.

Contribution Framework

How product teams give back without breaking the system

By month 12, the DST had a backlog problem: product teams were building patterns the system should own, but there was no path for those patterns to come back in. The contribution framework solved that — without turning the DST into a gatekeeping bottleneck.

Contribution pipeline · 5 stages

1 · Proposal (GitHub issue, standard template)

Product team fills a 6-field issue: what the pattern does, where it's used today, which other products might use it, proposed API, accessibility considerations, who will build it. No proposal, no contribution.

SLA · 5d review

2 · DST triage (Accept / Defer / Reject + reason)

DST reviews within 5 business days. Accept means it goes on the system roadmap. Defer means the team can build locally and we'll revisit in 2 cycles. Reject means it's out of scope — with a written reason. Silence is not an option.

Async · written

3 · Co-build (contributing team + DST engineer, paired)

The product team builds the component. A DST engineer is assigned as reviewer — not to do the work, but to catch API decisions that would create maintenance debt. Pair sessions are scheduled, not ad-hoc.

1–3 sprints

4 · Ship checklist review (DST gate)

Before merging, the DST runs the standard 8-point ship checklist. Not a subjective design review — an objective verification against the agreed bar. If it passes, it ships. If it doesn't, the contributing team gets specific, written feedback with a re-review date.

5d max review

5 · Publish + credit (changelog + team recognition)

The component ships with the contributing team named in the changelog. Their design lead presents it at the next SWG sync. This is not ceremony — recognition is adoption incentive. Teams contribute more when they're publicly credited for the work.

Ships in next tag
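The first two stages of the pipeline are the most rule-bound, so they are the easiest to encode. The sketch below models the six proposal fields and the three triage outcomes. The property names and the `missingFields` and `triageMessage` helpers are hypothetical; the actual GitHub issue template is not reproduced here.

```typescript
// The six proposal fields named in stage 1; property names are illustrative.
interface Proposal {
  whatItDoes: string;
  whereUsedToday: string;
  otherProductsThatMightUseIt: string;
  proposedApi: string;
  accessibilityConsiderations: string;
  whoWillBuildIt: string;
}

// Returns the names of any fields left blank. An empty result means
// the proposal is complete enough to enter triage.
function missingFields(p: Proposal): string[] {
  const keys = Object.keys(p) as Array<keyof Proposal>;
  return keys.filter((k) => p[k].trim() === "");
}

// Stage 2 outcomes. A rejection carries its written reason in the type,
// so "silence" cannot be represented.
type Triage =
  | { decision: "accept" }
  | { decision: "defer" }
  | { decision: "reject"; reason: string };

function triageMessage(t: Triage): string {
  switch (t.decision) {
    case "accept":
      return "Accepted: added to the system roadmap";
    case "defer":
      return "Deferred: build locally, revisit in 2 cycles";
    case "reject":
      if (t.reason.trim() === "") {
        throw new Error("A rejection needs a written reason");
      }
      return "Rejected as out of scope: " + t.reason;
  }
}
```

Putting the reason on the `reject` variant, rather than making it an optional field on every outcome, is one way to let the type system enforce the "written reason" rule.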

Adoption Metrics

Adoption is the only metric that matters

A design system with low adoption is a style guide. Adoption was tracked quarterly — not as a vanity metric, but as a leading indicator of how well the system was serving the teams using it.

Component adoption · % of production UI using system components

| Quarter | Adoption | Milestone |
|---|---|---|
| Q1 · M3 | 12% | Foundation shipped |
| Q2 · M6 | 31% | 2 teams onboard |
| Q3 · M9 | 54% | SWG launched |
| Q4 · M12 | 72% | Contributions open |
| Q6 · M18 | 87% | Token sync auto |
| Q8 · M24 | 94% | Current |

Per-product adoption · month 24

| Product | Adoption | Note |
|---|---|---|
| Product A · Core web app | 98% | First adopter · 2 contributors |
| Product B · Mobile web | 96% | Token override for brand colour |
| Product C · Internal tool | 95% | Data table contributed back |
| Product D · Dashboard | 91% | Complex data viz local (under review) |
| Product E · Onboarding | 89% | Late adopter · joined M14 |
| Product F · New (M20) | 78% | Started on system from day 1 |

| Metric | Now | Change |
|---|---|---|
| Design-to-handoff time | 1.5d | Was 4.8d · 3.2x faster |
| UI bug tickets/month | 27 | Was 84 · 68% reduction |
| Hours/sprint saved (per team) | 8.4h | Across 6 teams = 50h/sprint recovered |
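The headline figures follow directly from the baseline audit numbers. As a quick check of the arithmetic, using only values stated in this case study (variable names are illustrative):

```typescript
// Baseline figures from the audit and month-24 figures from the metrics above.
const baselineMetrics = { handoffDays: 4.8, bugTicketsPerMonth: 84 };
const month24Metrics = { handoffDays: 1.5, bugTicketsPerMonth: 27 };
const hoursSavedPerTeam = 8.4;
const productTeams = 6;

// Speed-up is the ratio of old to new handoff time: 4.8 / 1.5.
const handoffSpeedup = baselineMetrics.handoffDays / month24Metrics.handoffDays;

// Reduction is the drop relative to the baseline: (84 - 27) / 84.
const bugReductionPct =
  ((baselineMetrics.bugTicketsPerMonth - month24Metrics.bugTicketsPerMonth) /
    baselineMetrics.bugTicketsPerMonth) *
  100;

// Total recovered time per sprint across all teams: 8.4 * 6.
const hoursRecoveredPerSprint = hoursSavedPerTeam * productTeams;

console.log(handoffSpeedup.toFixed(1) + "x faster");
console.log(Math.round(bugReductionPct) + "% fewer UI bug tickets");
console.log(Math.round(hoursRecoveredPerSprint) + "h per sprint recovered");
```

The unrounded values are roughly 67.9% and 50.4 hours; the figures quoted above are these rounded to the nearest whole number.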

Versioning Policy

Semver is a social contract, not just a number

The versioning policy is where governance becomes operational. Every team building on the system needs to know: if I upgrade, what will break? If I don't, how long until I'm unsupported? These aren't technical questions — they're trust questions.

Patch · v1.0.x

Bug fixes, no API change

Visual bug fix, typo in docs, accessibility fix that doesn't change the API. Safe to auto-upgrade. No changelog entry required beyond a one-liner. Teams never need to pin against patch releases.

Upgrade expectation: automatic via Renovate bot

Minor · v1.x.0

New features, backward-compatible

New component, new prop on existing component, new token. Nothing removed. Nothing renamed. Teams can upgrade on their own sprint cycle — there is no emergency. Changelog entry required with usage examples for any new component.

Upgrade expectation: within 2 sprints of release

Major · vX.0.0

Breaking changes — rare and deliberate

Prop renamed or removed, token renamed, component removed, API restructured. Teams get a minimum of 60 days' notice. A migration guide ships with the release: a written step-by-step, not a link to GitHub. The DST provides a migration sprint for teams that ask for it.

Upgrade expectation: within 1 quarter, DST-supported
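The three release types map change kinds to version bumps, which makes the policy straightforward to encode. The sketch below is one possible encoding. The `Change` labels are shorthand for the examples listed under each release type, and `requiredBump` and `nextVersion` are hypothetical helpers, not part of any release tooling named in this case study.

```typescript
type Bump = "patch" | "minor" | "major";

// Change kinds taken from the examples under each release type.
type Change =
  | "bug-fix" | "docs-fix" | "a11y-fix-no-api-change"
  | "new-component" | "new-prop" | "new-token"
  | "prop-renamed" | "prop-removed" | "token-renamed"
  | "component-removed" | "api-restructured";

const BUMP_FOR: Record<Change, Bump> = {
  "bug-fix": "patch",
  "docs-fix": "patch",
  "a11y-fix-no-api-change": "patch",
  "new-component": "minor",
  "new-prop": "minor",
  "new-token": "minor",
  "prop-renamed": "major",
  "prop-removed": "major",
  "token-renamed": "major",
  "component-removed": "major",
  "api-restructured": "major",
};

const RANK: Record<Bump, number> = { patch: 0, minor: 1, major: 2 };

// A release takes the highest bump required by any change it contains.
function requiredBump(changes: Change[]): Bump {
  let bump: Bump = "patch";
  for (const change of changes) {
    const candidate = BUMP_FOR[change];
    if (RANK[candidate] > RANK[bump]) bump = candidate;
  }
  return bump;
}

// Computes the next MAJOR.MINOR.PATCH string for a given bump.
function nextVersion(current: string, bump: Bump): string {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(current);
  if (m === null) throw new Error("Expected MAJOR.MINOR.PATCH, got " + current);
  const major = Number(m[1]);
  const minor = Number(m[2]);
  const patch = Number(m[3]);
  if (bump === "major") return `${major + 1}.0.0`;
  if (bump === "minor") return `${major}.${minor + 1}.0`;
  return `${major}.${minor}.${patch + 1}`;
}
```

A release that adds a token and renames another is a major release under this mapping, because one breaking change is enough to trigger the 60-day notice period.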

Major releases since v1.0

2 (v2.0, v3.0)

Teams that needed DST migration support

3 of 6 (for v2.0)

Longest version behind any team

1 minor (current policy: max 2)

Breaking changes reversed due to team feedback

1 (SWG objection honoured)

Documentation Strategy

Documentation is the product's UX

The system has two audiences with opposite needs. Designers need to know when to use a component. Engineers need to know how to implement it. A single document serves neither well, so we built two documentation surfaces with one source of truth underneath.

Figma · Designer-facing

Decisions, patterns, and "when to use"

Component canvas with all states visible at a glance

Usage guidance embedded as Figma annotations

Do / Don't examples for every primitive

Spacing and layout grids baked into frames

Variable collections named to match code tokens

Storybook · Engineer-facing

Props, states, code, and copy-paste examples

Every component story with all prop permutations

Copy-ready code snippets for common use cases

Accessibility annotations: ARIA roles, keyboard map

Visual regression screenshots auto-updated in CI

Token reference panel: which semantic token drives what

The principle behind both

Documentation that requires someone to ask a question has already failed. If a designer has to ping the DST to know whether to use a Card or a Panel, the documentation is incomplete. Every doc page ends with "still unsure? here's who to ask and where to file a gap." The bar is zero unforced clarification questions.

Key Tradeoffs

The decisions that were actually hard

Central authority vs. federated ownership

Hybrid model

Full central authority means teams lose autonomy and the DST becomes a bottleneck. Full federation means 9 libraries re-emerge. The DST owns the foundation; product teams own local extensions. The contribution framework is the bridge between the two.

Risk: local extensions diverge and never make it back. Mitigated by logging all local patterns in a shared "pattern debt" register reviewed quarterly.

Enforce adoption vs. let teams opt in

Incentive-first

Mandating adoption without reducing friction just generates resentment and workarounds. We chose to earn adoption by making the system faster to use than building locally — and only after that was true did adoption reporting go to VP level. The carrot, not the stick, got us to 94%.

Risk: one team refused to adopt on principle. Resolved: their VP made the call after seeing the time data, and the design team was onboarded within 6 weeks.

Build components or tokens first?

Tokens first

Shipping components first is tempting because it's visible. Tokens first is unglamorous but right: every component built before the token architecture is a component that will need to be rebuilt. We spent 6 weeks on tokens before a single component shipped. Teams complained; they stopped complaining in month 8.

Risk: "nothing to show" for the first 6 weeks. Managed by weekly stakeholder updates with written rationale, not just status.

Migrate existing UI or only govern new work?

Migrate on touch

A full migration sprint would have taken 4+ months and disrupted every team. Governing only new work meant legacy debt would persist indefinitely. "Migrate on touch" — every screen a team edits uses system components — hit 87% coverage in 18 months without a single dedicated migration sprint.

Risk: rarely-touched screens stay on legacy forever. These are tracked in a "tech debt heatmap"; they are the 6% of screens that make up the remaining adoption gap.

What I'd Do Next

Where this goes from here

Near term · 0–3 months

Close the 6% adoption gap

The remaining 6% is concentrated in 3 legacy screens that are rarely touched and expensive to migrate. Run a dedicated "legacy lift" sprint — offer DST pairing to the 2 product teams responsible. This gets us to 99%+, which matters because the last 1% is where brand inconsistency most often shows up to external users.

Near term · 0–3 months

Automated token drift detection

Teams that override tokens locally are supposed to log it. Some don't. A CI check that diffs production CSS against the token spec would catch silent drift — flagging when `color: #4d7cff` appears hardcoded instead of via `var(--color-action)`. This is a 2-sprint engineering investment that makes governance self-enforcing.
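A first version of the drift check could be as simple as scanning CSS for hex values that already have a semantic token. The sketch below is a simplified stand-in for the CI check described above, not a full diff against the token spec. The lookup table uses four values from the core colour set, and `findColourDrift` is a hypothetical name.

```typescript
// Raw values that already have a semantic token (subset of the core set).
const TOKEN_BY_VALUE: Record<string, string> = {
  "#4d7cff": "--color-action",
  "#3fe08a": "--color-success",
  "#ff8a3d": "--color-warning",
  "#f24f4f": "--color-error",
};

interface Drift {
  line: number;       // 1-based line number in the scanned CSS
  value: string;      // the hardcoded value as written
  suggestion: string; // the var() reference that should replace it
}

function findColourDrift(css: string): Drift[] {
  const findings: Drift[] = [];
  css.split("\n").forEach((text, index) => {
    // Token definitions (lines declaring a custom property) are allowed
    // to contain raw values, so skip them.
    if (/^\s*--[\w-]+\s*:/.test(text)) return;
    const hexes = text.match(/#[0-9a-fA-F]{6}\b/g) ?? [];
    for (const hex of hexes) {
      const token = TOKEN_BY_VALUE[hex.toLowerCase()];
      if (token !== undefined) {
        findings.push({ line: index + 1, value: hex, suggestion: `var(${token})` });
      }
    }
  });
  return findings;
}
```

In CI, a non-empty result would fail the build and print each finding with its suggested replacement. Shorthand hex values, `rgb()` notation, and overrides that were properly logged would each need handling beyond this sketch.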

Medium term · 3–9 months

Design system health dashboard

Adoption metrics are currently pulled manually every quarter. A live dashboard showing per-product adoption %, open contribution proposals, SWG action items, and token drift incidents would make governance state visible to everyone — reducing the DST's reporting burden and letting teams self-monitor.

Medium term · 3–9 months

Multi-brand token layer

Two new products in the roadmap serve a different brand. The three-tier token architecture was built for this — but it hasn't been stress-tested against a full brand swap. A multi-brand token layer, where Tier 1 primitives stay constant and Tier 2 semantic tokens are swapped by brand, is the next architectural challenge the governance model will need to solve.
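One way to picture the brand swap: the primitive table stays shared, and each brand supplies its own mapping from semantic names to primitives. The sketch below assumes this shape. The brand names and the teal primitives are invented for illustration, since the second brand's palette is not described here.

```typescript
type Brand = "core" | "partner"; // hypothetical brand names

// Tier 1: shared primitives. The teal values are made up for this example.
const primitiveTokens: Record<string, string> = {
  "--blue-500": "#4d7cff",
  "--blue-600": "#3a63e0",
  "--teal-500": "#1fb6a6",
  "--teal-600": "#17968a",
};

// Tier 2: one semantic mapping per brand, with the same keys in each.
const semanticByBrand: Record<Brand, Record<string, string>> = {
  core: {
    "--color-action": "--blue-500",
    "--color-action-hover": "--blue-600",
  },
  partner: {
    "--color-action": "--teal-500",
    "--color-action-hover": "--teal-600",
  },
};

// Resolves every semantic token for a brand to its raw value.
// Tier 3 component tokens keep pointing at the semantic names,
// so components need no brand-specific code.
function buildTheme(brand: Brand): Record<string, string> {
  const theme: Record<string, string> = {};
  const semantic = semanticByBrand[brand];
  for (const name of Object.keys(semantic)) {
    const primitiveName = semantic[name];
    const value =
      primitiveName === undefined ? undefined : primitiveTokens[primitiveName];
    if (value === undefined) {
      throw new Error(name + " points at an unknown primitive");
    }
    theme[name] = value;
  }
  return theme;
}
```

The stress test the paragraph above calls for is whether every semantic token has a sensible mapping in the second brand. A check that both brands define the same set of keys would be a natural first guard.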

Future-State AI Vision

AI in a design system isn't a feature. It's infrastructure.

Semantic search

"Find me a component for confirming destructive actions"

Today, finding the right component requires knowing its name. An LLM-powered search across Storybook and Figma documentation would let designers and engineers describe what they need in plain language and get the right component, the relevant token, and the usage guidance in one result.

Drift detection + auto-fix

CI flags a hardcoded colour. AI suggests the correct token and opens a PR.

Token drift detection is already on the roadmap. The AI layer would take it further: not just flag the issue but identify the correct semantic token from context, generate the replacement code, and open a draft PR for the engineer to review. Governance becomes self-healing.

Contribution co-pilot

The GitHub proposal template becomes a guided conversation

The current 6-field contribution proposal is still a blank form. An AI co-pilot would walk the contributing team through the proposal, check for existing components that might already solve the need, flag accessibility considerations relevant to the pattern type, and generate a first-draft API spec for DST review.

Usage pattern analysis

The system learns which components are misused most

Analytics on component usage in production (via custom telemetry, with the React DevTools profiler for local spot checks) would reveal which props are never used, which components are combined in ways the documentation didn't anticipate, and which components correlate with high rates of accessibility issues. That data shapes the next version of every component and every piece of documentation.

Documentation generation

First-draft docs from the component code itself

The most common documentation gap is components that ship without usage examples. An LLM trained on the system's existing documentation style could generate a first-draft "when to use / when not to use" section from the component's props, variant names, and code comments — reducing the DST's documentation debt without reducing quality.

The design principle for AI in design systems

AI in a design system should reduce the cost of compliance, not replace the judgment that decides what to comply with. Token decisions, component API decisions, and governance decisions require human designers and engineers who understand the products the system serves. AI accelerates the mechanical work around those decisions — search, drift detection, documentation, proposal review — so the humans doing the actual design work spend more time on the decisions that only they can make.

A design system is only as strong as its governance
