Data-First, Scenario-Driven Approach to System Design

Kirill Latish

Explore a data-first, scenario-driven approach to system design that emphasizes clarity and effective communication among teams. Learn how to separate data observation from manipulation.

Why the Observation–Manipulation Split Is a Powerful Starting Point

When designing software systems, many teams jump directly into technologies, frameworks, or microservices.

But most systems become significantly easier to reason about if you start with one simple observation:

Any informational system can be described as two subsets of processes:

  • Data observation (reading information)

  • Data manipulation (changing state)

In other words, every system is fundamentally a read–write machine.

This perspective mirrors a long-standing design principle, Command-Query Separation (CQS), introduced by Bertrand Meyer and sometimes summarized as “separating asking from telling.”

  • Queries ask questions and return information.

  • Commands change the system state.
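At the level of a single class, the split can be sketched as follows. This is a minimal illustration, assuming a hypothetical `CourseCatalog`; the class and method names are invented for the example and do not come from any framework.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CourseCatalog:
    """Hypothetical in-memory catalog; all names are illustrative."""

    _titles: dict[int, str] = field(default_factory=dict)

    # Query: answers a question and leaves state untouched.
    def title_of(self, course_id: int) -> str | None:
        return self._titles.get(course_id)

    # Query: returns a new sorted list, so callers cannot mutate internals.
    def all_titles(self) -> list[str]:
        return sorted(self._titles.values())

    # Command: changes state and returns nothing.
    def rename(self, course_id: int, new_title: str) -> None:
        if not new_title.strip():
            raise ValueError("title must not be empty")
        self._titles[course_id] = new_title


catalog = CourseCatalog()
catalog.rename(1, "Intro to SQL")
first = catalog.title_of(1)
second = catalog.title_of(1)  # asking twice does not change the answer
```

Because queries have no side effects, they can be called, cached, or repeated freely; only commands need validation and auditing.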

This same idea appears at the architectural level in CQRS (Command Query Responsibility Segregation), a pattern introduced by Greg Young and described by Martin Fowler. CQRS suggests separating models used for updating information from those used for reading it.

However, Fowler also warns that CQRS can introduce significant complexity and should not be used by default.

So the key insight is this:

The observation/manipulation split is extremely powerful as a design thinking tool, even if you never implement full CQRS infrastructure.

When used as a workflow for system design, it creates clarity across teams and helps prevent architectural confusion.



A Practical System Design Workflow

A practical workflow that works for many systems looks like this:

  1. Make the data explicit

  2. Make state changes explicit

  3. Validate architecture against real scenarios

These three steps form the foundation of a reliable system design process; interface design, covered later as Step 4, builds on them.


Step 1 — Data Model as the Shared Language

One of the most common sources of confusion in system design discussions is this:

Everyone talks about the same feature, but each person imagines a different data reality.

Domain modeling solves this problem by creating a shared vocabulary.

In Domain-Driven Design this is known as a Ubiquitous Language — a common language used by developers and domain experts that maps directly to the software model.

A useful way to begin is simple:

Start by listing the core entities and their fields.

This doesn't mean jumping straight into SQL implementation.

In professional data modeling, models usually evolve through three levels:

| Level | Purpose |
| --- | --- |
| Conceptual | What things exist in the domain |
| Logical | Relationships between entities |
| Physical | Database implementation details |

Starting with a rough tables-and-fields draft is often the fastest way for teams to converge.


Why “Tables First” Works (If Treated as a Draft)

Listing tables early forces concrete questions:

  • What are the real entities?

  • What identifies each object?

  • Which fields are optional vs required?

  • What changes over time?

  • What must never change?

This approach is consistent with the original motivation behind Entity-Relationship modeling, which introduced diagrams to represent real-world data semantics.

The goal is not to design the perfect schema immediately.

The goal is to create a shared mental model.


A Minimal Table Draft Template

When drafting entities, consistently capture a few key elements:

Identity

  • Primary key strategy (UUID, integer, natural key)

Lifecycle

  • created_at

  • updated_at

  • status or deleted_at

Ownership

  • user_id

  • org_id

  • workspace_id

Relationships

  • Foreign keys

  • Cardinality hints

Constraints

  • unique(email)

  • non-negative balances

  • required fields

Even basic database design relies on these concepts to maintain integrity between tables.
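The template can also be tried out in application code before any database exists. The sketch below assumes a hypothetical `AccountDraft` entity (the entity, its fields, and the cent-based balance are illustrative choices, not part of the schema discussed in this article) and shows one way to capture identity, lifecycle, ownership, and row-level constraints.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class AccountDraft:
    """Hypothetical entity used only to illustrate the template."""

    # Ownership: which organization the row belongs to.
    org_id: uuid.UUID
    # Required field. Uniqueness (unique(email)) is a table-level
    # constraint and cannot be checked on a single row.
    email: str
    # Constraint: balances must be non-negative.
    balance_cents: int = 0
    # Identity: UUID primary key strategy.
    id: uuid.UUID = field(default_factory=uuid.uuid4)
    # Lifecycle: timestamps plus a soft-delete marker.
    created_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)
    deleted_at: datetime | None = None

    def __post_init__(self) -> None:
        if not self.email:
            raise ValueError("email is required")
        if self.balance_cents < 0:
            raise ValueError("balance must be non-negative")


account = AccountDraft(org_id=uuid.uuid4(), email="ada@example.com")
```

Writing the draft this way tends to surface the questions early: which fields have defaults, which are required, and which rules need the database rather than the object to enforce them.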


Normalize for Correctness, Then Optimize Reads

Normalization is not academic theory — it is a practical tool that:

  • prevents redundant data

  • avoids update anomalies

  • preserves data integrity

A pragmatic strategy used in many real systems is:

1. Normalize writes for correctness
2. Optimize reads later with projections, caches, or indexes

This sequencing naturally fits the observation/manipulation model.

You maintain correctness on the write path, and optimize performance on the read path when necessary.
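A minimal sketch of this sequencing, using Python's built-in `sqlite3` module with an in-memory database. The table and view names (`course_roster`, `idx_enrollments_course`) are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Write side: normalized, each fact stored once.
CREATE TABLE users (id INTEGER PRIMARY KEY, display_name TEXT NOT NULL);
CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE enrollments (
    user_id INTEGER NOT NULL REFERENCES users(id),
    course_id INTEGER NOT NULL REFERENCES courses(id),
    PRIMARY KEY (user_id, course_id)
);

-- Read side, added later: an index and a projection for roster pages.
CREATE INDEX idx_enrollments_course ON enrollments(course_id);
CREATE VIEW course_roster AS
SELECT c.title AS course_title, u.display_name AS learner
FROM enrollments e
JOIN users u ON u.id = e.user_id
JOIN courses c ON c.id = e.course_id;
""")

conn.execute("INSERT INTO users VALUES (1, 'Ada')")
conn.execute("INSERT INTO users VALUES (2, 'Linus')")
conn.execute("INSERT INTO courses VALUES (10, 'Intro to SQL')")
conn.executemany("INSERT INTO enrollments VALUES (?, ?)", [(1, 10), (2, 10)])

# Renaming a user touches exactly one row; the projection reflects it.
conn.execute("UPDATE users SET display_name = 'Ada L.' WHERE id = 1")
roster = conn.execute(
    "SELECT course_title, learner FROM course_roster ORDER BY learner"
).fetchall()
```

The write path never stores the display name twice, so there is no update anomaly to guard against; the view and index exist purely to serve reads and can be replaced by a cache or a materialized table if measurements call for it.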


Example Data Model

Below is a simplified first-draft schema for a learning platform similar to SkillHub.

```text
users
- id (pk)
- email (unique)
- display_name
- password_hash
- created_at
- updated_at

skills
- id (pk)
- slug (unique)
- name
- created_at

user_skills
- user_id (fk -> users.id)
- skill_id (fk -> skills.id)
- level
- evidence_url
(pk: user_id, skill_id)

courses
- id (pk)
- owner_user_id (fk -> users.id)
- title
- description
- status (draft/published)
- created_at

lessons
- id (pk)
- course_id (fk)
- title
- content_type
- content_ref
- order_index

enrollments
- id (pk)
- user_id (fk)
- course_id (fk)
- progress_percent
- enrolled_at

assessments
- id (pk)
- course_id (fk)
- type (quiz/project)
- config_json

submissions
- id (pk)
- assessment_id (fk)
- user_id (fk)
- grade
- feedback
```

Even this incomplete draft enables discussion about:

  • identity

  • relationships

  • lifecycle states

  • invariants

  • auditability

This is exactly what early system design should enable.
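To make the discussion about invariants concrete, part of the draft can be turned into executable DDL. The sketch below covers only `users`, `skills`, and `user_skills`, uses SQLite through Python's `sqlite3` module, and assumes a 1 to 5 scale for `level`, which the draft above does not specify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE,
    display_name TEXT NOT NULL
);
CREATE TABLE skills (
    id INTEGER PRIMARY KEY,
    slug TEXT NOT NULL UNIQUE,
    name TEXT NOT NULL
);
CREATE TABLE user_skills (
    user_id INTEGER NOT NULL REFERENCES users(id),
    skill_id INTEGER NOT NULL REFERENCES skills(id),
    level INTEGER NOT NULL CHECK (level BETWEEN 1 AND 5),  -- assumed scale
    evidence_url TEXT,
    PRIMARY KEY (user_id, skill_id)
);
""")

conn.execute("INSERT INTO users VALUES (1, 'ada@example.com', 'Ada')")
conn.execute("INSERT INTO skills VALUES (1, 'sql', 'SQL')")
conn.execute("INSERT INTO skills VALUES (2, 'python', 'Python')")
conn.execute("INSERT INTO user_skills (user_id, skill_id, level) VALUES (1, 1, 3)")


def rejected(sql: str) -> bool:
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


duplicate_email = rejected("INSERT INTO users VALUES (2, 'ada@example.com', 'Ada 2')")
unknown_skill = rejected(
    "INSERT INTO user_skills (user_id, skill_id, level) VALUES (1, 99, 3)"
)
bad_level = rejected(
    "INSERT INTO user_skills (user_id, skill_id, level) VALUES (1, 2, 9)"
)
```

All three statements are refused by the database itself, which is the point of writing constraints down early: the rules hold regardless of which service performs the write.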


Step 2 — Processes: CRUD as Scaffolding

After defining the data model, the next step is designing processes.

A practical first step is implementing CRUD APIs for the entities.

CRUD endpoints provide:

  • a concrete system surface

  • a basis for authentication and validation

  • early UI prototypes

  • integration testing hooks

But it's important to understand:

CRUD is scaffolding, not the finished architecture.

Many systems fail not because they lack CRUD operations, but because they fail to enforce business rules when actions interact.
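A small in-memory sketch shows both the value and the limit of CRUD scaffolding. The `Course` and `CourseRepository` names are hypothetical and stand in for whatever persistence layer a real system would use.

```python
from __future__ import annotations

import itertools
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Course:
    id: int
    title: str
    status: str = "draft"


class CourseRepository:
    """Hypothetical in-memory CRUD store for courses."""

    def __init__(self) -> None:
        self._rows: dict[int, Course] = {}
        self._ids = itertools.count(1)

    def create(self, title: str) -> Course:
        course = Course(id=next(self._ids), title=title)
        self._rows[course.id] = course
        return course

    def read(self, course_id: int) -> Course | None:
        return self._rows.get(course_id)

    def update(self, course_id: int, **changes) -> Course:
        updated = replace(self._rows[course_id], **changes)
        self._rows[course_id] = updated
        return updated

    def delete(self, course_id: int) -> None:
        self._rows.pop(course_id, None)


repo = CourseRepository()
course = repo.create("Intro to SQL")
# Plain CRUD happily publishes a course that has no lessons at all.
published = repo.update(course.id, status="published")
```

The repository gives a working surface for tests and prototypes, yet the last line shows the gap: a generic `update` knows nothing about business rules, so those rules need a dedicated home.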


Avoid the “Anemic Domain Model”

Martin Fowler describes a common anti-pattern called the Anemic Domain Model.

This occurs when domain objects become simple data containers and all business logic lives in services.

Instead, state changes should enforce rules.

Examples:

  • You cannot enroll in an unpublished course.

  • You cannot publish a course without lessons.

  • You cannot grade a nonexistent submission.

These rules belong to the manipulation layer of the system.

Commands should enforce invariants.
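Two of the rules above can be sketched as commands on the entity itself. The `Course` class and `RuleViolation` exception are hypothetical names chosen for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class RuleViolation(Exception):
    """Raised when a command would break a business rule."""


@dataclass
class Course:
    title: str
    status: str = "draft"
    lessons: list[str] = field(default_factory=list)
    enrolled_user_ids: set[int] = field(default_factory=set)

    def add_lesson(self, lesson_title: str) -> None:
        self.lessons.append(lesson_title)

    def publish(self) -> None:
        # Invariant: a course cannot be published without lessons.
        if not self.lessons:
            raise RuleViolation("cannot publish a course without lessons")
        self.status = "published"

    def enroll(self, user_id: int) -> None:
        # Invariant: enrollment requires a published course.
        if self.status != "published":
            raise RuleViolation("cannot enroll in an unpublished course")
        self.enrolled_user_ids.add(user_id)


course = Course("Intro to SQL")
try:
    course.publish()
except RuleViolation as exc:
    first_attempt = str(exc)

course.add_lesson("SELECT basics")
course.publish()
course.enroll(42)
```

Because the checks live next to the state they protect, every caller (HTTP handler, batch job, admin script) gets the same behavior without re-implementing the rule.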


Mapping CRUD to HTTP Semantics

If your system exposes an HTTP API, aligning CRUD with HTTP semantics improves clarity.

| Method | Purpose |
| --- | --- |
| GET | observation (safe) |
| POST | create |
| PUT | replace (idempotent) |
| PATCH | partial update |
| DELETE | remove |

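Using consistent semantics makes failure behavior predictable and reduces accidental complexity: a client can safely retry a GET, PUT, or DELETE after a timeout because those methods are idempotent, while a retried POST may create a duplicate.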
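The difference between POST and PUT under retries can be shown without any web framework. The handlers below are plain functions over a dictionary; `post_course` and `put_course` are hypothetical stand-ins for real endpoint handlers.

```python
import itertools

store: dict[str, dict] = {}
_ids = itertools.count(1)


def post_course(body: dict) -> str:
    """POST /courses: every call creates a new resource."""
    course_id = str(next(_ids))
    store[course_id] = dict(body)
    return course_id


def put_course(course_id: str, body: dict) -> None:
    """PUT /courses/{id}: replaces the resource; repeating it is harmless."""
    store[course_id] = dict(body)


# A client that retries POST after a timeout ends up with two courses.
post_course({"title": "Intro to SQL"})
post_course({"title": "Intro to SQL"})
count_after_posts = len(store)

# Retrying PUT leaves the store in the same state as a single call.
put_course("1", {"title": "Intro to SQL, 2nd edition"})
put_course("1", {"title": "Intro to SQL, 2nd edition"})
count_after_puts = len(store)
```

This is why create operations that must survive retries usually need extra care, such as a client-supplied identifier, while replace operations can simply be sent again.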


Treat APIs as Contracts

Once CRUD endpoints exist, treat the API as a contract, not an implementation detail.

Using an API specification like OpenAPI helps align teams on:

  • request and response formats

  • error shapes

  • pagination

  • filtering

  • versioning
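As an illustration, a contract for a single endpoint might look like the fragment below, written as a Python dictionary that serializes to an OpenAPI 3.0 document. The `/enrollments` path, the field names, and the 409 response are assumptions made for this example, not an existing API.

```python
import json

enroll_contract = {
    "openapi": "3.0.3",
    "info": {"title": "Learning platform API (hypothetical)", "version": "0.1.0"},
    "paths": {
        "/enrollments": {
            "post": {
                "summary": "Enroll a user in a course",
                "requestBody": {
                    "required": True,
                    "content": {
                        "application/json": {
                            "schema": {
                                "type": "object",
                                "required": ["user_id", "course_id"],
                                "properties": {
                                    "user_id": {"type": "integer"},
                                    "course_id": {"type": "integer"},
                                },
                            }
                        }
                    },
                },
                "responses": {
                    "201": {"description": "Enrollment created"},
                    "409": {"description": "Course is not published"},
                },
            }
        }
    },
}

# Serialize so the contract can be shared with other teams and tools.
contract_json = json.dumps(enroll_contract, indent=2)
```

Even a fragment this small settles questions that otherwise surface late: which fields are required, and what a caller sees when a business rule rejects the request.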

A good workflow becomes:

```text
Data model
CRUD endpoints
OpenAPI contract
Tests
UI prototypes
```

Step 3 — Scenario-Driven Architecture

Once CRUD scaffolding exists, architecture becomes real when you introduce scenarios.

Scenario-based design comes from architectural methods like ATAM (Architecture Tradeoff Analysis Method).

Instead of vague requirements, scenarios define a concrete stimulus and a measurable response.


Types of Architectural Scenarios

Useful scenario categories include:

Use case scenarios

Typical user interactions.

Growth scenarios

Future changes or scale.

Exploratory scenarios

Failure conditions and stress situations.

Many architectures look correct until they meet stress conditions.

Good architects think about these conditions early.


A Simple Scenario Template

A practical scenario format:

```text
Scenario name
Source
Environment
Stimulus
Artifact
Response
Response measure
```

Example:

```text
Scenario: Enrollment peak

Environment: peak traffic
Stimulus: 5000 concurrent enroll requests
Artifact: enroll API
Response: successful enrollment
Response measure: p95 latency < 300ms
```

This forces teams to define measurable success.
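A scenario with a numeric response measure can be checked mechanically. The sketch below models the enrollment peak scenario as a small record and evaluates it against latency samples using a nearest-rank 95th percentile; the `Scenario` class and the sample values are invented for illustration and are not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    environment: str
    stimulus: str
    artifact: str
    response: str
    p95_budget_ms: float


def p95(samples_ms: list[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list of samples."""
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def meets_budget(scenario: Scenario, samples_ms: list[float]) -> bool:
    return p95(samples_ms) < scenario.p95_budget_ms


enrollment_peak = Scenario(
    name="Enrollment peak",
    environment="peak traffic",
    stimulus="5000 concurrent enroll requests",
    artifact="enroll API",
    response="successful enrollment",
    p95_budget_ms=300,
)

# Made-up latencies for illustration: one slow outlier does not break p95.
samples = [120.0] * 18 + [280.0, 950.0]
passed = meets_budget(enrollment_peak, samples)
```

Keeping scenarios in a form like this makes it straightforward to run them against load-test output and to notice when a change pushes the system past its budget.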


Example Scenario Set

Enrollment under load

  • 5000 concurrent enrollments

  • p95 < 300ms

  • error rate < 0.5%

Course publishing rule

  • publish course with no lessons

  • system rejects request

  • no partial writes

Search performance

  • user searches skills

  • results paginated

  • p95 < 200ms

Database replica failure

  • read replica unavailable

  • fallback to primary

  • degraded but functional behavior

These scenarios drive discussions about:

  • caching

  • indexing

  • retries

  • queues

  • consistency models
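The database replica failure scenario above translates into a small routing decision. The sketch below assumes hypothetical reader callables and a `ReplicaUnavailable` exception; a real system would add timeouts and limits on how much read traffic the primary may absorb.

```python
from __future__ import annotations

from typing import Callable


class ReplicaUnavailable(Exception):
    """Raised by the hypothetical replica client when it cannot be reached."""


Reader = Callable[[str], list]


def read_with_fallback(
    query: str, replica: Reader, primary: Reader
) -> tuple[list, str]:
    """Try the read replica first; fall back to the primary if it is down."""
    try:
        return replica(query), "replica"
    except ReplicaUnavailable:
        # Degraded but functional: the primary serves the read instead.
        return primary(query), "primary"


def broken_replica(query: str) -> list:
    raise ReplicaUnavailable("read replica unavailable")


def healthy_primary(query: str) -> list:
    return [("sql", "SQL")]


rows, served_by = read_with_fallback(
    "SELECT slug, name FROM skills", broken_replica, healthy_primary
)
```

Writing the fallback down exposes the trade-off the scenario is meant to surface: reads keep working, but the primary now carries extra load during the outage.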


Step 4 — Interface Design

A common UI heuristic is:

Minimize click distance to target actions.

This is directionally correct but incomplete.

The famous “3-click rule” is actually a myth.

Usability research shows that what matters is not click count but interaction cost.


Minimize Interaction Cost

Better metrics include:

  • Time to complete task

  • Error rate

  • Cognitive load

Three useful UX principles developers can apply:

Fitts’s Law

Large, close targets are faster to hit.

Important actions should be easy to reach.

Hick’s Law

More choices increase decision time.

Reduce or structure options.

Recognition over Recall

Users should see options rather than remember them.
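The first two laws have simple quantitative forms that are handy for comparing layout options. The sketch below uses the Shannon formulation of Fitts's Law, MT = a + b * log2(D / W + 1), and the Hick-Hyman form T = b * log2(n + 1). The constants `a` and `b` are placeholder values chosen for illustration; in practice they are fitted empirically per device and user population.

```python
import math


def fitts_movement_time(
    distance: float, width: float, a: float = 0.1, b: float = 0.15
) -> float:
    """Estimated time (seconds) to hit a target of given width at a distance."""
    return a + b * math.log2(distance / width + 1)


def hick_decision_time(n_choices: int, b: float = 0.2) -> float:
    """Estimated time (seconds) to choose among n equally likely options."""
    return b * math.log2(n_choices + 1)


# A large button near the cursor versus a small one far away (pixels).
near_large = fitts_movement_time(distance=100, width=50)
far_small = fitts_movement_time(distance=800, width=20)

# A short menu versus a long one.
short_menu = hick_decision_time(3)
long_menu = hick_decision_time(15)
```

The absolute numbers depend entirely on the assumed constants, so only the comparisons are meaningful: the near, large target and the short menu come out faster, which matches the guidance above.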


Practical UI Principles

Some actionable guidelines:

  • Put primary actions where attention already is.

  • Make dangerous actions deliberate.

  • Reduce choice overload with progressive disclosure.

  • Show system state clearly before asking users to act.

Observation flows should prioritize speed.

Manipulation flows should prioritize safety and validation.


A Practical System Design Starter Pack

A lightweight system design workflow can produce four artifacts that evolve together.


1. Data

  • conceptual entities

  • first-pass tables

  • constraints and invariants


2. Processes

  • CRUD endpoints

  • domain commands

  • API contract (OpenAPI)


3. Scenarios

  • use case scenarios

  • growth scenarios

  • exploratory (failure and stress) scenarios

  • measurable response criteria


4. Interfaces

  • user journeys

  • interaction cost metrics

  • usability principles


Communicating Architecture Clearly

Finally, system architecture benefits from consistent diagrams.

One of the most practical approaches is the C4 Model, which organizes diagrams into four levels:

  1. Context

  2. Containers

  3. Components

  4. Code

This allows teams to communicate architecture consistently across roles.


Most systems are fundamentally read–write machines.

When you:

  • make the data explicit

  • make state changes explicit

  • test design against real scenarios

architecture becomes dramatically easier to reason about.

Combining:

  • a clear data model

  • a practical API contract

  • a realistic scenario catalog

  • thoughtful interface design

creates a repeatable, scalable method for designing systems under real delivery pressure.

And most importantly — it keeps architecture grounded in how systems actually behave, not how we hope they will.
