Data-First, Scenario-Driven Approach to System Design

Explore a data-first, scenario-driven approach to system design that emphasizes clarity and effective communication among teams. Learn how to separate data observation from manipulation.
Why the Observation–Manipulation Split Is a Powerful Starting Point
When designing software systems, many teams jump directly into technologies, frameworks, or microservices.
But most systems become significantly easier to reason about if you start with one simple observation:
Any informational system can be described as two subsets of processes:
Data observation (reading information)
Data manipulation (changing state)
In other words, every system is fundamentally a read–write machine.
This perspective mirrors a long-standing design principle, Bertrand Meyer's Command-Query Separation (CQS), sometimes summarized as “separating asking from telling.”
Queries ask questions and return information.
Commands change the system state.
This same idea appears at the architectural level in CQRS (Command Query Responsibility Segregation), a pattern introduced by Greg Young and described by Martin Fowler. CQRS suggests separating the models used for updating information from those used for reading it.
However, Fowler also warns that CQRS can introduce significant complexity and should not be used by default.
So the key insight is this:
The observation/manipulation split is extremely powerful as a design thinking tool, even if you never implement full CQRS infrastructure.
When used as a workflow for system design, it creates clarity across teams and helps prevent architectural confusion.
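The split can be applied inside a single class, without any CQRS infrastructure. The following is a minimal sketch; `EnrollmentBook` and its method names are hypothetical and chosen only for illustration. Observation methods return information and leave state alone, while manipulation methods change state and return nothing.

```python
class EnrollmentBook:
    """Toy in-memory model; every name here is illustrative, not a real API."""

    def __init__(self) -> None:
        # course_id -> set of enrolled user_ids
        self._by_course: dict[str, set[str]] = {}

    # --- Observation: answers a question, never changes state ---
    def is_enrolled(self, user_id: str, course_id: str) -> bool:
        return user_id in self._by_course.get(course_id, set())

    def enrollment_count(self, course_id: str) -> int:
        return len(self._by_course.get(course_id, set()))

    # --- Manipulation: changes state, returns nothing ---
    def enroll(self, user_id: str, course_id: str) -> None:
        self._by_course.setdefault(course_id, set()).add(user_id)

    def withdraw(self, user_id: str, course_id: str) -> None:
        self._by_course.get(course_id, set()).discard(user_id)


book = EnrollmentBook()
book.enroll("u1", "c1")
print(book.is_enrolled("u1", "c1"), book.enrollment_count("c1"))
```

Because the queries have no side effects, they can be called any number of times, cached, or moved to a separate read model later without changing what the commands do.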
A Practical System Design Workflow
A practical workflow that works for many systems looks like this:
Make the data explicit
Make state changes explicit
Validate architecture against real scenarios
These three steps form the foundation of a reliable system design process. A fourth step, interface design, builds on them.
Step 1 — Data Model as the Shared Language
One of the most common sources of confusion in system design discussions is this:
Everyone talks about the same feature, but each person imagines a different data reality.
Domain modeling solves this problem by creating a shared vocabulary.
In Domain-Driven Design this is known as a Ubiquitous Language — a common language used by developers and domain experts that maps directly to the software model.
A useful way to begin is simple:
Start by listing the core entities and their fields.
This doesn't mean jumping straight into SQL implementation.
In professional data modeling, models usually evolve through three levels:
| Level | Purpose |
|---|---|
| Conceptual | What things exist in the domain |
| Logical | Relationships between entities |
| Physical | Database implementation details |
Starting with a rough tables-and-fields draft is often the fastest way for teams to converge.
Why “Tables First” Works (If Treated as a Draft)
Listing tables early forces concrete questions:
What are the real entities?
What identifies each object?
Which fields are optional vs required?
What changes over time?
What must never change?
This approach is consistent with the original motivation behind Entity-Relationship modeling, which introduced diagrams to represent real-world data semantics.
The goal is not to design the perfect schema immediately.
The goal is to create a shared mental model.
A Minimal Table Draft Template
When drafting entities, consistently capture a few key elements:
Identity: primary key strategy (UUID, integer, natural key)
Lifecycle: created_at, updated_at, status or deleted_at
Ownership: user_id, org_id, workspace_id
Relationships: foreign keys, cardinality hints
Constraints: unique(email), non-negative balances, required fields
Even basic database design relies on these concepts to maintain integrity between tables.
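To make the template concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `orgs` and `accounts` tables and their columns are hypothetical, invented only to show each element of the template (identity, lifecycle, ownership, relationships, constraints) in one place.

```python
import sqlite3

# Hypothetical draft tables, one column group per template element.
DDL = """
CREATE TABLE orgs (
    id INTEGER PRIMARY KEY
);
CREATE TABLE accounts (
    id         INTEGER PRIMARY KEY,                    -- identity
    org_id     INTEGER NOT NULL REFERENCES orgs(id),   -- ownership and relationship
    email      TEXT    NOT NULL UNIQUE,                -- constraint: unique(email)
    balance    INTEGER NOT NULL DEFAULT 0
               CHECK (balance >= 0),                   -- constraint: non-negative
    created_at TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,  -- lifecycle
    updated_at TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    deleted_at TEXT                                    -- NULL means not deleted
);
"""


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DDL)
    return conn


def is_rejected(conn: sqlite3.Connection, sql: str) -> bool:
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


conn = open_db()
conn.execute("INSERT INTO orgs (id) VALUES (1)")
conn.execute("INSERT INTO accounts (org_id, email) VALUES (1, 'a@example.com')")
print(is_rejected(conn, "INSERT INTO accounts (org_id, email) VALUES (1, 'a@example.com')"))
```

Even at draft stage, writing the constraints down this way turns the questions from the list above into something the team can test.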
Normalize for Correctness, Then Optimize Reads
Normalization is not academic theory — it is a practical tool that:
prevents redundant data
avoids update anomalies
preserves data integrity
A pragmatic strategy used in many real systems is:
1. Normalize writes for correctness
2. Optimize reads later with projections, caches, or indexes
This sequencing naturally fits the observation/manipulation model.
You maintain correctness on the write path, and optimize performance on the read path when necessary.
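One way to picture this sequencing is a normalized write model with a read-side projection layered on top. The sketch below uses `sqlite3` from the standard library; the `customers` and `orders` tables, the index, and the `customer_order_summary` view are all hypothetical names used for illustration.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Write path: each fact is stored exactly once.
        CREATE TABLE customers (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            total_cents INTEGER NOT NULL CHECK (total_cents >= 0)
        );
        -- Read path: an index and a view shaped for one screen.
        CREATE INDEX idx_orders_customer ON orders(customer_id);
        CREATE VIEW customer_order_summary AS
            SELECT c.id AS customer_id,
                   c.name AS name,
                   COUNT(o.id) AS order_count,
                   COALESCE(SUM(o.total_cents), 0) AS total_cents
            FROM customers c
            LEFT JOIN orders o ON o.customer_id = c.id
            GROUP BY c.id, c.name;
        """
    )
    return conn


def summary(conn: sqlite3.Connection, customer_id: int):
    """Read the projection: (name, order_count, total_cents) or None."""
    return conn.execute(
        "SELECT name, order_count, total_cents "
        "FROM customer_order_summary WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()


conn = open_store()
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (customer_id, total_cents) VALUES (1, 1200)")
print(summary(conn, 1))
```

Because the customer name lives in one row, renaming a customer is a single update and the projection reflects it immediately. If the view later becomes too slow, it can be replaced by a cache or a materialized table without touching the write path.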
Example Data Model
Below is a simplified first-draft schema for a learning platform similar to SkillHub.
users
- id (pk)
- email (unique)
- display_name
- password_hash
- created_at
- updated_at

skills
- id (pk)
- slug (unique)
- name
- created_at

user_skills
- user_id (fk -> users.id)
- skill_id (fk -> skills.id)
- level
- evidence_url
(pk: user_id, skill_id)

courses
- id (pk)
- owner_user_id (fk -> users.id)
- title
- description
- status (draft/published)
- created_at

lessons
- id (pk)
- course_id (fk -> courses.id)
- title
- content_type
- content_ref
- order_index

enrollments
- id (pk)
- user_id (fk -> users.id)
- course_id (fk -> courses.id)
- progress_percent
- enrolled_at

assessments
- id (pk)
- course_id (fk -> courses.id)
- type (quiz/project)
- config_json

submissions
- id (pk)
- assessment_id (fk -> assessments.id)
- user_id (fk -> users.id)
- grade
- feedback

Even this incomplete draft enables discussion about:
identity
relationships
lifecycle states
invariants
auditability
This is exactly what early system design should enable.
Step 2 — Processes: CRUD as Scaffolding
After defining the data model, the next step is designing processes.
A practical first step is implementing CRUD APIs for the entities.
CRUD endpoints provide:
a concrete system surface
a basis for authentication and validation
early UI prototypes
integration testing hooks
But it's important to understand:
CRUD is scaffolding, not the finished architecture.
Many systems fail not because they lack CRUD operations, but because they fail to enforce business rules when actions interact.
Avoid the “Anemic Domain Model”
Martin Fowler describes a common anti-pattern called the Anemic Domain Model.
This occurs when domain objects become simple data containers and all business logic lives in services.
Instead, state changes should enforce rules.
Examples:
You cannot enroll in an unpublished course.
You cannot publish a course without lessons.
You cannot grade a nonexistent submission.
These rules belong to the manipulation layer of the system.
Commands should enforce invariants.
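A minimal sketch of what this can look like in code follows. The `Course` class and `RuleViolation` exception are hypothetical; the `status` values mirror the draft schema above. The point is that the rules live inside the state-changing methods rather than in a separate service.

```python
from dataclasses import dataclass, field


class RuleViolation(Exception):
    """Raised when a command would break a business rule (hypothetical name)."""


@dataclass
class Course:
    title: str
    status: str = "draft"  # "draft" or "published", as in the schema draft
    lessons: list[str] = field(default_factory=list)
    enrolled_user_ids: set[int] = field(default_factory=set)

    def add_lesson(self, lesson_title: str) -> None:
        self.lessons.append(lesson_title)

    def publish(self) -> None:
        # Invariant: a course cannot be published without lessons.
        if not self.lessons:
            raise RuleViolation("cannot publish a course without lessons")
        self.status = "published"

    def enroll(self, user_id: int) -> None:
        # Invariant: enrollment requires a published course.
        if self.status != "published":
            raise RuleViolation("cannot enroll in an unpublished course")
        self.enrolled_user_ids.add(user_id)


course = Course("Intro to Data Modeling")
course.add_lesson("Entities and fields")
course.publish()
course.enroll(42)
print(course.status, sorted(course.enrolled_user_ids))
```

A rejected command raises before any field is changed, so the object never passes through an invalid intermediate state.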
Mapping CRUD to HTTP Semantics
If your system exposes an HTTP API, aligning CRUD with HTTP semantics improves clarity.
| Method | Purpose |
|---|---|
| GET | observation (safe, idempotent) |
| POST | create (not idempotent) |
| PUT | replace (idempotent) |
| PATCH | partial update (not guaranteed idempotent) |
| DELETE | remove (idempotent) |
Using consistent semantics makes failure behavior predictable and reduces accidental complexity.
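The difference between these semantics is easiest to see when a request is retried. The sketch below models a resource with an in-memory store instead of a real HTTP server; `CourseStore` and its methods are hypothetical stand-ins for the handlers behind each verb.

```python
import itertools
from typing import Optional


class CourseStore:
    """In-memory stand-in for a /courses resource (hypothetical, no real HTTP)."""

    def __init__(self) -> None:
        self._items: dict[int, dict] = {}
        self._next_id = itertools.count(1)

    def get(self, course_id: int) -> Optional[dict]:
        # GET: safe. Returns a copy so callers cannot mutate stored state.
        item = self._items.get(course_id)
        return dict(item) if item is not None else None

    def post(self, body: dict) -> int:
        # POST: creates a new resource on every call, so retries duplicate.
        new_id = next(self._next_id)
        self._items[new_id] = dict(body)
        return new_id

    def put(self, course_id: int, body: dict) -> None:
        # PUT: full replacement. Repeating it leaves the same final state.
        self._items[course_id] = dict(body)

    def patch(self, course_id: int, changes: dict) -> None:
        # PATCH: merges only the supplied fields into the existing resource.
        self._items[course_id].update(changes)

    def delete(self, course_id: int) -> None:
        # DELETE: removing an already-removed resource is not an error here.
        self._items.pop(course_id, None)


store = CourseStore()
first = store.post({"title": "Intro"})
second = store.post({"title": "Intro"})  # a retried POST creates a duplicate
store.put(first, {"title": "Intro v2"})
store.put(first, {"title": "Intro v2"})  # a retried PUT changes nothing
print(first != second, store.get(first))
```

This is why retry policies can usually be applied freely to GET, PUT, and DELETE, while POST typically needs extra care such as a client-supplied idempotency key.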
Treat APIs as Contracts
Once CRUD endpoints exist, treat the API as a contract, not an implementation detail.
Using an API specification like OpenAPI helps align teams on:
request and response formats
error shapes
pagination
filtering
versioning
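As an illustration, a contract for one endpoint can be quite small. The sketch below expresses an OpenAPI 3.0 document as a Python dictionary so it can be checked in code; the `/enrollments` path, the schema names, and the choice of `409` for an unpublished course are assumptions made for this example, not part of any existing API.

```python
import json

# Hypothetical contract fragment for a single endpoint.
ENROLLMENT_CONTRACT = {
    "openapi": "3.0.3",
    "info": {"title": "Learning platform API (sketch)", "version": "0.1.0"},
    "paths": {
        "/enrollments": {
            "post": {
                "summary": "Enroll a user in a course",
                "requestBody": {
                    "required": True,
                    "content": {
                        "application/json": {
                            "schema": {"$ref": "#/components/schemas/EnrollmentRequest"}
                        }
                    },
                },
                "responses": {
                    "201": {"description": "Enrollment created"},
                    "409": {
                        "description": "Course is not published",
                        "content": {
                            "application/json": {
                                "schema": {"$ref": "#/components/schemas/Error"}
                            }
                        },
                    },
                },
            }
        }
    },
    "components": {
        "schemas": {
            "EnrollmentRequest": {
                "type": "object",
                "required": ["user_id", "course_id"],
                "properties": {
                    "user_id": {"type": "integer"},
                    "course_id": {"type": "integer"},
                },
            },
            # One shared error shape for every endpoint.
            "Error": {
                "type": "object",
                "required": ["code", "message"],
                "properties": {
                    "code": {"type": "string"},
                    "message": {"type": "string"},
                },
            },
        }
    },
}


def unresolved_refs(doc: dict) -> set[str]:
    """Return every $ref that does not point at a schema defined in the document."""
    schemas = doc.get("components", {}).get("schemas", {})
    defined = {f"#/components/schemas/{name}" for name in schemas}
    found: set[str] = set()

    def walk(node) -> None:
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "$ref" and isinstance(value, str):
                    found.add(value)
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(doc)
    return found - defined


print(json.dumps(sorted(unresolved_refs(ENROLLMENT_CONTRACT))))
```

Keeping the error shape in one shared schema is a small design choice that pays off quickly: clients can handle every failure the same way.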
A good workflow becomes:
Data model
↓
CRUD endpoints
↓
OpenAPI contract
↓
Tests
↓
UI prototypes
Step 3 — Scenario-Driven Architecture
Once CRUD scaffolding exists, architecture becomes real when you introduce scenarios.
Scenario-based design comes from architectural methods like ATAM (Architecture Tradeoff Analysis Method).
Instead of vague requirements, scenarios define stimulus and response.
Types of Architectural Scenarios
Useful scenario categories include:
Use case scenarios: typical user interactions.
Growth scenarios: future changes or scale.
Exploratory scenarios: failure conditions and stress situations.
Many architectures appear correct until stress conditions appear.
Good architects think about these early.
A Simple Scenario Template
A practical scenario format:
Scenario name
Source
Environment
Stimulus
Artifact
Response
Response measure
Example:
Scenario: Enrollment peak
Environment: peak traffic
Stimulus: 5000 concurrent enroll requests
Artifact: enroll API
Response: successful enrollment
Measure: p95 latency < 300ms
This forces teams to define measurable success.
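A scenario written this way can be checked mechanically against load-test results. The following is a minimal sketch: the `Scenario` dataclass mirrors the template fields, the `source` value is an assumption (the example above does not state one), and `p95` uses the nearest-rank percentile definition, which is one of several common conventions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    source: str
    environment: str
    stimulus: str
    artifact: str
    response: str
    p95_limit_ms: float  # the response measure


def p95(samples_ms: list[float]) -> float:
    """95th percentile by the nearest-rank method."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    # rank = ceil(0.95 * n), computed with integers to avoid float rounding
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_measure(scenario: Scenario, samples_ms: list[float]) -> bool:
    return p95(samples_ms) < scenario.p95_limit_ms


enrollment_peak = Scenario(
    name="Enrollment peak",
    source="end users",  # assumed; not given in the example above
    environment="peak traffic",
    stimulus="5000 concurrent enroll requests",
    artifact="enroll API",
    response="successful enrollment",
    p95_limit_ms=300,
)

# Hypothetical measurements: 94 fast requests and 6 slow ones.
latencies = [100.0] * 94 + [400.0] * 6
print(meets_measure(enrollment_peak, latencies))
```

With six slow requests out of one hundred, the 95th-ranked sample is one of the slow ones, so the scenario fails even though the average latency looks healthy.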
Example Scenario Set
Enrollment under load
- 5000 concurrent enrollments
- p95 < 300ms
- error rate < 0.5%
Course publishing rule
- publish course with no lessons
- system rejects request
- no partial writes
Search performance
- user searches skills
- results paginated
- p95 < 200ms
Database replica failure
- read replica unavailable
- fallback to primary
- degraded but functional behavior
These scenarios drive discussions about:
caching
indexing
retries
queues
consistency models
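The replica failure scenario, for instance, translates into a small piece of read-path logic. This is a sketch under stated assumptions: `replica` and `primary` are hypothetical callables that stand in for database clients, and an unavailable replica is assumed to surface as `ConnectionError`.

```python
from typing import Callable

# A reader takes a key and returns a value; real clients would be richer.
Reader = Callable[[str], str]


def read_with_fallback(key: str, replica: Reader, primary: Reader) -> tuple[str, bool]:
    """Return (value, degraded). degraded is True when the primary served the read."""
    try:
        return replica(key), False
    except ConnectionError:
        # Replica unavailable: stay functional by reading from the primary.
        return primary(key), True


def healthy_replica(key: str) -> str:
    return f"replica:{key}"


def failing_replica(key: str) -> str:
    raise ConnectionError("read replica unavailable")


def primary_db(key: str) -> str:
    return f"primary:{key}"


print(read_with_fallback("course-1", healthy_replica, primary_db))
print(read_with_fallback("course-1", failing_replica, primary_db))
```

Returning the `degraded` flag makes the fallback observable, so the scenario's "degraded but functional" response can be asserted in a test and counted in monitoring.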
Step 4 — Interface Design
A common UI heuristic is:
Minimize click distance to target actions.
This is directionally correct but incomplete.
The famous “3-click rule” is actually a myth.
Usability research shows that what matters is not click count but interaction cost.
Minimize Interaction Cost
Better metrics include:
Time to complete task
Error rate
Cognitive load
Three useful UX principles developers can apply:
Fitts’s Law: large, close targets are faster to hit, so important actions should be easy to reach.
Hick’s Law: more choices increase decision time, so reduce or structure options.
Recognition over recall: users should see options rather than remember them.
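The first two principles have simple quantitative forms that are useful for comparing design alternatives. The sketch below uses the Shannon formulation of Fitts's Law and the common form of Hick's Law; the constants `a` and `b` are placeholder values chosen for illustration, since real values must be fitted from measurements of actual users and devices.

```python
import math


def fitts_time(distance: float, width: float, a: float = 0.1, b: float = 0.15) -> float:
    """Estimated pointing time: MT = a + b * log2(D / W + 1).

    distance and width share one unit (for example pixels).
    a and b are placeholder constants, not measured values.
    """
    return a + b * math.log2(distance / width + 1)


def hick_time(n_choices: int, b: float = 0.2) -> float:
    """Estimated decision time: T = b * log2(n + 1). b is a placeholder constant."""
    return b * math.log2(n_choices + 1)


# A small, distant button versus a large, nearby one.
print(fitts_time(distance=400, width=20) > fitts_time(distance=100, width=80))

# A flat menu of 15 options versus a first screen showing 3.
print(hick_time(15) > hick_time(3))
```

Both relationships are logarithmic, so the first few reductions in distance or choice count tend to matter most; the absolute numbers mean little without calibrated constants.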
Practical UI Principles
Some actionable guidelines:
Put primary actions where attention already is.
Make dangerous actions deliberate.
Reduce choice overload with progressive disclosure.
Show system state clearly before asking users to act.
Observation flows should prioritize speed.
Manipulation flows should prioritize safety and validation.
A Practical System Design Starter Pack
A lightweight system design workflow can produce four artifacts that evolve together.
1. Data
conceptual entities
first-pass tables
constraints and invariants
2. Processes
CRUD endpoints
domain commands
API contract (OpenAPI)
3. Scenarios
use case scenarios
growth scenarios
failure scenarios
measurable response criteria
4. Interfaces
user journeys
interaction cost metrics
usability principles
Communicating Architecture Clearly
Finally, system architecture benefits from consistent diagrams.
One of the most practical approaches is the C4 Model, which organizes diagrams into four levels:
Context
Containers
Components
Code
This allows teams to communicate architecture consistently across roles.
Most systems are fundamentally read–write machines.
When you:
make the data explicit
make state changes explicit
test design against real scenarios
architecture becomes dramatically easier to reason about.
Combining:
a clear data model
a practical API contract
a realistic scenario catalog
thoughtful interface design
creates a repeatable, scalable method for designing systems under real delivery pressure.
And most importantly — it keeps architecture grounded in how systems actually behave, not how we hope they will.
