Database Design & Normalisation
Good schema design prevents bugs that are nearly impossible to fix later. A well-designed schema has no redundancy, enforces invariants at the database level, and makes queries natural to write.
Entity-Relationship (ER) Diagrams
Before writing SQL, model your domain with an ER diagram. Key elements:
- Entity: a thing with independent existence — User, Order, Product
- Attribute: a property of an entity — User.name, Order.amount
- Relationship: how entities relate — User places Orders
Cardinality
| Relationship | Meaning | Example |
|---|---|---|
| One-to-One (1:1) | Each row maps to at most one in the other table | User ↔ Profile |
| One-to-Many (1:N) | One row in A relates to many rows in B | User → Orders |
| Many-to-Many (M:N) | Many in A relate to many in B | Students ↔ Courses |
Converting ER to Tables
One-to-Many: put the foreign key on the "many" side.
Many-to-Many: create a junction table (associative entity).
One-to-One: foreign key on either side, with a UNIQUE constraint.
Functional Dependencies
A functional dependency A → B means: knowing A uniquely determines B. This is the basis of normalisation.
First Normal Form (1NF)
Rule: every column holds an atomic value — no lists, arrays, or nested structures.
Second Normal Form (2NF)
Rule: 1NF + no partial dependencies — every non-key attribute depends on the whole primary key.
Only relevant when the primary key is composite.
Third Normal Form (3NF)
Rule: 2NF + no transitive dependencies — non-key attributes depend only on the primary key, not on other non-key attributes.
Boyce-Codd Normal Form (BCNF)
Stricter than 3NF: for every functional dependency A → B, A must be a superkey (uniquely identifies a row). Handles edge cases in tables with multiple candidate keys.
When to Denormalise
Normalisation removes redundancy but can require many joins for common queries, hurting read performance. Denormalisation deliberately re-introduces redundancy for speed:
Rules for denormalisation:
- Only after measuring that joins are actually slow
- Add write-side code to keep denormalised data in sync
- Document why the redundancy exists
Naming Conventions
Consistent naming makes a schema self-documenting:
| Convention | Example | |
|---|---|---|
| Tables | Plural, snake_case | user_orders |
| Primary key | id | users.id |
| Foreign key | {table}_id | orders.user_id |
| Timestamps | created_at, updated_at | — |
| Boolean | is_ or has_ prefix | is_active, has_verified_email |
Key Takeaways
- Model your domain in ER diagrams before writing SQL — entities, attributes, and relationships
- 1:N relationships → foreign key on the "many" side; M:N → junction table
- 1NF: atomic values. 2NF: no partial deps. 3NF: no transitive deps.
- Normalise first; denormalise only when you've measured a real performance problem
- Foreign key constraints enforce referential integrity at the DB level — don't skip them