What is normalization in databases?
Why Interviewers Ask This
Interviewers use this question to quickly assess whether a candidate has the foundational knowledge required for MySQL / SQL development. It reveals whether you understand the building blocks that more complex concepts rely on.
Answer
Normalization is the process of organizing a relational database to reduce data redundancy and improve data integrity by decomposing tables into smaller, well-structured ones. It follows a series of Normal Forms (NF): 1NF (First Normal Form): each column contains atomic (indivisible) values; no repeating groups; each row is unique (has a primary key). No comma-separated lists or arrays in cells. 2NF: in 1NF AND every non-key attribute is fully functionally dependent on the entire primary key (no partial dependencies — relevant when the PK is composite). 3NF: in 2NF AND no transitive dependencies — non-key attributes must depend only on the primary key, not on other non-key attributes. Example violation: if a table stores employee_id, department_id, department_name — department_name depends on department_id (non-key), not directly on employee_id. Fix: move department_name to a separate departments table. BCNF (Boyce-Codd NF): stricter version of 3NF. Beyond 3NF: 4NF, 5NF exist but are rarely applied in practice. Normalization trades write efficiency for storage efficiency; denormalization trades redundancy for read performance.
Pro Tip
Demonstrate both theoretical understanding and practical experience. Say what it is, then give an example of how you actually used it in a MySQL / SQL codebase.