Database normalization
This article is an appendix for Eight common bad practices in Database Design, in which I’d like to talk about database normalization step in database design.
What is database normalization?
Database normalization, or simply normalization, is the process of organizing the columns (attributes) and tables (relations) of a relational database to reduce data redundancy and improve data integrity. Normalization is also the process of simplifying the design of a database so that it achieves the optimal structure composed of atomic elements. It was first proposed by Edgar F. Codd, as an integral part of a relational model. Normalization involves arranging attributes in relations based on dependencies between attributes, ensuring that the dependencies are properly enforced by database integrity constraints. Normalization is accomplished by applying some formal rules either by a process of synthesis or decomposition. Synthesis creates a normalized database design based on a known set of dependencies. Decomposition takes an existing (insufficiently normalized) database design and improves it based on the known set of dependencies.
(Source Wikipedia)
In my point of view, database normalization is the step you optimize your tables base on their usages in production.
Why does it become so important?
Let’s take a look at following table
CREATE TABLE `users` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`email` varchar(255) NOT NULL DEFAULT '',
`password` varchar(32) NOT NULL DEFAULT '',
`firstname` varchar(100) DEFAULT NULL,
`lastname` varchar(100) DEFAULT NULL,
`phone` varchar(20) DEFAULT NULL,
`gender` tinyint(1) DEFAULT NULL,
`language` varchar(2) DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `email` (`email`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Now, we will think how often we show user’s profile information, for instance, phone, and gender. It could be seen that rarely they will be shown. However, we need to verify each time he logs into our system, hence, id
, email
, password
usually use. Therefore, it requires to restructure your table into 2 smaller tables. This action is one of database normalization
methods.
There are numerous methods to archieve database normalization
:
UNF — Unnormalized form
1NF — First normal form
2NF — Second normal form
3NF — Third normal form
EKNF — Elementary key normal form
BCNF — Boyce–Codd normal form
4NF — Fourth normal form
ETNF — Essential tuple normal form
5NF — Fifth normal form
6NF — Sixth normal form
DKNF — Domain/key normal form
(Source Wikipedia)
In this post, I won’t explain in details about each method. Nevertheless, I will spend time to write about some methods which I like much in next post.
Originally published at dotronglong.com on October 13, 2017.