Database Normalization

Relational Database Management Systems - RDBMS

When designing a relational database, there is a need to "normalize" the database.

There are different degrees of normalization, but in general, relational databases should be normalized to the "third normal form".

Simply put, this means that the attributes (i.e. fields) in each table should "depend on the key, the whole key and nothing but the key".

An example of a database table that is NOT normalized (i.e. de-normalized) is provided below.

In this example, the database designer has made the assumption that there will never be a need to store more than two order items on any one order:


Example of a De-Normalized Database Table

By moving repeating groups of attributes to a separate database table, the database design becomes more flexible. A single order can now support any number of order items for a single order; not just just two.

The primary key (PK) of the Order Item table is the "Order Nbr" (represented by a relationship) plus the "Order Item Nbr":


Example of a Database Tables in 2nd Normal Form


The "Order Item Description" field is dependent on the "Order Item Code"; not the unique identifier of the Order Item Table (i.e. "Order Nbr" + "Order Item Nbr").

By creating a "classification table", the database become even more flexible.  New codes can easily be added at any time.

The "Order Item Description" for a given code can also be easily altered should the need ever arise (e.g. "blue widget" => "light blue widget"):


Example of Database Tables in 3rd Normal Form


A RDBMS alone will not solve all data management issues. A good data analyst and/or database analyst is needed to design a flexible and efficient relational database.

Next Page