Different Normal Forms / Normalization in DBMS

Internship at OpenGenus

Get FREE domain for 1st year and build your brand new site

In this article, we have explored Normalization in Database Management System (DBMS) in depth along with different normal forms like first normal form, second normal form, boyce codd normal form and others.

Table of contents:

  1. What is Normalization?
  2. Levels of Normalization
  3. First Normal form
  4. Second Normal form
  5. Third Normal form
  6. Boyce-Codd Normal form
  7. Fourth, Fifth Normal form

Let us get started with Normalization in DBMS.

What is Normalization?

Normalization is the process of organizing data in Database. It refers to modifying the relations to have :

  • Minimum Redundancy : Redundancy is having multiple copies of data.
  • Minimum update anomalies : Reduced issues in updation in database.
  • No loss of information
  • Maintaining functional dependency : Functional dependency is when an attribute is dependent on another attribute.
  • Minimal NULL values

Levels of Normalization

Levels are based on amount of redundancy in database.

The different Levels of Normalization are:

  • First Normal form
  • Second Normal form
  • Third Normal form
  • Boyce-Codd Normal form
  • Fourth, Fifth Normal form

First Normal form

If a table does not consists multivalued attributes, it is in first normal form.
For example, Books and Authors database:

ID Title AuName AuPhone Price
000001 Balloon Jones 9888888, 999999, 888888,444444 1000
000002 Uleysis Smith, Jones 8888777,5555333 2000
000003 Basic Joycee 233322 3000
000004 Biasis Prim 665555 5000
000005 Contests of earth Will,Prick 113300,990004, 455544 7000

Here, 'AuName' and 'AuPhone' are multivalued. So it is not in first normal form yet.

Decomposition in first normal form :

ID Title AuPhone Price
000001 Balloon 9888888 1000
000001 Balloon 999999 1000
000001 Balloon 888888 1000
000001 Balloon 444444 1000
000002 Uleysis 8888777 2000
000002 Uleysis 5555333 2000
000003 Basic 233322 3000
000004 Biasis 665555 5000
000005 Contests of earth 113300 7000
000005 Contests of earth 990004 7000
000005 Contests of earth 455544 7000
ID Title AuName Price
000001 Balloon Jones 1000
000002 Uleysis Smith 2000
000002 Uleysis Jones 2000
000003 Basic Joycee 3000
000004 Biasis Prim 5000
000005 Contests of earth Will 7000
000005 Contests of earth Prick 7000

Second Normal form

If every non-key attribute in table is only dependent on primary key, then the tables are in second normal form.

For example, Consider the same database :
Scheme -> {Title, PubId, AuId, Price, AuAddress}

  1. Key -> {Title, PubId, AuId}
  2. {Title, PubId, AuId} -> {Price}
  3. {AuId} -> {AuAdress}

It is not in second normal form as :

  • AuAddress does not belong to a key
  • AuAddress functionally depends on AuId which is a subset of a key.

Decomposition into second normal form :
Old scheme -> {Title, PubId, AuId, Price, AuAddress}
New scheme -> {Title, PubId, AuId, Price}
New scheme -> {AuId, AuAddress}

Third Normal form

When all non-key attributes do not have any inter-dependencies among non-key attributes only, it is called third normal form.

For example, Consider the same database :
Scheme -> {Title, PubId, PageCount, Price,}

  1. Key -> {Title, PubId}
  2. {Title, PubId} -> {PageCount}
  3. {PageCount} -> {Price}
  • Both Price and PageCount depend on a key
  • But Price transitively depends on Title and PubId, so it is not in third normal form.

Decomposition into third normal form :
Old scheme -> {Title, PubId, PageCount, Price}
New scheme -> {Title, PubId, PageCount, Price}
New scheme -> {Title, PubId, PageCount}

Boyce-Codd Normal form

If every Determinant is a candidate key, it is in BCNF.
In third normal form, for A->B functional dependency, it allows dependency in the relation if B is a Primary key and A is not a candidate key.

But in BCNF, for functional dependency, A must be a candidate key.

Every relation in BCNF is in third normal form but every relation in third normal form is not in BCNF.
For example, In a Consulting relation :
Scheme -> {Client, Problem, Consultant}

  1. Key1 -> {Client, Problem}
  2. Key2 -> {Client, Consultant}
  • There is no non-key attribtute, so it is in third normal form.
  1. {Client, Problem} -> {Consultant}
  2. {Client, Consultant} -> {Problem}
  • There is dependency between attributes belonging to keys which violates BCNF.

Conversion to BCNF :
Old scheme -> {Client, Problem, Consultant}
New scheme -> {Client, Consultant}
New scheme -> {Client, Problem}

Fourth, Fifth Normal form

When it does not contain any Multivalued dependencies.
Multivalued dependencies : For a relation among A,B,C for each value of A, a set of values of B and C is present. But B and C are independent of each other.

Fourth Normal form eliminates independent many-to-one relationships between columns.

For example,
Scheme -> {Manager, Child, Employee}

  1. Primary Key -> {Manager, Child, Employee}
  • Each manager can have more than one child.
  • Each manager can supervise more than on employee.
  • Fourth normal form is violated.

Conversion to Fourth Normal form :
Old scheme -> {Manager, Child, Employee}
New scheme -> {Manager, Child}
New scheme -> {Manager, Employee}

Manager Child Employee
Jim Beth Alice
Mary Bob Jones
Mary NULL Adam
Manager Child
Jim Beth
Mary Bob
Manager Employee
Jim Alice
Mary Jane
Mary Adam

In Fifth Normal form, there should be no Join dependencies. Rest all is same as fourth normal form.

With this article at OpenGenus, you must have the complete idea of Normalization in DBMS.