Open-Source Internship opportunity by OpenGenus for programmers. Apply now.
In this article, we have explored Normalization in Database Management System (DBMS) in depth along with different normal forms like first normal form, second normal form, boyce codd normal form and others.
Table of contents:
- What is Normalization?
- Levels of Normalization
- First Normal form
- Second Normal form
- Third Normal form
- Boyce-Codd Normal form
- Fourth, Fifth Normal form
Let us get started with Normalization in DBMS.
What is Normalization?
Normalization is the process of organizing data in Database. It refers to modifying the relations to have :
- Minimum Redundancy : Redundancy is having multiple copies of data.
- Minimum update anomalies : Reduced issues in updation in database.
- No loss of information
- Maintaining functional dependency : Functional dependency is when an attribute is dependent on another attribute.
- Minimal NULL values
Levels of Normalization
Levels are based on amount of redundancy in database.
The different Levels of Normalization are:
- First Normal form
- Second Normal form
- Third Normal form
- Boyce-Codd Normal form
- Fourth, Fifth Normal form
First Normal form
If a table does not consists multivalued attributes, it is in first normal form.
For example, Books and Authors database:
ID | Title | AuName | AuPhone | Price |
---|---|---|---|---|
000001 | Balloon | Jones | 9888888, 999999, 888888,444444 | 1000 |
000002 | Uleysis | Smith, Jones | 8888777,5555333 | 2000 |
000003 | Basic | Joycee | 233322 | 3000 |
000004 | Biasis | Prim | 665555 | 5000 |
000005 | Contests of earth | Will,Prick | 113300,990004, 455544 | 7000 |
Here, 'AuName' and 'AuPhone' are multivalued. So it is not in first normal form yet.
Decomposition in first normal form :
ID | Title | AuPhone | Price |
---|---|---|---|
000001 | Balloon | 9888888 | 1000 |
000001 | Balloon | 999999 | 1000 |
000001 | Balloon | 888888 | 1000 |
000001 | Balloon | 444444 | 1000 |
000002 | Uleysis | 8888777 | 2000 |
000002 | Uleysis | 5555333 | 2000 |
000003 | Basic | 233322 | 3000 |
000004 | Biasis | 665555 | 5000 |
000005 | Contests of earth | 113300 | 7000 |
000005 | Contests of earth | 990004 | 7000 |
000005 | Contests of earth | 455544 | 7000 |
ID | Title | AuName | Price |
---|---|---|---|
000001 | Balloon | Jones | 1000 |
000002 | Uleysis | Smith | 2000 |
000002 | Uleysis | Jones | 2000 |
000003 | Basic | Joycee | 3000 |
000004 | Biasis | Prim | 5000 |
000005 | Contests of earth | Will | 7000 |
000005 | Contests of earth | Prick | 7000 |
Second Normal form
If every non-key attribute in table is only dependent on primary key, then the tables are in second normal form.
For example, Consider the same database :
Scheme -> {Title, PubId, AuId, Price, AuAddress}
- Key -> {Title, PubId, AuId}
- {Title, PubId, AuId} -> {Price}
- {AuId} -> {AuAdress}
It is not in second normal form as :
- AuAddress does not belong to a key
- AuAddress functionally depends on AuId which is a subset of a key.
Decomposition into second normal form :
Old scheme -> {Title, PubId, AuId, Price, AuAddress}
New scheme -> {Title, PubId, AuId, Price}
New scheme -> {AuId, AuAddress}
Third Normal form
When all non-key attributes do not have any inter-dependencies among non-key attributes only, it is called third normal form.
For example, Consider the same database :
Scheme -> {Title, PubId, PageCount, Price,}
- Key -> {Title, PubId}
- {Title, PubId} -> {PageCount}
- {PageCount} -> {Price}
- Both Price and PageCount depend on a key
- But Price transitively depends on Title and PubId, so it is not in third normal form.
Decomposition into third normal form :
Old scheme -> {Title, PubId, PageCount, Price}
New scheme -> {Title, PubId, PageCount, Price}
New scheme -> {Title, PubId, PageCount}
Boyce-Codd Normal form
If every Determinant is a candidate key, it is in BCNF.
In third normal form, for A->B functional dependency, it allows dependency in the relation if B is a Primary key and A is not a candidate key.
But in BCNF, for functional dependency, A must be a candidate key.
Every relation in BCNF is in third normal form but every relation in third normal form is not in BCNF.
For example, In a Consulting relation :
Scheme -> {Client, Problem, Consultant}
- Key1 -> {Client, Problem}
- Key2 -> {Client, Consultant}
- There is no non-key attribtute, so it is in third normal form.
- {Client, Problem} -> {Consultant}
- {Client, Consultant} -> {Problem}
- There is dependency between attributes belonging to keys which violates BCNF.
Conversion to BCNF :
Old scheme -> {Client, Problem, Consultant}
New scheme -> {Client, Consultant}
New scheme -> {Client, Problem}
Fourth, Fifth Normal form
When it does not contain any Multivalued dependencies.
Multivalued dependencies : For a relation among A,B,C for each value of A, a set of values of B and C is present. But B and C are independent of each other.
Fourth Normal form eliminates independent many-to-one relationships between columns.
For example,
Scheme -> {Manager, Child, Employee}
- Primary Key -> {Manager, Child, Employee}
- Each manager can have more than one child.
- Each manager can supervise more than on employee.
- Fourth normal form is violated.
Conversion to Fourth Normal form :
Old scheme -> {Manager, Child, Employee}
New scheme -> {Manager, Child}
New scheme -> {Manager, Employee}
Manager | Child | Employee |
---|---|---|
Jim | Beth | Alice |
Mary | Bob | Jones |
Mary | NULL | Adam |
Manager | Child |
---|---|
Jim | Beth |
Mary | Bob |
Manager | Employee |
---|---|
Jim | Alice |
Mary | Jane |
Mary | Adam |
In Fifth Normal form, there should be no Join dependencies. Rest all is same as fourth normal form.
With this article at OpenGenus, you must have the complete idea of Normalization in DBMS.