Normalization In DBMS
In this article we are going to discuss about
- Normalization
- De Normalization
- First Normal Form (1NF)
- Second Normal Form (2NF)
- Third Normal Form (3NF)
- Fourth Normal Form (4NF)
- Boyce‐Codd Normal Form (BCNF)
- Need for, Pros & Cons
What is Normalization ?
Normalization is a technique by which we disintegrate large Tables in to smaller ones so that we avoid data redundancy (repetitions), data anomalies during insert update and keeping the data dependencies sensible.
What is De Normalization ?
De Normalization is the technique by which we make a single Large Table with all required columns from smaller tables, usually done for historical reporting and data analysis in a Data Ware house project.
De Normalized Large Table with data repetitions
Whats is First Normal Form (1NF) ?
A row of data can not contain repeating group of data. i.e. each column must have a unique value. Each row of data must have unique identifier. i.e. Primary Key.
Consider the below Student table which is not in 1 NF.
What is Second Normal Form (2NF) ?
A table to be 2NF it should meet all the needs of 1NF and there must not be any partial dependency of any column on PK. For a table that has concatenated PK, each column in the table that is not part of the PK must depend upon the entire concatenated key for its existence. If any column depends only on one part of the concatenated key then the table fails 2NF.
Consider the Customer Table which is in 1NF. Where the concatenation customer_id and customer_name is PK. It is not in 2NF because there are partial dependencies of columns on PK. Customer name is only dependent on customer_id and Oder_name is dependent on order_id, and there is no link between Sales_detail and Customer_name.
After breaking the Customer Table into 3 tables as below it follows the conditions of 2NF.
What is Third Normal Form (3NF) ?
For 3NF every non‐prime attribute of table must be dependent on primary key. The transitive functional dependency should be removed from the table. The table must be in second normal form 2NF.
Consider the Student_Detail Table in which the Sudent_id is primary key, but street, city and state dependent on zip. The dependency between zip and other field is transitive dependency. Hence to apply 3NF we have to move the street city and state to new table with ZIP as primary key.
What is Boyce‐Codd Normal Form ( BCNF ) ?
This deals with certain type of anomalies which is not handled by 3NF. A 3NF table which doesn’t have multiple overlapping candidate keys is said to be in BCNF.
Most of the 3NF relations are also can be BCNF.
A 3NF is not in BCNF if
- There are more than one candidate keys in the relation.
- Two Candidate keys in the relation are composite keys (they are not single attributes). And they are overlapping.
- Every determinant in the table is not a candidate key.
Consider the examples as follows:
• Student can join multiple courses • One teacher teaches one course
• Each student can take course from one teacher only • The Functional Dependencies are:
• Student # and Course ID: Teacher ID
• But, Teacher ID also determines the course • Teacher ID: Course ID
• So one need to split the table into Student Teacher and Course Teacher tables
What is Fourth Normal Form (4NF) ?
A relation is said to be in Fourth Normal Form if a relation is in Boyce‐Codd Normal Form and contains no non‐trivial multi‐valued dependencies.
By eliminating non‐trivial multi‐valued dependencies BCNF can be converted to 4NF.
Consider the following example:
One course can be taken by many students One course can have many papers
The multi‐valued dependencies are as follows : Course: Student
Course: Papers
Difference between 1NF,2NF,3NF,4NF, BCNF ?
Normalization Forms and Normalizing Process:
Difference between Normalization and DE Normalization ?
Normalized
- Smaller Tables
- Current Transactional
- Quick insert update
- Reports needs multiple joins will take time
- DB size less.
- OLTP
De‐Normalized
- Large Table
- Historical Data
- Slow insert update
- Quick report with less joins.
- Very large Databases.
- OLAP
Topic : Normalization In DBMS : 1NF, 2Nf, 3NF, BCNF , DE-Normalization