Planet For Application Life Development Presents
MY IT World

Explore and uptodate your technology skills...

Database Normalization

Normalization is the process of organizing data in a database. This includes creating tables and establishing relationships between those tables according to rules designed both to protect the data and to make the database more flexible by eliminating redundancy and inconsistent dependency.

Redundant data wastes disk space and creates maintenance problems. If data that exists in more than one place must be changed, the data must be changed in exactly the same way in all locations. A customer address change is much easier to implement if that data is stored only in the Customers table and nowhere else in the database.

Type of Normalization

First Normal Form (1NF)

  • The table cells must be of single value.
  • Eliminate repeating groups in individual tables.
  • Create a separate table for each set of related data.
  • Identify each set of related data with a primary key.

Definition: An entity is in the first normal form if it contains no repeating groups. In relational terms, a table is in the first normal form if it contains no repeating columns. Repeating columns make your data less flexible, waste disk space, and make it more difficult to search for data.

Second Normal Form (2NF)

  • Remove Partial Dependencies.
  • Functional Dependency: The value of one attribute in a table is determined entirely by the value of another.
  • Partial Dependency: A type of functional dependency where an attribute is functionally dependent on only part of the primary key (primary key must be a composite key).

Definition: A relation is in 2NF if it is in 1NF and every non-key attribute is fully dependent on each candidate key of the relation.

Third Normal Form (3NF)

  • Remove transitive dependencies.
  • Transitive Dependency A type of functional dependency where an attribute is functionally dependent on an attribute other than the primary key. Thus its value is only indirectly determined by the primary key.
  • Create a separate table containing the attribute and the fields that are functionally dependent on it. Tables created at this step will usually contain descriptions of either resources or agents. Keep a copy of the key attribute in the original file.

Definition:A relation is in third normal form, if it is in 2NF and every non-key attribute of the relation is non-transitively dependent on each candidate key of the relation.

Boyce-Codd Normal Form (BCNF)

  • When a relation has more than one candidate key, anomalies may result even though the relation is in 3NF.
  • 3NF does not deal satisfactorily with the case of a relation with overlapping candidate keys
  • i.e. composite candidate keys with at least one attribute in common.
  • BCNF is based on the concept of a determinant.
  • A determinant is any attribute (simple or composite) on which some other attribute is fully functionally dependent.
  • A relation is in BCNF is, and only if, every determinant is a candidate key.

Definition: A relation is in Boyce-Codd Normal Form (BCNF) if every determinant is a candidate key. (See the links in the box at right for definitions of determinant and candidate key.)

The difference between 3NF and BCNF is that for a functional dependency A  B, 3NF allows this dependency in a relation if B is a primary-key attribute and A is not a candidate key,
 
Whereas BCNF insists that for this dependency to remain in a relation, A must be a candidate key.

Fourth Normal Form (4th NF)

  • Has no multiple sets of multi-valued dependencies. In other words, 4NF states that no entity can have more than a single one-to-many relationship within an entity if the one-to-many attributes are independent of each other.
  • Fourth Normal Form applies to situations involving many-to-many relationships.

Definition: A table is in fourth normal form (4NF) if and only if it is in BCNF and contains no more than one multi-valued dependency.

Fifth Normal Form (5th NF)

  • A relation that has a join dependency cannot be decomposed by a projection into other relations without spurious results
  • A relation is in 5NF when its information content cannot be reconstructed from several smaller relations i.e. from relations having fewer attributes than the original relation.

Definition: A table is in fifth normal form (5NF) or Project-Join Normal Form (PJNF) if it is in 4NF and it cannot have a lossless decomposition into any number of smaller tables.

Fifth normal form, also known as join-projection normal form (JPNF), states that no non-trivial join dependencies exist. 5NF states that any fact should be able to be reconstructed without any anomalous results in any case, regardless of the number of tables being joined. A 5NF table should have only candidate keys and it's primary key should consist of only a single column.