Skip to main content

Normalization of Relational Database and its types

A database stores data in an organized way so that it can be searched and retrieved later. It should contain one or more tables. A table is much like a spreadsheet, in that it's made up of rows and columns. All rows have the same columns, and each column contains the data itself. If it helps, think of your tables in the same way that you would a table in Excel.

Data can be inserted, retrieved, updated, and deleted from a table. The word, created, is generally used instead of inserted, so, collectively, these four functions are affectionately abbreviated as CRUD.



What is Relational Database?
A relational database is a type of database that organizes data into tables, and links them, based on defined relationships. These relationships enable you to retrieve and combine data from one or more tables with a single query. The relational database was invented by E. F. Codd at IBM in 1970.

What is Database Normalization?
Database normalization is the process of organizing the fields and tables of a relational database to minimize redundancy. Normalization usually involves dividing large tables into smaller (and less redundant) tables and defining relationships between them.

The main objective of Database Normalization is:
  • To eliminate reduntant(useless) data
  • To ensure data dependencies 


Edgar F. Codd, the inventor of the relational model, introduced the concept of normalization and what we now know as the First Normal Form (1NF) in 1970. Codd went on to define the Second Normal Form (2NF) and Third Normal Form (3NF) in 1971, and Codd and Raymond F. Boyce defined the Boyce-Codd Normal Form (BCNF) in 1974. Informally, a relational database table is often described as "normalized" if it is in the Third Normal Form. Most 3NF tables are free of insertion, update, and deletion anomalies.



First Normal Form (1NF):
Each field in a table contains different information. As per 1NF, no two rows of data must contain repeating group of information i.e each set of column must have a unique value, such that multiple columns cannot be used to fetch the same row.

For example:
In a school, school wants records of its all students and related subjects then table looks like below:

Student Id Student Name Address Subjects  Faculty Fee
1 Reshma Kathmandu Botany, Physics, Statistics Science 2000

In 1NF, each field should contain only one value. Like there shouldn't be more than one value like separated with commas. Rather that that, we must separate such data into multiple rows in a different table. So we separate subject from above table and create new table with Student Id as identifier of subjects.

Student Id Subjects
1 Botany
1 Physics
1 Statistics

Using 1NF, data redundancy increases but there each row will have a unique value.


Second Normal Form (2NF):
A table that is in first normal form (1NF) must meet additional criteria if it is to qualify for second normal form. Specifically: a table is in 2NF if and only if it is in 1NF and no non-prime attribute is dependent on any proper subset of any candidate key of the table. A non-prime attribute of a table is an attribute that is not a part of any candidate key of the table.

Simply, a table is in 2NF if and only if it is in 1NF and every non-prime attribute of the table is dependent on the whole of a candidate key.

Now continuing the above example:
The table following 2NF will be as below:

Table 1  (Student table)
Student Id Student Name Student Address
Faculty
Fee
1 Reshma Kathmandu Science 2000

Table 2  (Subject table)
Student Id Subjects
1 Botany
1 Physics
1 Statistics

Now, in table 1, Student Id is primary key and Student Name and Student Address should depend on its primary key.

Third Normal Form (3NF): 
A table which must be in 2NF and no duplicate information is permitted. It applies that every non-prime attribute of table must be dependent on primary key.

In the above example, fee depends in admission year more than student. So remove these values and create a new table like below:
Table 1  (Student table)
Student Id Student Name Student Address
1 Reshma Kathmandu

Table 2  (Subject table)
Student Id Subjects
1 Botany
1 Physics
1 Statistics

Table 3 (Faculty table)

Faculty
Fee
Science 2000

Tables should be normalized till 3NF, if not we will proceed further other levels of Normalization.

Boyce and Codd Normal Form (BCNF or 3.5NF):
Boyce and Codd Normal Form is a higher version of the Third Normal Form.  It is a slightly stronger version of the third normal form (3NF). BCNF was developed in 1974 by Raymond F. Boyce and Edgar F. Codd to address certain types of anomaly not dealt with by 3NF as originally defined.

A 3NF table which does not have multiple overlapping candidate keys is said to be in BCNF. 

Comments

Popular posts from this blog

MySQL Connection

Connection with MySQL Database Before accessing database, you must create a connection to the database Syntax: mysql_connect(servername,username,password); where, servername specifies the server to connect to. Default value is “localhost” username specifies the username to log in with. Default value is the name of the user that owns the server process. To connect offline we use username “root”. password specifies the password to log in with. Default is “” Code : Creating connection to the database and selecting the required database <?php $con = mysql_connect(“localhost”,”root”,”"); if (!$con) { die(‘Could not connect: ‘ . mysql_error()); } else{ mysql_select_db(“test”, $con) }; ?> Here, we have store connection in a variable called $con and trap error using die function. Closing connection The connection will be closed automatically when the script ends. To close the connection before, use the mysql_close() function: <?php $con = mysql_conne...

Join in MySQL

Join clause is used to combine rows from two or more tables depending upon common field between them. There are different kinds of Joins. The most common and useful Join is Inner Join. Inner Join: It will return all rows from multiple tables where the condition is met.

Div and span

Div: Div <div> tag is that it divides the HTML document into sections. Proper usage of the <div> tag is fundamental to good HTML coding, the <div> tag is one of the most powerful tools available to a web developer. Span: Span <span> is  in-line level elements, they only span across small amounts of content, typically words or phrases. For example: a <span> tag might be used to make a word red in color or to give it an underline.