Second Normal Form
In simplest terms, for a database to be in Second Normal Form (2NF), the database must already be in 1NF (you must normalize in order), and every column in a table that is not a key has to relate only to the primary key. The most obvious indication that a database is not 2NF is if multiple records in a table might have the exact same value for a column. As an example, if you listed a producer along with each record album, this value could be repeated over the course of the album's table.
Looking at the accounting database (Figure 3.3), there are a number of problems. For starters, the client information will not necessarily be particular to any one invoice (a client could be billed several times). Second, the expense information is not tied to the invoices either.
To put this database into 2NF, I'll need to separate out these columns into their own tables, where each value will be represented only once. In fact, normalization could be summarized as the process of creating more and more tables until potential redundancies have been eliminated.
To make a database 2NF compliant:
- Identify any fields that do not relate directly to the primary
key.
As I stated above, all of the client information and expense information are not Second Normal Form invoice-particular.
- Create new tables accordingly (Figure 3.4).
The most logical modification for the existing structure is to make separate
Clients, Invoices, and Expenses tables. In my visual
representation of the database, I create a box for each table, with the table
name as a header and all of its columns (or attributes) underneath.
Figure 3.4 To normalize the database, I must move redundant informationsuch as the client and expense datato their own tables.
- Assign or create new primary keys (Figure
3.5).
Figure 3.5 Each table in the database should have its own primary key, whether it's a dummy field such as Client ID or a necessary one such as Invoice Number.
Using the techniques described earlier in the chapter, ensure that each new table has a primary key. Because both the Clients and Expenses tables do not have good unique identifiers, I'll create artificial ones: Client ID and Expense ID. Arguably, the Client Name field should be unique and therefore could be the primary key, but it's always best to use integers for this purpose.
- Repeat steps 13.
Since I've created new tables with new primary keys, I should double-check to see if there are any 2NF problems. In the example (Figure 3.5), there is one glaring issuethe Expense Category field may apply to multiple expenses. Therefore, I'll make a new Expense Categories table (Figure 3.6).
Figure 3.6 The Expense Category field, which was part of Expenses, should be its own table as well.
- Create the requisite foreign keys indicating the relationships
(Figure 3.7). The final step in achieving 2NF compliance is to
incorporate foreign keys and relationships to identify how all of the data and
tables are associated. Remember that a primary key in one table will most likely
be a foreign key in another. If you find that the primary key in one table is
not represented as a foreign key in another, you may have missed something (but
not necessarily).
Figure 3.7 For the new primary keys, I've added corresponding foreign keys and indicated the relationships (both one-to-many).