Creating New Tables
The previous section described how to create and drop databases. Now let's move down one level in the PostgreSQL storage hierarchy and talk about creating and dropping tables.
You've created some simple tables in the first two chapters; it's time to talk about some of the more advanced features of the CREATE TABLE command. Here is the command that you used to create the customers table:
CREATE TABLE customers ( customer_id INTEGER UNIQUE, customer_name VARCHAR(50), phone CHAR(8), birth_date DATE, balance DECIMAL(7,2) );
This command creates a permanent table named customers. A table name must meet the naming criteria described earlier in this chapter. When you create a table, PostgreSQL automatically creates a new data type4 with the same name as the table. This means that you can't create a table whose name is the same as an existing data type.
When you execute this command, the customers table is created in the database that you are connected to. If you are using PostgreSQL 7.3 or later, the customers table is created in the first schema in your search path. (If you are using a version older than 7.3, your copy of PostgreSQL does not support schemas). If you want the table to be created in some other schema, you can prefix the table name with the schema qualifier, for example:
CREATE TABLE joes_video.customers( ... );
The new table is owned by you. You can't give ownership to another user at the time you create the table, but you can change it later using the ALTER TABLE...OWNER TO command (described later).
Temporary Tables
I mentioned earlier that the customers table is a permanent table. You can also create temporary tables. A permanent table persists after you terminate your PostgreSQL session; a temporary table is automatically destroyed when your PostgreSQL session ends. Temporary tables are also local to your session, meaning that other PostgreSQL sessions can't see temporary tables that you create. Because temporary tables are local to each session, you don't have to worry about colliding with the name of a table created by another session.
If you create a temporary table with the same name as a permanent table, you are effectively hiding the permanent table. For example, let's create a temporary table that hides the permanent customers table:
CREATE TEMPORARY TABLE customers ( customer_id INTEGER UNIQUE, customer_name VARCHAR(50), phone CHAR(8), birth_date DATE, balance DECIMAL(7,2) );
Notice that the only difference between this command and the command that you used to create the permanent customers table is the TEMPORARY keyword5. Now you have two tables, each named customers. If you now SELECT from or INSERT into the customers table, you will be working with the temporary table. Prior to version 7.3, there was no way to get back to the permanent table except by dropping the temporary table:
movies=# SELECT * FROM customers; customer_id | customer_name | phone | birth_date | balance -------------+----------------------+----------+------------+--------- 1 | Jones, Henry | 555-1212 | 1970-10-10 | 0.00 2 | Rubin, William | 555-2211 | 1972-07-10 | 15.00 3 | Panky, Henry | 555-1221 | 1968-01-21 | 0.00 4 | Wonderland, Alice N. | 555-1122 | 1969-03-05 | 3.00 8 | Wink Wankel | 555-1000 | 1988-12-25 | 0.00 (5 rows) movies=# CREATE TEMPORARY TABLE customers movies-# ( movies(# customer_id INTEGER UNIQUE, movies(# customer_name VARCHAR(50), movies(# phone CHAR(8), movies(# birth_date DATE, movies(# balance DECIMAL(7,2) movies(# ); CREATE movies=# SELECT * FROM customers; customer_id | customer_name | phone | birth_date | balance -------------+----------------------+----------+------------+--------- (0 rows) movies=# DROP TABLE customers; DROP movies=# SELECT * FROM customers; customer_id | customer_name | phone | birth_date | balance -------------+----------------------+----------+------------+--------- 1 | Jones, Henry | 555-1212 | 1970-10-10 | 0.00 2 | Rubin, William | 555-2211 | 1972-07-10 | 15.00 3 | Panky, Henry | 555-1221 | 1968-01-21 | 0.00 4 | Wonderland, Alice N. | 555-1122 | 1969-03-05 | 3.00 8 | Wink Wankel | 555-1000 | 1988-12-25 | 0.00 (5 rows)
Starting with release 7.3, you can access the permanent table by including the name of the schema where the permanent table resides.
A temporary table is like a scratch pad. You can use a temporary table to accumulate intermediate results. Quite often, you will find that a complex query can be formulated more easily by first extracting the data that interests you into a temporary table. If you find that you are creating a given temporary table over and over again, you might want to convert that table into a view. See the section titled "Using Views" in Chapter 1, "Introduction to PostgreSQL and SQL," for more information about views.
Table Constraints
In Chapter 2 we explored the various constraints that you can apply to a column: NOT NULL, UNIQUE, PRIMARY KEY, REFERENCES, and CHECK(). You can also apply constraints to a table as a whole or to groups of columns within a table.
First, let's look at the CHECK() constraint. The syntax for a CHECK() constraint is
[CONSTRAINT constraint-name] CHECK( boolean-expression )
When you define a CHECK() constraint for a table, you are telling PostgreSQL that any insertions or updates made to the table must satisfy the boolean-expression given within the constraint. The difference between a column constraint and a table constraint is that a column constraint should refer only to the column to which it relates. A table constraint can refer to any column in the table.
For example, suppose that you had an orders table to track customer orders:
CREATE TABLE orders ( customer_number INTEGER, part_number CHAR(8), quantity_ordered INTEGER, price_per_part DECIMAL(7,2) );
You could create a table-related CHECK() constraint to ensure that the extended price (that is, quantity_ordered times price_per_part) of any given order is at least $5.00:
CREATE TABLE orders ( customer_number INTEGER, part_number CHAR(8), quantity_ordered INTEGER, price_per_part DECIMAL(7,2), CONSTRAINT verify_minimum_order CHECK (( price_per_part * quantity_ordered) >= 5.00::DECIMAL ) );
Each time a row is inserted into the orders table (or the quantity_ordered or price_per_part columns are updated), the verify_minimum_order constraint is evaluated. If the expression evaluates to FALSE, the modification is rejected. If the expression evaluates to TRUE or NULL, the modification is allowed.
You may have noticed that a table constraint looks very much like a column constraint. PostgreSQL can tell the difference between the two types by their placement within the CREATE TABLE statement. A column constraint is placed within a column definitionafter the column's data type and before the comma. A table constraint is listed outside of a column definition. The only tricky spot is a table constraint that follows the last column definition; you normally would not include a comma after the last column. If you want a constraint to be treated as a table constraint, be sure to include a comma following the last column definition. At the moment, PostgreSQL does not treat table constraints and column constraints differently, but in a future release it may.
Each of the table constraint varieties is related to a type of column constraint.
The UNIQUE table constraint is identical to the UNIQUE column constraint, except that you can specify that a group of columns must be unique. For example, here is the rentals table as currently defined:
CREATE TABLE rentals ( tape_id CHARACTER(8), customer_id INTEGER, rental_date DATE );
Let's modify this table to reflect the business rule that any given tape cannot be rented twice on the same day:
CREATE TABLE rentals ( tape_id CHARACTER(8), customer_id INTEGER, rental_date DATE, UNIQUE( rental_date, tape_id ) );
Now when you insert a row into the rentals table, PostgreSQL will ensure that there are no other rows with the same combination of rental_date and tape_id. Notice that I did not provide a constraint name in this example; constraint names are optional.
The PRIMARY KEY table constraint is identical to the PRIMARY KEY column constraint, except that you can specify that the key is composed of a group of columns rather than a single column.
The REFERENCES table constraint is similar to the REFERENCES column constraint. When you create a REFERENCES column constraint, you are telling PostgreSQL that a column value in one table refers to a row in another table. More specifically, a REFERENCES column constraint specifies a relationship between two columns. When you create a REFERENCES table constraint, you can relate a group of columns in one table to a group of columns in another table. Quite often, you will find that the unique identifier for a table (that is, the PRIMARY KEY) is composed of multiple columns. Let's say that the Virtual Video Store is having great success and you decide to open a second store. You might want to consolidate the data for each store into a single database. Start by creating a new table:
CREATE TABLE stores ( store_id INTEGER PRIMARY KEY, location VARCHAR );
Now, change the definition of the customers table to include a store_id for each customer:
CREATE TABLE customers ( store_id INTEGER REFERENCES stores( store_id ), customer_id INTEGER UNIQUE, customer_name VARCHAR(50), phone CHAR(8), birth_date DATE, balance DECIMAL(7,2), PRIMARY KEY( store_id, customer_id ) );
The store_id column in the customers table refers to the store_id column in the stores table. Because store_id is the primary key to the stores table, you could have written the REFERENCES constraint in either of two ways:
store_id INTEGER REFERENCES stores( store_id )
or
store_id INTEGER REFERENCES stores
Also, notice that the primary key for this table is composed of two columns: store_id and customer_id. I can have two customers with the same customer_id as long as they have different store_ids.
Now you have to change the rentals table as well:
CREATE TABLE rentals ( store_id INTEGER, tape_id CHARACTER(8), customer_id INTEGER, rental_date DATE, UNIQUE( rental_date, tape_id ) FOREIGN KEY( store_id, customer_id ) REFERENCES customers );
The customers table has a two-part primary key. Each row in the rentals table refers to a row in the customers table, so the FOREIGN KEY constraint must specify a two-part foreign key. Again, because foreign key refers to the primary key of the customers table, I can write this constraint in either of two forms:
FOREIGN KEY( store_id, customer_id ) REFERENCES customers( store_id, customer_id )
or
FOREIGN KEY( store_id, customer_id ) REFERENCES customers
Now that I have the referential integrity constraints defined, they will behave as described in the Column Constraints section of Chapter 2, "Working with Data in PostgreSQL." Remember, a table constraint functions the same as a column constraint, except that table constraints can refer to more than one column.
Dropping Tables
Dropping a table is much easier than creating a table. The syntax for the DROP TABLE command is
DROP TABLE table-name [, ...];
If you are using PostgreSQL 7.3 or later, you can qualify the table name with a schema. For example, here is the command to destroy the rentals table:
DROP TABLE rentals;
If the rentals table existed in some schema other than your current schema, you would qualify the table name:
DROP TABLE sheila.rentals;
You can destroy a table only if you are the table's owner or if you are a PostgreSQL superuser. Notice that I used the word destroy here rather than drop. It's important to realize that when you execute a DROP TABLE command, you are destroying all the data in that table.
PostgreSQL has a nice feature that I have not seen in other databases: You can roll back a DROP TABLE command. Try the following experiment. First, let's view the contents of the tapes table:
movies=# SELECT * FROM tapes; tape_id | title | dist_id ----------+---------------+--------- AB-12345 | The Godfather | 1 AB-67472 | The Godfather | 1 MC-68873 | Casablanca | 3 OW-41221 | Citizen Kane | 2 AH-54706 | Rear Window | 3 (5 rows)
Now, start a multistatement transaction and destroy the tapes table:
movies=# BEGIN WORK; BEGIN movies=# DROP TABLE tapes; NOTICE: DROP TABLE implicitly drops referential integrity trigger from table "rentals" NOTICE: DROP TABLE implicitly drops referential integrity trigger from table "distributors" NOTICE: DROP TABLE implicitly drops referential integrity trigger from table "distributors" DROP
If you try to SELECT from the tapes table, you'll find that it has been destroyed:
movies=# SELECT * FROM tapes; ERROR: Relation "tapes" does not exist
If you COMMIT this transaction, the table will permanently disappear; let's ROLLBACK the transaction instead:
movies=# ROLLBACK; ROLLBACK
The ROLLBACK threw out all changes made since the beginning of the transaction, including the DROP TABLE command. You should be able to SELECT from the tapes table again and see the same data that was there before:
movies=# SELECT * FROM tapes; tape_id | title | dist_id ----------+---------------+--------- AB-12345 | The Godfather | 1 AB-67472 | The Godfather | 1 MC-68873 | Casablanca | 3 OW-41221 | Citizen Kane | 2 AH-54706 | Rear Window | 3 (5 rows)
This is a very nice feature. You can roll back CREATE TABLE, DROP TABLE, CREATE VIEW, DROP VIEW, CREATE INDEX, DROP INDEX, and so on. I'll discuss transactions a bit later in this chapter. For now, I'd like to point out a few details that I glossed over in the previous example. You may have noticed that the DROP TABLE command produced a few NOTICES.
movies=# DROP TABLE tapes; NOTICE: DROP TABLE implicitly drops referential integrity trigger from table "rentals" NOTICE: DROP TABLE implicitly drops referential integrity trigger from table "distributors" NOTICE: DROP TABLE implicitly drops referential integrity trigger from table "distributors" DROP
When you drop a table, PostgreSQL will automatically DROP any indexes defined for that table as well as any triggers or rules. If other tables refer to the table that you dropped (by means of a REFERENCE constraint), PostgreSQL will automatically drop the constraints in the other tables. However, any views that refer to the dropped table will not be removeda view can refer to many tables and PostgreSQL would not know how to remove a single table from a multitable SELECT.
Inheritance
Another PostgreSQL feature that is uncommon in relational database systems is inheritance. Inheritance is one of the foundations of the object-oriented programming paradigm. Using inheritance, you can define a hierarchy of related data types (in PostgreSQL, you define a hierarchy of related tables). Each layer in the inheritance hierarchy represents a specialization of the layer above it6.
Let's look at an example. The Virtual Video database defines a table that stores information about the tapes that you have in stock:
movies=# \d tapes Column | Type | Modifiers ---------+-----------------------+----------- tape_id | character(8) | not null title | character varying(80) | dist_id | integer | Primary key: tapes_pkey
For each tape, you store the tape_id, title, and distributor id. Let's say that you decide to jump into the twenty-first century and rent DVDs as well as videotapes. You could store DVD records in the tapes table, but a tape and a DVD are not really the same thing. Let's create a new table that defines the characteristics common to both DVDs and videotapes:
CREATE TABLE video ( video_id CHARACTER(8) PRIMARY KEY, title VARCHAR(80), dist_id INTEGER );
Now, create a table to hold the DVDs. For each DVD you have in stock, you want to store everything in the video table plus a region_id and an array of audio_tracks. Here is the new table definition:
movies=# CREATE TABLE dvds movies-# ( movies(# region_id INTEGER, movies(# audio_tracks VARCHAR[] movies(# ) INHERITS ( video );
Notice the last line in this command: You are telling PostgreSQL that the dvds table inherits from the video table. Now let's INSERT a new DVD:
movies=# INSERT INTO dvds VALUES movies=# ( movies(# 'ASIN-750', -- video_id movies(# 'Star Wars - The Phantom Menace', -- title movies(# 3, -- dist_id movies(# 1, -- region_id movies(# '{English,Spanish}' -- audio_tracks movies(# );
Now, if you SELECT from the dvds table, you'll see the information that you just inserted:
title_id | title | dist_id | region | audio_tracks ---------+--------------------------------+---------+--------+----------------- ASIN-750 | Star Wars - The Phantom Menace | 3 | 1 | {English,Spanish}
At this point, you might be thinking that the INHERITS clause did nothing more than create a row template that PostgreSQL copied when you created the dvds table. That's not the case. When we say that dvds inherits from video, we are not simply saying that a DVD is like a video, we are saying that a DVD is a video. Let's SELECT from the video table now; remember, you haven't explicitly inserted any data into the video table, so you might expect the result set to be empty:
movies=# SELECT * FROM video; video_id | title | dist_id ---------+--------------------------------+-------- ASIN-750 | Star Wars - The Phantom Menace | 3 (1 row)
A DVD is a video. When you SELECT from the video table, you see only the columns that comprise a video. When you SELECT from the dvds table, you see all the columns that comprise a DVD. In this relationship, you say that the dvd table specializes7 the more general video table.
If you are using a version of PostgreSQL older than 7.2, you must code this query as SELECT * FROM video* to see the DVD entries. Starting with release 7.2, SELECT will include descendent tables and you have to say SELECT * FROM ONLY video to suppress descendents.
You now have a new table to track your DVD inventory; let's go back and redefine the tapes table to fit into the inheritance hierarchy. For each tape, we want to store a video_id, a title, and a distributor_id. This is where we started: the video table already stores all this information. You should still create a new table to track videotapesat some point in the future, you may find information that relates to a videotape, but not to a DVD:
movies=# CREATE TABLE tapes ( ) INHERITS( video ); CREATE
This CREATE TABLE command creates a new table identical in structure to the video table. Each row in the tapes table will contain a video_id, a title, and a dist_id. Insert a row into the tapes table:
movies=# INSERT INTO tapes VALUES movies-# ( movies(# 'ASIN-8YD', movies(# 'Flight To Mars(1951)', movies(# 3 movies(# ); INSERT
When you SELECT from the tapes table, you should see this new row:
movies=# SELECT * FROM tapes; title_id | title | dist_id ----------+----------------------+-------- ASIN-8YD | Flight To Mars(1951) | 3 (1 row)
And because a tape is a video, you would also expect to see this row in the video table:
movies=# SELECT * FROM video; video_id | title | dist_id ---------+--------------------------------+-------- ASIN-750 | Star Wars - The Phantom Menace | 3 ASIN-8YD | Flight To Mars(1951) | 3 (2 rows)
Now here's the interesting part. A DVD is a videoany row that you add to the dvds table shows up in the video table. A tape is a videoany row that you add to the tapes table shows up in the video table. But a DVD is not a tape (and a tape is not a DVD). Any row that you add to the dvds table will not show up in the tapes table (and vice versa).
If you want a list of all the tapes you have in stock, you can SELECT from the tapes table. If you want a list of all the DVDs in stock, SELECT from the dvds table. If you want a list of all videos in stock, SELECT from the videos table.
In this example, the inheritance hierarchy is only two levels deep. PostgreSQL imposes no limit to the number of levels that you can define in an inheritance hierarchy. You can also create a table that inherits from multiple tablesthe new table will have all the columns defined in the more general tables.
I should caution you about two problems with the current implementation of inheritance in PostgreSQL. First, indexes are not shared between parent and child tables. On one hand, that's good because it gives you good performance. On the other hand, that's bad because PostgreSQL uses an index to guarantee uniqueness. That means that you could have a videotape and a DVD with the same video_id. Of course, you can work around this problem by encoding the type of video in the video_id (for example, use a T for tapes and a D for DVDs). But PostgreSQL won't give you any help in fixing this problem. The other potential problem with inheritance is that triggers are not shared between parent and child tables. If you define a trigger for the topmost table in your inheritance hierarchy, you will have to remember to define the same trigger for each descendant.
We have redefined some of the example tables many times in the past few chapters. In a real-world environment, you probably won't want to throw out all your data each time you need to make a change to the definition of an existing table. Let's explore a better way to alter a table.
ALTER TABLE
Now that you have a video table, a dvds table, and a tapes table, let's add a new column to all three tables that you can use to record the rating of the video (PG, G, R, and so on).
You could add the rating column to the tapes table and to the dvds table, but you really want the rating column to be a part of every video. The ALTER TABLE ... ADD COLUMN command adds a new column for you, leaving all the original data in place:
movies=# ALTER TABLE video ADD COLUMN rating VARCHAR; ALTER
Now, if you look at the definition of the video table, you will see the new column:
movies=# \d video Table "video" Column | Type | Modifiers ----------+-----------------------+----------- title_id | character(8) | not null title | character varying(80) | dist_id | integer | rating | character varying | Primary key: video_pkey
After the ALTER TABLE command completes, each row in the video table has a new column; the value of every rating column will be NULL. Because you have changed the definition of a video, and a DVD is a video, you might expect that the dvds table will also contain a rating column:
movies=# \d dvds Table "dvds" Column | Type | Modifiers --------------+-----------------------+------------------- title_id | character(8) | not null title | character varying(80) | dist_id | integer | region | integer | audio_tracks | character varying[] | rating | character varying(8) |
Similarly, the tapes table will also inherit the new rating column:
movies=# \d dvds Table "tapes" Column | Type | Modifiers ----------+-----------------------+------------------- title_id | character(8) | not null title | character varying(80) | dist_id | integer | rating | character varying(8) |
The ALTER TABLE command is useful when you are in the development stages of a project. Using ALTER TABLE, you can add new columns to a table, define default values, rename columns (and tables), add and drop constraints, and transfer ownership. The capabilities of the ALTER TABLE command seem to grow with each new release, see the PostgreSQL Reference Manual for more details.