- Naming Database Elements
- Choosing Your Column Types
- Choosing Other Column Properties
- Accessing MySQL
- Review and Pursue
Choosing Your Column Types
Once you have identified all of the tables and columns that the database will need, you should determine each column’s data type. When you’re creating a table, MySQL requires that you explicitly state what sort of information each column will contain. There are three primary types, which is true for almost every database application:
Text (aka strings)
Numbers
Dates and times
Within each of these, there are many variants—some of which are MySQL specific. Choosing your column types correctly not only dictates what information can be stored and how, but also affects the database’s overall performance. Table 4.2 lists most of the available types for MySQL, how much space they take up, and brief descriptions of each type. Note that some of these limits may change in different versions of MySQL, and the character set (to be discussed in Chapter 6, “Database Design”) may also impact the size of the text types.
Table 4.2 MySQL Data Types
Type |
Size |
Description |
CHAR[Length] |
Length bytes |
A fixed-length field from 0 to 255 characters long |
VARCHAR[Length] |
String length + 1 or 2 bytes |
A variable-length field from 0 to 65,535 characters long |
TINYTEXT |
String length + 1 bytes |
A string with a maximum length of 255 characters |
TEXT |
String length + 2 bytes |
A string with a maximum length of 65,535 characters |
MEDIUMTEXT |
String length + 3 bytes |
A string with a maximum length of 16,777,215 characters |
LONGTEXT |
String length + 4 bytes |
A string with a maximum length of 4,294,967,295 characters |
TINYINT[Length] |
1 byte |
Range of –128 to 127 or 0 to 255 unsigned |
SMALLINT[Length] |
2 bytes |
Range of –32,768 to 32,767 or 0 to 65,535 unsigned |
MEDIUMINT[Length] |
3 bytes |
Range of –8,388,608 to 8,388,607 or 0 to 16,777,215 unsigned |
INT[Length] |
4 bytes |
Range of –2,147,483,648 to 2,147,483,647 or 0 to 4,294,967,295 unsigned |
BIGINT[Length] |
8 bytes |
Range of –9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 or 0 to 18,446,744,073,709,551,615 unsigned |
FLOAT[Length, Decimals] |
4 bytes |
A small number with a floating decimal point |
DOUBLE[Length, Decimals] |
8 bytes |
A large number with a floating decimal point |
DECIMAL[Length, Decimals] |
Length + 1 or 2 bytes |
A DOUBLE stored as a string, allowing for a fixed decimal point |
DATE |
3 bytes |
In the format YYYY-MM-DD |
DATETIME |
8 bytes |
In the format YYYY-MM-DD HH:MM:SS |
TIMESTAMP |
4 bytes |
In the format YYYYMMDDHHMMSS; acceptable range starts in 1970 and ends in the year 2038 |
TIME |
3 bytes |
In the format of HH:MM:SS |
ENUM |
1 or 2 bytes |
Short for enumeration, which means that each column can have one of several possible values |
SET |
1, 2, 3, 4, or 8 bytes |
Like ENUM except that each column can have more than one of several possible values |
Many of the types can take an optional Length attribute, limiting their size. (The brackets, [ ], indicate an optional parameter to be put in parentheses.) For performance purposes, you should place some restrictions on how much data can be stored in any column. But understand that attempting to insert a string five characters long into a CHAR(2) column will result in truncation of the final three characters. Only the first two characters would be stored; the rest would be lost forever. This is true for any field in which the size is set (CHAR, VARCHAR,INT, etc.). Thus, your length should always correspond to the maximum possible value—as a number—or the longest possible string—as text—that might be stored.
The various date types have all sorts of unique behaviors, the most important of which you’ll learn about in this book. All the behaviors are documented in the MySQL manual. You’ll use the DATE and TIME fields primarily without modification, so you do not have to worry too much about their intricacies.
There are also two special types—ENUM and SET—that allow you to define a series of acceptable values for that column. An ENUM column can store only one value of a possible several thousand, whereas SET allows for several of up to 64 possible values. These are available in MySQL but aren’t present in every database application.
To select the column types:
Identify whether a column should be a text, number, or date/time type (Table 4.3).
Table 4.3 users Table
Column Name
Type
user_id
number
first_name
text
last_name
text
email
text
pass
text
registration_date
date/time
This is normally an easy and obvious step, but you want to be as specific as possible. For example, the date 2006-08-02 (MySQL format) could be stored as a string—August 2, 2006. But if you use the proper date format, you’ll have a more useful database (and, as you’ll see, there are functions that can turn 2006-08-02 into August 2, 2006).
Choose the most appropriate subtype for each column (Table 4.4).
Table 4.4 users Table
Column Name
Type
user_id
MEDIUMINT
first_name
VARCHAR
last_name
VARCHAR
email
VARCHAR
pass
CHAR
registration_date
DATETIME
For this example, user_id is set as a MEDIUMINT, allowing for up to nearly 17 million values (as an unsigned, or non-negative, number). registration_date will be a DATETIME. It can store both the date and the specific time a user registered. When deciding among the date types, consider whether you’ll want to access just the date, the time, or possibly both.
When choosing a subtype, err on the side of storing too much information.
The other fields will be mostly VARCHAR, since their lengths will differ from record to record. The only exception is the password column, which will be a fixed-length CHAR (you’ll see why when inserting records in the next chapter). See the sidebar “CHAR vs. VARCHAR” for more information on these two types.
Set the maximum length for text columns (Table 4.5).
Table 4.5 users Table
Column Name
Type
user_id
MEDIUMINT
first_name
VARCHAR(20)
last_name
VARCHAR(40)
email
VARCHAR(60)
pass
CHAR(128)
registration_date
DATETIME
The size of any field should be restricted to the smallest possible value, based on the largest possible input. For example, if a column stores a state abbreviation, it would be defined as a CHAR(2). Other times you might have to guess: I can’t think of any first names longer than about 10 characters, but just to be safe I’ll allow for up to 20.