MySQL Data Types
Once you have identified all of the tables and columns that the database will need, you should determine each field's MySQL data type. When creating the database, as you will do in the next chapter, MySQL requires that you define what sort of information each field will contain. There are three primary categories, which is true for almost every database software:
- Text
- Numbers
- Dates and times
Within each of these, there are a number of variantssome of which are MySQL-specificyou can use. Choosing your column types correctly not only dictates what information can be stored and how, but also affects the database's overall performance. Table 3.2 lists most of the available types for MySQL, how much space they take up, and a brief description.
Table 3.2 Here are most of the available column types for use with MySQL databases.
MySQL Datatypes |
||
Ty p e |
S i z e |
D e s c r i p t i o n |
CHAR[Length] |
Length bytes |
A fixed-length field from 0 to 255 characters long. |
VARCHAR(Length) |
String length + 1 bytes |
A fixed-length field from 0 to 255 characters long. |
TINYTEXT |
String length + 1 bytes |
A string with a maximum length of 255 characters. |
TEXT |
String length + 2 bytes |
A string with a maximum length of 65,535 characters. |
MEDIUMTEXT |
String length + 3 bytes |
A string with a maximum length of 16,777,215 characters. |
LONGTEXT |
String length + 4 bytes |
A string with a maximum length of 4,294,967,295 characters. |
TINYINT[Length] |
1 byte |
Range of -128 to 127 or 0 to 255 unsigned. |
SMALLINT[Length] |
2 bytes |
Range of -32,768 to 32,767 or 0 to 65535 unsigned. |
MEDIUMINT[Length] |
3 bytes |
Range of -8,388,608 to 8,388,607 or 0 to 16,777,215 unsigned. |
INT[Length] |
4 bytes |
Range of -2,147,483,648 to 2,147,483,647 or 0 to 4,294,967,295 unsigned. |
BIGINT[Length] |
8 bytes |
Range of -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 or 0 to 18,446,744,073,709,551,615 unsigned. |
FLOAT |
4 bytes |
A small number with a floating decimal point. |
DOUBLE[Length, Decimals] |
8 bytes |
A large number with a floating decimal point. |
DECIMAL[Length, Decimals] |
Length + 1 or Length + 2 bytes |
A DOUBLE stored as a string, allowing for a fixed decimal point. |
DATE |
3 bytes |
In the format of YYYY-MM-DD. |
DATETIME |
8 bytes |
In the format of YYYY-MM-DD HH:MM:SS. |
TIMESTAMP |
4 bytes |
In the format of YYYYMMDDHHMMSS; acceptable range ends inthe year 2037. |
TIME |
3 bytes |
In the format of HH:MM:SS |
ENUM |
1 or 2 bytes |
Short for enumeration, which means that each column can haveone of several possible values. |
SET |
1, 2, 3, 4, or 8 bytes |
Like ENUM except that each column can have more than one ofseveral possible values. |
Many of the types can take an optional Length attribute, limiting their size (the square brackets, [], indicate an optional parameter to be put in parentheses, while parentheses themselves indicate required arguments). Further, the number types can be UNSIGNEDlimiting the column to positive numbers or zeroor be defined as ZEROFILL, which means that any extra room will be padded with zeroes (ZEROFILLs are also automatically UNSIGNED). The various date types have all sorts of unique behaviors, which are documented in the manual at www.mysql.com/doc/D/A/DATETIME.html. You'll primarily use the DATE and TIME fields without modification, so you need not worry too much about their intricacies. There are also two extensions of the TEXT types that result in a different behaviorENUM and SETwhich allow you to define a series of acceptable values when creating the table. An ENUM field can have only one of a possible several thousand values, while SET allows for several of up to 64 possible values. There are two caveats with ENUM and SET: These types are not supported by other databases, and their usage undermines normalization.
To choose your data types:
Identify whether a column should be a text, number, or date type.
Choose the most appropriate subtype for each column.
Fixed-length fields (such as CHAR) are generally faster than variable-length fields (such as VARCHAR), but they also take up more disk space. See the side-bar for more information.
The size of any field should be restricted to the smallest possible value, based upon what the largest possible input could be. For example, if the largest a number such as Client ID could be is in the hundreds, set the column as an unsigned three-digit SMALLINT (allowing for up to 999 values).
You should keep in mind that if you insert a string five characters long into a CHAR(2) field, the final three characters will be truncated. This is true for any field in which the length is set (CHAR, VARCHAR, INT, etc.).
Set the maximum length for text and number columns as well as other attributes such as UNSIGNED (Table 3.3).
Rather than going over how I defined all 21 columns and why, I've listed the properties I came up with in Table 3.3. Different developers have different preferences, but the most important factor is to tailor each setting to the information at hand rather than using generic (and inefficient) TEXT and INT types at all times.
This is normally an easy and obvious step. You will find that numbers such as ZIP codes and dollar amounts should be text fields if you include their corresponding punctuation (dollar signs, commas, and hyphens), but you'll get better results if you store them as numbers and address the formatting elsewhere.
For improved performance, keep in mind two considerations:
Table 3.3 An often overlooked aspect of database design is defining the optimal type for each field.
Accounting Database |
||
C o l um n N a m e |
Ta b l e |
Co lum n Ty pe |
Invoice Number |
Invoices |
SMALLINT(4) UNSIGNED |
Client ID |
Invoices |
SMALLINT(3) UNSIGNED |
Invoice Date |
Invoices |
DATE |
Invoice Amount |
Invoices |
DECIMAL(10,2) UNSIGNED |
Invoice Description |
Invoices |
TINYTEXT |
Client ID |
Clients |
SMALLINT(3) UNSIGNED |
Client Name |
Clients |
VARCHAR(40) |
Client Street Address |
Clients |
VARCHAR(80) |
Client City |
Clients |
VARCHAR(30) |
Client State |
Clients |
CHAR(2) |
Client Zip |
Clients |
MEDIUMINT(5) UNSIGNED |
Client Phone |
Clients |
VARCHAR(14) |
Contact Name |
Clients |
VARCHAR(40) |
Contact Email Address |
Clients |
VARCHAR(60) |
Expense ID |
Expenses |
SMALLINT(4) UNSIGNED |
Expense Category ID |
Expenses |
TINYINT(3) UNSIGNED |
Expense Amount |
Expenses |
DECIMAL(10,2) UNSIGNED |
Expense Description |
Expenses |
TINYTEXT |
Expense Date |
Expenses |
DATE |
Expense Category ID |
Expense Categories |
TINYINT(3) UNSIGNED |
Expense Category |
Expense Categories |
VARCHAR(30) |