Choosing an Approach
As to what the customer sees, recommending products is a simple idea: If the customer is currently viewing product A, the site might recommend products C, G, and N. When viewing product B, the site might recommend, G, L, and T. But how are those associations determined? There are two possible approaches.
For small sites, with a limited range of products and/or fewer sales, the easiest way to identify recommendations is to have the store’s administrator manually do so. Besides being logically the simplest, such an approach taps into the knowledge of an actual person as to what products are best naturally paired (for cross-selling) or what is a better alternative (for an upsell). Clearly this route would not be appropriate for large or active sites, though, as trying to maintain product associations becomes exponentially more complex with each additional product being sold.
For stores with lots of products and lots of sales, there’s an alternative “knowledge base” that can be tapped: The massive order history. Sites like Netflix can do an excellent job of predicting what you might think about movie X because the site knows what you thought about movies Y and Z. By finding other users who felt the same way about movies Y and Z and have seen movie X, a reasonable guess as to how you’ll feel about X can be calculated. The same idea holds for e-commerce: If many people who purchased product A also purchased products C, G, and N, it’s reasonable to assume that products C, G, and N are good recommendations. Unfortunately, an automated system doesn’t work for upselling, because the order histories won’t show that customers bought A and B, but rather B instead of A.
Throughout the rest of the article, I’ll explain how you would implement both the “manual” and “automatic” product recommendation systems. But first, for the purposes of this article, let’s assume that every product the e-commerce site sells is represented in a table named products (see Figure 1). Further, let’s assume that general information about each sale goes into an orders table: the customer ID, the order total, the date, etc. The particulars of an order go into order_contents: Each product purchased in an order, the quantity, the price paid, etc. And the customers table represents the customers.
Figure 1 Part of the assumed database scheme.
Each customer may have one or more orders, but each order will only be for a single customer. Each order will have one or more “order contents,” but each “order content” will only be for a single order. And each “order content” will match up with exactly one product.