Occam’s Razor is not only the name of a blog by Avinash Kaushik; it’s also a way of thinking that most analytics executives embrace. So what is Occam’s Razor? Occam’s Razor, also referred to as the law of parsimony, dates back to the 14th century and the work of logician William of Ockham, and was later popularized by metaphysician Sir William Hamilton. The term “razor” refers to distinguishing between two theories, either by “shaving away” unnecessary assumptions or by cutting apart two similar theories. Many executives would summarize the principle as “simpler explanations are, other things being equal, generally better than more complex ones.”
However, this principle doesn’t favor aggregate-level insights over individual-level ones, or the reverse, when making business analytics decisions. In some cases, aggregate results bring useful, straightforward insights to the table; in other cases, executives are better off encouraging their analysts to dig deeper than aggregate results alone. The principle does not impose a trade-off between aggregate and disaggregate insights. Rather, it suggests using “common sense” to decide when to stop digging, so that you end up with parsimonious, useful insights that hold true in as many comparable situations as possible and do not suffer from “overfitting”.
This post first illustrates the breadth of the Occam’s Razor principle by using it to clearly distinguish two important concepts in analytics: monitoring and targeting. It then applies these two concepts to two simple cases: (1) the heart attack metaphor and (2) a modified version of Anscombe’s Quartet.
Occam’s Razor: Monitoring vs. Targeting
So how are monitoring and targeting related? Monitoring can be defined as the art of tracking customers over time, either to raise an alarm that action is needed or to track the impact of an action already taken. The action that follows from monitoring should, as much as possible, be a targeted action, as long as it reaches a sufficient number of customers. Monitoring results can thus be seen first as predictors that a new targeting strategy is needed, and afterwards as the baseline against which the impact of that strategy is measured. So how does all of this relate to Occam’s Razor? Monitoring tools should generally serve a “simple” goal and capture a broad concept. They often consist of a single KPI or a handful of KPIs that tell you at a glance whether things are going well or not.
Targeting tools generally need to be more “complex”, since they have to dig deeper into the heart of the problem. They may include complex customer segmentations and even algorithms where the size of the segment is equal to 1. They are the tools that should move the KPIs, and so they should be more “advanced” models and procedures rather than simple KPIs. In terms of Occam’s Razor, monitoring tools have good reason to be “simple” without being too “simple”, while targeting tools have their own reason to be more “complex” without being too “complex”. This discussion is illustrated in Figure 1.
Occam’s Razor: Monitoring vs. Targeting and the Heart Attack Metaphor
Let’s illustrate the relationship between the three terms through a simple example. Take a patient who has just suffered a heart attack and is now recovering at the hospital. The patient’s condition is mainly monitored through a “single” and “simple”, but not “over-simple”, KPI: his heart rate. The tools used to diagnose how to keep him out of danger, however, are more complex, though once again not complex without an objective: specialists first need to target the problem before they can act on it.
Occam’s Razor, Monitoring, Targeting and Anscombe’s Quartet Applied to Retailing
As a more marketing-related example, let’s illustrate our three terms through the famous Anscombe’s Quartet applied to retailing. Anscombe’s Quartet is a set of four simple two-dimensional datasets that have nearly identical summary statistics. To make the example more appealing to marketing executives, especially in the retail industry, I turned the original Anscombe’s Quartet into a realistic but fictitious example for a typical supermarket chain by scaling the numbers. I took two KPIs from the retail industry: (1) the Average Basket Size (ABS) and (2) the Average Basket Value (ABV). The ABS is the average number of items a customer buys at your store in a transaction, while the ABV is how much the average customer spends at your store in a transaction. Both KPIs are so important that a change in the first decimal place could make a huge difference to your bottom line. Maximizing both numbers is surely a goal of every retail executive. The KPIs of my revised version of Anscombe’s Quartet are presented in Table 1.
Other, less useful characteristics (statistics) are also similar across all four datasets. For the marketing executive, a quick comparison of these KPIs over time would be enough, and should be enough, since these KPIs are not overly simplistic from a monitoring perspective. Occam’s Razor would tell us to focus on the evolution of these two KPIs to monitor the progress of the supermarket chain. For targeting purposes, however, these results are not enough. Simply plotting the data behind each of the four datasets, which yield the same values for these two KPIs and the other characteristics, reveals strikingly different patterns, as seen in Figure 2, where the individual-level values of basket size are on the X axis and those of basket value are on the Y axis.
What do these results mean for marketing executives? Simply one thing: they should make sure that analysts dig deeper, which means looking at the visuals and at far more complex relationships between variables. Even though most retailing companies have extremely large databases, where complex SQL procedures are needed just to extract the simple aggregate insights that are enough for monitoring, a lot more should be done when it comes to targeting, especially when it comes to dealing with outliers, which are “killers” for datasets 3 and 4. A lot more could be said about Anscombe’s Quartet and this modified version, but I’ll let others have fun with it by providing the data in Table 2 of Appendix 1.
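For readers who want to check the “same KPIs, different stories” point numerically, here is a minimal sketch in Python. It uses the original Anscombe values rather than the data of Table 2, and it assumes the retail version was obtained by leaving basket size unchanged and multiplying the original y values by 4.2. That factor is my own inference from the slope and constant quoted in the footnote (2.10 / 0.5 and 12.60 / 3), so treat it as an assumption; the names `ABV_SCALE`, `summarize`, and `SUMMARIES` are purely illustrative.

```python
# Original Anscombe (1973) quartet: datasets 1-3 share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    1: (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# ASSUMPTION: basket value = 4.2 * original y (inferred from the footnote,
# not taken from Table 2). Basket size is the original x, unscaled.
ABV_SCALE = 4.2


def summarize(basket_size, basket_value):
    """Return the aggregate statistics of one dataset as a dict."""
    n = len(basket_size)
    mean_bs = sum(basket_size) / n
    mean_bv = sum(basket_value) / n
    # Sums of squared deviations and cross-products around the means.
    sxx = sum((x - mean_bs) ** 2 for x in basket_size)
    syy = sum((y - mean_bv) ** 2 for y in basket_value)
    sxy = sum((x - mean_bs) * (y - mean_bv)
              for x, y in zip(basket_size, basket_value))
    slope = sxy / sxx  # least-squares slope of BV on BS
    return {
        "abs": mean_bs,                    # Average Basket Size
        "abv": mean_bv,                    # Average Basket Value
        "var_bs": sxx / (n - 1),           # sample variance of basket size
        "var_bv": syy / (n - 1),           # sample variance of basket value
        "corr": sxy / (sxx * syy) ** 0.5,  # Pearson correlation
        "slope": slope,
        "constant": mean_bv - slope * mean_bs,
    }


SUMMARIES = {
    key: summarize(xs, [ABV_SCALE * y for y in ys])
    for key, (xs, ys) in QUARTET.items()
}

if __name__ == "__main__":
    for key, stats in SUMMARIES.items():
        print(key, {name: round(value, 2) for name, value in stats.items()})
```

Under that scaling assumption, the four rows printed should be nearly indistinguishable, which is exactly why the aggregate view is fine for monitoring and blind for targeting. Swapping in the values of Table 2 only requires replacing `QUARTET` and setting `ABV_SCALE` to 1.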
So what can we learn from this post? Occam’s Razor is a principle that calls for parsimony. However, this doesn’t mean that “targeting” should be overly simplistic. Targeting tools are generally more complex than monitoring tools, since they are the ones that should drive the results the monitoring tools report. Digging deeper should always be encouraged for targeting, while for monitoring, “simpler” metrics should be defined. As an executive or an analyst, the important thing is simply to be aware of the role of both concepts and of how Occam’s Razor influences them. Enjoy the Super Bowl!
Only the useful similar characteristics are reported. Other similar characteristics related to Anscombe’s Quartet include: (1) the correlation between ABS and ABV = 0.82, (2) the variance of ABS = 11, (3) the variance of ABV = 72.81, (4) the Constant in Basket Value (BV) = Constant + Basket Size (BS) × Slope, which is 12.60, and (5) the impact of an additional item on BV = $2.10, also known as the Slope in BV = Constant + BS × Slope.