Predictions vs. Decisions: Why Understanding Cause and Effect Matters in Business Decisions

Awadelrahman M. A. Ahmed
3 min read · Feb 12, 2024

Imagine a company trying to reduce how many customers leave, a challenge known as reducing customer churn. The company’s leaders are considering investing a significant amount of money in marketing efforts to keep customers engaged.

Machine learning, which uses computers to analyze data and find patterns in it, can identify which customers tend to leave and what they have in common. What it does not always make clear is whether a given action directly leads to a specific outcome. That clarity is crucial for deciding how to spend the marketing budget effectively.

The Problem with Just Looking at Patterns

Suppose a company observes that customers who received special deals tend to remain with the company longer. A straightforward machine learning model might interpret this pattern and suggest: “Offer more deals to reduce customer churn.”

However, this approach overlooks a critical distinction. It assumes that because two events occur together (offering deals and customers staying), one necessarily causes the other. This assumption can be misleading for several reasons:

  • Which Came First? (a.k.a. Reverse Causality): It’s sometimes unclear whether the deals actually encourage customer loyalty, or if loyal customers are more likely to receive deals!
  • Other Factors (a.k.a. Confounding Variables): There might be unseen reasons why customers choose to stay that are not accounted for in the analysis.
  • Just a Coincidence (a.k.a. Spurious Correlation): Occasionally, two events might happen together by chance without a direct causal relationship.
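To make the confounding case concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the `loyal` flag, the probabilities, and the function names are hypothetical and do not come from any real dataset. By construction, deals have no effect on churn at all; loyalty drives both who gets a deal and who stays.

```python
import random


def simulate_confounded_deals(n=20000, seed=0):
    """Simulate customers where loyalty drives both deals and retention.

    All probabilities are illustrative assumptions. Deals have NO causal
    effect on churn here: churn depends only on loyalty.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        loyal = rng.random() < 0.5
        # Loyal customers are more likely to be offered a deal.
        got_deal = rng.random() < (0.8 if loyal else 0.2)
        # Churn depends only on loyalty, never on the deal.
        churned = rng.random() < (0.1 if loyal else 0.5)
        rows.append((loyal, got_deal, churned))
    return rows


def churn_rate(rows):
    """Fraction of customers in `rows` who churned."""
    return sum(churned for _, _, churned in rows) / len(rows)


rows = simulate_confounded_deals()

# Naive comparison: customers with a deal vs. customers without one.
with_deal = [r for r in rows if r[1]]
without_deal = [r for r in rows if not r[1]]
naive_gap = churn_rate(without_deal) - churn_rate(with_deal)

# Same comparison, but only among loyal customers.
loyal_with = [r for r in rows if r[0] and r[1]]
loyal_without = [r for r in rows if r[0] and not r[1]]
loyal_gap = churn_rate(loyal_without) - churn_rate(loyal_with)

print(f"Naive churn gap (no deal minus deal): {naive_gap:.3f}")
print(f"Gap among loyal customers only:       {loyal_gap:.3f}")
```

With these assumed probabilities, the naive comparison suggests that deals cut churn substantially (about 42% without a deal versus about 18% with one), yet the gap all but disappears once we compare customers with the same loyalty. A model trained on this data would happily learn "deal means lower churn" even though offering more deals would change nothing.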

To show the importance of distinguishing between correlation and causation, consider the following hypothetical distributions of customer churn rates:

  • Red Distribution (Without Intervention): This shows what happens to churn rates without any marketing interventions. It represents the baseline scenario where a higher rate of churn is expected.
  • Green Distribution (With Intervention): This displays the churn rates after applying marketing interventions. The shift towards lower churn rates indicates the potential effectiveness of these interventions.

The fundamental challenge with purely statistical ML in this context is that it learns only from the red distribution. Because these models are trained on historical data collected without interventions, they inherently assume that future outcomes will mirror past patterns.

However, in our scenario, the goal is not merely to predict churn based on past trends but to alter future churn rates through strategic interventions. The desire is to shift the outcome from the red distribution to the green one, a goal that lies outside the scope of what prediction models are normally built to do.

The Value of Understanding Cause and Effect

The essence of the matter is that while correlation-based ML can offer valuable insights into existing patterns within the data, it falls short when the aim is to change those patterns through interventions. This limitation underscores the need for approaches that not only understand but can influence the distribution of outcomes.

Causal inference steps in to fill this gap. Unlike correlation-based ML, causal inference methods are designed to estimate the effects of interventions, providing insights into how actions today can change outcomes tomorrow. These methods consider the counterfactual: what would happen to the same set of customers if they received an intervention, compared with what would happen if they did not. Answering that question allows businesses to make informed decisions about strategies that can genuinely reduce churn.
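The counterfactual idea can be sketched with a small simulation. This is an illustration under stated assumptions, not a recipe for a real analysis: the risk ranges, the 10-point effect of the intervention, and the function name are all hypothetical. In the simulation each customer has two potential outcomes, one without the intervention (the red distribution) and one with it (the green distribution). In real life only one of the two is ever observed, which is why the sketch then uses random assignment, one common way to estimate the effect.

```python
import random


def simulate_potential_outcomes(n=20000, seed=1):
    """Return (churn_without, churn_with) for each simulated customer.

    Assumption for illustration: the intervention lowers each customer's
    churn probability by 10 percentage points.
    """
    rng = random.Random(seed)
    customers = []
    for _ in range(n):
        base_risk = rng.uniform(0.2, 0.6)   # churn probability, no intervention
        treated_risk = base_risk - 0.1      # churn probability, with intervention
        u = rng.random()                    # one draw shared by both outcomes
        churn_without = u < base_risk
        churn_with = u < treated_risk
        customers.append((churn_without, churn_with))
    return customers


customers = simulate_potential_outcomes()

# True average effect on churn. Only a simulation can compute this,
# because it reveals both potential outcomes for every customer.
true_effect = sum(int(w) - int(wo) for wo, w in customers) / len(customers)

# In practice we see one outcome per customer. Assigning the intervention
# at random makes the two groups comparable, so their difference in churn
# estimates the causal effect.
rng = random.Random(2)
treated, control = [], []
for churn_without, churn_with in customers:
    if rng.random() < 0.5:
        treated.append(churn_with)
    else:
        control.append(churn_without)

estimated_effect = sum(treated) / len(treated) - sum(control) / len(control)

print(f"True effect on churn rate:      {true_effect:.3f}")
print(f"Estimate from random assignment: {estimated_effect:.3f}")
```

Both numbers should land near the assumed effect of a 10-point drop in churn. The design choice that matters is the random assignment: because it is unrelated to loyalty or any other customer trait, the confounding problem described earlier cannot creep in.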

Awadelrahman M. A. Ahmed

Cloud Data & Analytics Architect | AWS Community Builder | MLflow Ambassador | PhD Fellow in Informatics https://www.linkedin.com/in/awadelrahman/