Anshumani Ruddra

Understated Hyperbole

The Correlation Causation Conundrum (Alliteration Ahoy!)

Posted on February 17, 2015 (updated January 20, 2020) by Anshumani Ruddra

Growing up – all the way to engineering school and beyond – I was obsessed with mathematical modelling and statistics. The ability to model (correctly) the predictive behaviour of a system (oftentimes a complex system that involves a fairly large number of variables) can seem like pure magic to an outsider (and is a whole lot of fun science to an insider).

My most cherished memories as a game designer all involve working with humongous spreadsheets – tuning complex game economies, knowing that one mistake could lead to hundreds of thousands of dollars' worth of losses, tying up dozens of loose ends and running countless simulations to correctly predict the behaviour of millions of players. Over time, the virtual economies of large games start looking like the economies of small countries – and a game designer or product manager starts resembling the Chairman of the Federal Reserve or the Governor of the Reserve Bank of India. Exhilarating!

So when I see mathematics and especially statistics being used incorrectly – it rankles me to the core.

“A glass of wine a day keeps heart disease away.”

“People with black hair more likely to turn to a life of crime.”

“Martians are attacking vineyards across the country.”

“People with black hair who drink wine are prime targets for alien abduction and a heart-disease-free life of crime on Mars.”

You get the gist: a research institute (funded by some kind of lobby) will run a study with a sample set of people (usually on the order of a few hundred, which they will inflate to a few thousand). These people will be asked to repeat one action (drinking wine or beer) over time, and another variable (their blood pressure, height or length of toenails) will be recorded over the same period by a researcher.

At the end of the study, the researcher will compute the correlation between the repeated action and the variable and publish the findings in a journal. A journalist will come across this paper, skip the details and jump straight to the conclusion section. And we will see a sensationalist headline the next day in every newspaper, website and blog across the world.

People will take this headline as a cardinal truth and vow to change their lives accordingly (start consuming beer, stop wearing undergarments, stop showering, etc.).

Randall Munroe sums it up beautifully in XKCD:

http://xkcd.com/882/

I am not trying to discount the work of serious researchers and scientists here. Statistics is a very tough business – especially in the real world:

  • Experiments cannot be conducted in a controlled environment (and a true A/B test requires a highly controlled environment)
  • Too many variables cannot be held constant, which makes it hard to isolate the interplay between just the two variables of interest
  • Sample sizes remain small because large sampling is costly and not scalable
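  • The law of large numbers offers little protection when the sample is small: averages and correlations computed from a few hundred people can sit far from their true values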
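
To make the small-sample point concrete, here is a minimal sketch in Python. It assumes two variables that are completely unrelated by construction (both drawn from a standard normal distribution) and counts how often a "study" still reports a noticeable correlation. The threshold of 0.3, the sample sizes and the function names are illustrative choices, not figures from any real study.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def share_of_strong_correlations(sample_size, trials, threshold=0.3, seed=42):
    """Fraction of simulated studies in which two unrelated variables show |r| > threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(sample_size)]
        ys = [rng.gauss(0, 1) for _ in range(sample_size)]  # independent of xs
        if abs(pearson(xs, ys)) > threshold:
            hits += 1
    return hits / trials


small_study = share_of_strong_correlations(sample_size=20, trials=2000)
large_study = share_of_strong_correlations(sample_size=2000, trials=200)
print(f"20 participants:   {small_study:.1%} of studies show |r| > 0.3")
print(f"2000 participants: {large_study:.1%} of studies show |r| > 0.3")
```

With only 20 participants, a sizeable share of these no-effect studies (somewhere around one in five) should clear the threshold purely by chance, while with 2,000 participants essentially none should. The variables never change; only the sample size does.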

However, I see a lot of these realities being ignored – not just in research, but also in the world of technology product management (which actually has much better instrumentation and data gathering tools).

Example: Start-up XYZ releases a new feature in their app that has about 10,000 daily active users (DAU). A week after the feature release the DAU jumps to 15,000 suddenly. The team is elated. They run a couple of quick queries on their analytics system. They see a couple of different metrics on the rise and are convinced that it was the feature that caused the jump in DAU. What else could it be? The VP of product immediately calls a meeting between all functional leads: “We need to put everything else on hold. Forget the product roadmap that we took a whole month to arrive at. This new feature is our future. Let’s commit all our resources to this.” Everyone nods in agreement.

Three months later the start-up is dead.

“Correlation does not equal Causation.”

People will point to scrappy start-ups that did exactly this and became huge. Words like “pivot” and “experimentation” will be thrown around a lot.

People only remember the exceptions. People are usually wrong. (I completely get the irony of that statement given what I am discussing in this post.)

A vast majority of product teams forget the fundamentals. Be wary of strong positive correlations between a feature and the rise in a particular product metric (active users, retention, engagement, etc.). Without a correctly set-up experiment that validates the product hypothesis, one cannot conclude that the feature caused the rise.
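
One way such a correlation can arise without any causation is self-selection. The sketch below is a toy simulation under loudly stated assumptions: every user has a hidden engagement level, engaged users are more likely both to adopt the feature and to be retained, and the feature itself has zero effect on retention. All the probabilities and names are made up for illustration.

```python
import random


def simulate(n_users=20000, seed=7):
    """Return (observational_gap, randomized_gap) in retention rate.

    The feature has NO causal effect here; retention depends only on a
    hidden engagement level (the confounder).
    """
    rng = random.Random(seed)
    # group flag -> [retained users, total users]
    by_adoption = {True: [0, 0], False: [0, 0]}
    by_assignment = {True: [0, 0], False: [0, 0]}

    for _ in range(n_users):
        engagement = rng.random()                            # hidden confounder in [0, 1)
        retained = rng.random() < 0.2 + 0.6 * engagement     # driven by engagement only
        adopted = rng.random() < engagement                  # engaged users opt in more often
        assigned = rng.random() < 0.5                        # coin flip, as in an A/B test
        for groups, flag in ((by_adoption, adopted), (by_assignment, assigned)):
            groups[flag][0] += retained
            groups[flag][1] += 1

    def gap(groups):
        rate_with = groups[True][0] / groups[True][1]
        rate_without = groups[False][0] / groups[False][1]
        return rate_with - rate_without

    return gap(by_adoption), gap(by_assignment)


observational_gap, randomized_gap = simulate()
print(f"Adopters vs non-adopters:     {observational_gap:+.3f}")
print(f"Randomly assigned vs holdout: {randomized_gap:+.3f}")
```

Comparing adopters with non-adopters shows a large retention gap even though the feature does nothing, because the two groups differ in engagement to begin with. Splitting users by coin flip removes that difference, and the gap shrinks to noise around zero.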

The correct experimental method involves the following steps:

– The first step is to come up with a hypothesis. “What will this feature achieve?” “What metrics will it move and by how much?” Having a clear purpose for an experiment and estimating expected results are both critical.

– The second step is to set up the experiment correctly. Do we have the ability to measure the results? What baseline are we comparing the results against? Do we have a sturdy A/B testing platform (and how many people in the team have a clear understanding of what an A/B test is)?

– The third step is to draw inferences from the results – the actuals versus the expected. Inferences are drawn using a mix of instincts and numbers, but most product teams almost always lean on instincts at the expense of the numbers. This is really bad. A tech start-up needs to find a balance between the two.
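
For the "numbers" half of that third step, one common choice is a two-proportion z-test on the outcome of an A/B test. The sketch below is one possible way to do it, not the only one, and the conversion counts are hypothetical: a control group and a variant group of 10,000 users each, with the metric being whether a user was retained.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    Returns (z, p_value). Uses the pooled proportion under the null
    hypothesis that both groups share the same underlying rate.
    """
    rate_a = successes_a / n_a
    rate_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical results: control retains 1,000 of 10,000 users, variant 1,100 of 10,000.
z, p_value = two_proportion_z_test(1000, 10_000, 1100, 10_000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

The hypothesis from the first step matters here: the expected lift and the acceptable p-value should be written down before the results come in, otherwise it is easy to rationalise whatever number shows up.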

I think this strip best sums up this post:

http://xkcd.com/552/

