I don’t like any of the funnel or cohort analysis tools available to me

Headshot of
                Vikram Oberoi

Vikram Oberoi


December 27, 2022


· 4 min read

I don’t like any of the funnel or cohort analysis tools available to me

Three funnels upside down decorated with googly eyes, pom-poms, and pipe cleaners to look like people. Behind them is a lunch box robot and a small diorama in a crate with a bunch of small ceramic friars and a disco ball.
I do like these silly funnels, though.

Ten years ago, while starting the Data Engineering team at Harry’s, I wrote a suite of cohort analysis queries that looked more-or-less like the queries in this post.

The success of the razor and blades model hinges on high customer retention so we tracked it religiously from day one. Retention was so critical at Harry’s that I even wrote a haiku about it once.*

I wrote 15-20 permutations of cohort queries that we would analyze regularly. At Harry’s we cut our cohorts in a bunch of ways: the channel customers came from, the contents of their first order, age/gender, geography, etc. We analyzed account creation -> 1st order, 1st order -> 2nd order, 2nd order -> 3rd order separately.

There were a lot of charts. We wanted to compare them and see how they changed week-over-week and month-over-month.

I could have sent our data to Mixpanel at the time, but it didn’t allow us to explore our data this way. So I hand-rolled a lot of SQL queries and my colleague analyzed the data in Excel. It was cumbersome and time-consuming, but it was a critical analysis for the business.

This year I was retained as a fractional Head of Product for a long-time client and this summer I took a few days to dive into my client’s retention metrics. They have a loyal base of regular users and I wanted to figure out how they differed from folks who churned. So I reached for my trusty friend: the cohort analysis!

Imagine my surprise when my options were no different than they were a decade ago:

  1. I can send my data to tool like Amplitude, Mixpanel, or Heap and use its tools.
  2. I can write SQL and visualize data in Excel or a BI tool.

I didn’t like either of these options ten years ago! I still don’t.

While product analytics suites provide useful tools for exploratory analyses, I usually need to drop down to SQL to answer common follow-up questions. And ensuring accuracy is challenging: the data that ends up in these tools is frequently suspect because event instrumentation is so brittle. Last week I came across Olga Berezovsky’s feedback on Amplitude after its most recent conference and it resonates — Amplitude and its competitors don’t serve me well when I’m doing deeper analyses and I can’t rely on their accuracy.

Meanwhile, I love the flexibility I have when I’m writing SQL: I can query my data however I’d like to. I can clean it up, run spot checks, and write data quality tests to make sure my analysis is correct. But SQL queries for funnel and cohort analyses are challenging to write correctly and scale. Exploring funnels and cohorts this way — asking follow-up questions like, “Why is our March cohort so much more active than our February cohort?” — is time-consuming and I miss out on the interactivity I get from product analytics tools.

I want to be able to analyze behavioral data with the affordances that existing product analytics tools provide. But I don’t want to be shackled to those tools.

The tool I want sits in a happy middle:

I want this to exist! So I’m working on it.

If you analyze behavioral data and the above resonates, I would be delighted to speak with you. Please say hello!

*You can read about my pursuit of poetry in the workplace on my secret sandwich blog. (Currently on hiatus.)