FORUM Product Analytics Chapter 181. Counting app releases per month

Chapter 181. Counting app releases per month

Hello,

with reference to chapter “181. Counting app releases per month” & query - “How many releases Bindle had in February, 2018?” - I’m unable to understand on how’s the following solution different from the one proposed in chapter & why isn’t this correct?

SELECT COUNT( distinct app_version)
FROM adjust.callbacks
where to_char(created_at, 'yyyy-mm') = '2018-02'

Thanks, Tushar

REPLIES

Hi @tks 👋

That’s a really great and deep question, this whole chapter is about finding events that aren’t obviously present in our data.

First of all, I’d suggest we prettify your query a little bit and follow SQL Habit’s style guide:

SELECT 
  COUNT(DISTINCT(app_version)) AS releases_count
FROM adjust.callbacks
WHERE 
  to_char(created_at, 'yyyy-mm') = '2018-02'

Now it’s a bit clearer and the result column alias tells us what do we want to calculate. We’re ready to dive deep into the problem! 🔍

Finding the bug 🐛

Let’s look at our raw Adjust callbacks data:

SELECT *
FROM adjust.callbacks

Every row in this table is already some kind of event. We can figure out the kind by looking at the activity_kind column, it has values click, install or signup.

💡 As you can see, there’s no app install event. Adjust sends us callbacks (events) only in three cases – when someone clicks our link, install our app or signs up for Bindle.

You see where it’s going? Your query ☝ counts the number of app versions that users used (clicked links, installed or signed up for) in Feb, 2018. If someone installed an app version from January it’ll be counted as a February release, so here’s our bug. Busted! 💥

Inferring release date

We know that adjust.callbacks doesn’t have app releases data, we simply don’t track such event.

Since we don’t have such an event, we need to infer it – look at our events and figure out some statistic that would tell us when the app was released.

In our case it’s the first timestamp when we see a specific app version:

WITH app_releases AS (
  SELECT 
    app_version,
    MIN(created_at) AS app_released_at
  FROM adjust.callbacks
  GROUP BY 1
)

SELECT *
FROM app_releases

For me such queries are a thing of beauty – we just produced a whole new table of events from a simingly unrelated data, isn’t it awesome? 😆

Now we just need to calculate releases from February, 2018 and the job is done. 🍻

Thank you for great questions, @tks! 🚀

Hey @makaroni4,

Thanks so much for the detailed answer & highlighting the scenario due to which my understanding wasn’t correct.

Just one more query - if I’m using a ‘where’ condition [ WHERE to_char(created_at , ‘yyyy-mm’) = ‘2018-02’] which takes care of releases Bindle had in February, 2018, then why should it count the number of app versions that users used (clicked links, installed or signed up for) in Feb, 2018. If someone installed an app version from January it’ll be counted as a February release ?

Thanks, Tushar

WRITE A REPLY

“no fuss, just things you actually need”

Start learning with SQL Habit today

Master Data Analysis with SQL through the story of how a startup succeeded through data.