📢 BLACK FRIDAY 50% SALE 📢
FORUM Product Analytics Lesson 190: Calculating retention curves

Lesson 190: Calculating retention curves

In this code where we setup to capture retention, i know we join the table to itself and specify that the left side table has to be a signup action but we never specify what kind of action the right side (joined table) has to be.. doesnt this mean that technically we could be joining a signup action to itself? If thats the case, shouldnt we specify that the joined table’s action can’t be a signup one?

WITH user_activity AS (
  SELECT
    u.user_id,
    u.created_at::date AS signup_date,
    e.created_at::date AS activity_date,
    COUNT(*) AS events_counts
  FROM mobile_analytics.events u
  LEFT JOIN mobile_analytics.events e
    ON e.user_id = u.user_id  
  WHERE
    u.action = 'signup'  
  GROUP BY 1, 2, 3
  ORDER BY signup_date ASC, user_id ASC
)

SELECT 
  100.0 * COUNT(DISTINCT(CASE WHEN activity_date = '2018-02-08' THEN user_id END)) / COUNT(DISTINCT(user_id)) AS D7_retention_rate
FROM user_activity
WHERE 
  signup_date = '2018-02-01'

REPLIES

Hi @jy, that’s a great question! 🚀

Joining events table to itself

Indeed, we’re joining signup events to itself ✅ That’s totally OK though, because it’s an event that happens only once for each user.

Another point is the created_at timestamp of a signup event – since there’s only one signup event per user it’ll only contribute to the first day (D0) retention, which is totally fine (a signup already implies a D0 retention).

Filtering out some events

Answering your 2nd question about filtering out some events. This is a typical scenario for a more complex JOIN condition:

WITH user_activity AS (
  SELECT
    u.user_id,
    u.created_at::date AS signup_date,
    e.created_at::date AS activity_date,
    COUNT(*) AS events_counts
  FROM mobile_analytics.events u
  LEFT JOIN mobile_analytics.events e
    ON e.user_id = u.user_id  
      AND e.action NOT IN ('signup', 'some-other-event')       
  WHERE
    u.action = 'signup'  
  GROUP BY 1, 2, 3
  ORDER BY signup_date ASC, user_id ASC
)

SELECT *
FROM user_activity

Note that AND e.action NOT IN ('signup', 'some-other-event') statement in the LEFT JOIN condition here ☝

You can add more conditions with additional AND / OR operators and fine tune your JOIN condition. 🔧

Hope that helps 🚀

WRITE A REPLY