SQL Habit Community Forum

zofranmohd

over 1 year

SOLVED

AB-TEST : Exercise 201

I am not sure why my query returned as error for exercise 201.

WITH ab_test_categorization AS (
  SELECT
    user_id,
    action,
    custom_parameters ->> 'ab_test_name' AS ab_test_name,
    custom_parameters ->> 'ab_test_variation' AS ab_test_variation,
    created_at AS categorized_at
  FROM mobile_analytics.events
  WHERE
    custom_parameters ->> 'ab_test_name' IS NOT NULL
    AND action IN ('signup', 'add-library-entry')
)    

SELECT
  ab_test_variation,
  action,
  COUNT(DISTINCT(user_id)),
  100.0 * COUNT(DISTINCT(CASE WHEN ab_test_variation = 'control' AND action = 'add-library-entry' THEN user_id END)) / COUNT(DISTINCT(CASE WHEN ab_test_variation = 'control' AND action = 'signup' THEN user_id END))
FROM ab_test_categorization
WHERE
  ab_test_name = 'longer_onboarding_201803'
GROUP BY 1, 2

The bit at the final query

100.0 * COUNT(DISTINCT(CASE WHEN ab_test_variation = 'control' AND action = 'add-library-entry' THEN user_id END)) / COUNT(DISTINCT(CASE WHEN ab_test_variation = 'control' AND action = 'signup' THEN user_id END))

returned ERROR: division by zero.

I tried to further investigate myself by checking both the numerator and denominator part of the calculation independently and it works fine. I managed to answer the question by doing manually calculation from the table without the line of calculation that I got an error above.

Can you explain the error that I made above?

2

Join the discussion

Alex

over 1 year

SOLVED

When to categorize or assign users in A/B testing?

AB-test categorization. Part 1 - When categorization happens

When it’s mentioned that “A great way to categorize users is during the signup process”, is this specifically for testing onboarding flow changes or all changes?

The lesson also cautions against “categorizing users when they visit this page or screen” that’s subject to AB test changes. But why? Wouldn’t you want to assign users to their A/B variant at the point of change as not all users will go through the funnel and you risk requiring a larger sample size? The concern I’m flagging is also outlined here: https://medium.com/@foundinblank/why-it-matters-where-you-randomize-users-in-a-b-experiments-5570c7585944.

1

Join the discussion

Alex

over 1 year

SOLVED

Correct logic of measuring activation rate without analytics events?

Many thanks, the exercises have been interesting and useful!

So I tried to recreate the results for Exercise 185 Activation rate per country but without using the mobile_analytics.events table.

The results can’t be replicated due to differences in tables i.e. number of unique users in mobile_analytics.events table: 2508 number of unique iphone users in devices table: 4223

But is the logic of my query attempt correct?

WITH mob_readers_and_users AS (
  SELECT 
    country,
    COUNT(DISTINCT CASE WHEN last_page > 0 THEN d.user_id END) AS reader_num, 
    COUNT(DISTINCT d.user_id) AS user_num
  FROM users u
  LEFT JOIN books_users b
    ON u.id = b.user_id
  LEFT JOIN devices d 
    ON u.id = d.user_id
  WHERE 
    device_type = 'iphone'
  GROUP BY 1
)

SELECT 
  country, 
  reader_num, 
  ROUND(100.0 * reader_num / user_num, 2) AS activation_rate
FROM mob_readers_and_users
ORDER BY 3 DESC

4

Join the discussion

Alex

over 1 year

SOLVED

Choice of numerator for calculating conversion rate?

I got the right answer (and same SQL results) for the exercise even though I counted event action ‘Signup’ instead of DISTINCT pageview_id.

As a rule of thumb, I understand that it’s preferable to count unique identifiers for conversion rates. But are there any risks to using my method below?

My query is below for reference:

SELECT p.device_type, 100.0* COUNT(action) / COUNT(DISTINCT visitor_id) AS cvr
FROM web_analytics.pageviews p
LEFT JOIN
web_analytics.events e
ON p.pageview_id = e.pageview_id
AND e.action = 'Signup'
WHERE url LIKE '%/books/%'
GROUP BY 1
ORDER BY 2 DESC

1

Join the discussion

adolfo

over 1 year

SOLVED

Counting number of customers per country in a month

Hello everyone, I am in exercise 30: counting number of customers per country in a month.

The solution to the exercise has this query:

SELECT
  COUNT(*)
FROM users
WHERE 
  signup_date BETWEEN '2018-01-01' AND '2018-01-31'
  AND status = 'customer'
  AND country = 'in'

but I performed a different query:

SELECT
  COUNT(CASE WHEN status ='customer' THEN id END) AS count_customer_in      
FROM users
WHERE country = 'in'
      AND
      signup_date BETWEEN '2018-01-01' AND '2018-01-31';

that led me to the result.

My question is: Would it be good practice to query it in this last way? Regards.

2

Join the discussion

isik

almost 2 years

SOLVED

How to calculate button CTR

Hi Anatoli!

My query did take me to the right answer but the CTR values weren’t the same as your query.

My query was like this:

SELECT
  device_type,
  (100.0 * count(distinct s.pageview_id) / count(distinct p.visitor_id)) ctr
FROM web_analytics.pageviews p
LEFT JOIN (select pageview_id from web_analytics.events where category = 'Signup Button' and action = 'Click') s 
  ON p.pageview_id = s.pageview_id
where p.url like '%books%'
group by 1
order by 2 desc

Could you please explain what I’m missing? Thank you!

1

Join the discussion

Kris

about 2 years

SOLVED

Count(*) vs. Count(Column)

On exercise number 40….. newbie question…… the solution used COUNT() , but I use COUNT(book_id) (because i think the question is about the book lol)… is there a rule when to use COUNT() or when to use COUNT(specific column) or they will be always the same?

And what if there is a NULL value in that specific column, so COUNT(*) will always be the better choice?

1

Join the discussion

vrushali

over 2 years

SOLVED

Mobile analytics - Chpt 183 & 184 Soft activation rate

I am slightly confused about the soft activation calculation.

In chpt 183, soft activation rate is calculated by Count cohort users who have a specific event divided by Count all cohort users.

But in chpt 184, it is calculated as Count cohort users who have a specific event divided by signup users.

As per my understanding specific users event should be divided with signup users.

Can you please clarify this?

2

Join the discussion

madalina

over 2 years

SOLVED

Group By error

I created the following query for the exercise on lesson 114. Detecting campaign with the highest CPC:

SELECT
  utm_campaign,
  amount / clicks AS CPC
FROM marketing_spends
GROUP BY 1
ORDER BY 2 DESC

However this returns an error: column marketing_spends.amount must appear in the GROUP BY.

Why is it required to include it on the group by clause? Seems that the following query runs without errors:

SELECT
  utm_campaign,
  amount / clicks AS CPC
FROM marketing_spends
GROUP BY 1, 2
ORDER BY 2 DESC

Thank you!

2

Join the discussion

vrushali

over 2 years

SOLVED

Chapter 105. Calculating ARPU

When I run Query 1 query, I expected the query will return me only rows with refunded = False. But it also returns rows with refunded = Null.

QUERY 1

SELECT *
FROM users u
LEFT JOIN purchases p
    ON u.id = p.user_id
      AND refunded = FALSE

QUERY 2

But when I run query 2, it returns me rows with refunded = false (no null rows):

SELECT u.id, status, refunded
FROM users u
LEFT JOIN purchases p
    ON u.id = p.user_id
WHERE 
    refunded = FALSE

Can someone please explain me this?

2

Join the discussion

AB-TEST : Exercise 201

When to categorize or assign users in A/B testing?

Correct logic of measuring activation rate without analytics events?

Choice of numerator for calculating conversion rate?

144. Signup rate per device type

Counting number of customers per country in a month

How to calculate button CTR

Count(*) vs. Count(Column)

Mobile analytics - Chpt 183 & 184 Soft activation rate

Group By error

Chapter 105. Calculating ARPU

QUERY 1

QUERY 2