SQL Whiteboard
Amazon
BIE People Analytics
Sample Data
1 | DROP TABLE IF EXISTS CURRENT_AVAILABLE_SUPPLY; |
- What is the available supply for each AZ, for each week in the next year?
- Step 1. UNION ALL two tables. Treat
CURRENT_AVAILABLE_SUPPLY
as initialNEW_SUPPLY
- Step 2. Cumulate
NEW_SUPPLY
then minus cumulativeNEW_DEMAND
- Remove the records from
CURRENT_AVAILABLE_SUPPLY
1 | /* |
BIE Amazon Prime Video
1 | buyer_tb: buyer_id, signup_dt |
Some customers placed an order after 30 days of registration
Some customers placed an order within 30 days of registration
Some customers never placed any orders
Place orders
- T -> within 30 days
- T
- F
- F
- T -> within 30 days
1 | DROP TABLE IF EXISTS buyer_tb; |
1 | DROP TABLE IF EXISTS order_tb; |
Q1: For each buyer, count the number of orders from each year?
Solution
1 | SELECT buyer_id, YEAR(order_dt) AS order_year, COUNT(order_id) |
Q2: list top 10 buyers by order count for each country.
Solution
1 | WITH CTE AS |
Q3: What’s the average orders for all buyers within 30 days after signup.
Get all records with datediff
1 | # Get all records with datediff |
Get all records with datediff that only <= 30
1 | # Get all records with datediff that only <= 30 |
Solution: the average number of orders for buyers who placed orders within 30 days after signup.
1 | # Solution: the average number of orders for buyers who placed orders within 30 days after signup. |
Solution: the average number of orders for all buyers.
- Consider the buyers must registered > 30 days.
1 | # Solution: the average number of orders for all buyers. |
DE
Q1. For Instance how can we return from a table like the one below following results?
Table Employee
- No PK
1 | employee_id salary_component_id salary_component_desc Value month |
Sample Result
1 | employee_id month salary bonus indemnity |
Sample Data
1 | DROP TABLE IF EXISTS Employee; |
Solution
1 | WITH CTE AS |
Q2. Question: Can you compute the average number of subscribers by customer age range using the following age buckets:
Assume we have the following table in a relational DB:
Table SUBSCRIBERS
USER_ID
DATE_OF_BIRTH
NUMBER_OF_SUBSCRIBERS
1 | user_id DOB Subs |
Table USERS
USER_ID
NAME
SURNAME
COUNTRY
Sample Result
1 | * 0-18 y.o. |
Sample Data
1 | import datetime |
1 | DROP TABLE IF EXISTS SUBSCRIBERS; |
Solution
1 | WITH CTE1 AS |
Q3. what’s the order number and the order amount for the order with the highest amount sold?
Assume we have the following three tables:
ORDERS
(order_id
PK,customer_id
,order_date
)ORDER_DETAILS
(order_id
,product_id
,price_each
,qty
)CUSTOMERS
(customer_id
,name
,surname
,address
,city
,country
)
ORDER_DETAILS
1 | order_id product_id price_each qty |
Sample data
1 | import random |
1 | DROP TABLE IF EXISTS ORDER_DETAILS; |
Solution
1 | SELECT |
Q4. what’s the order number and the order amount for the order with the highest amount sold? – get top 10 - 20 orders
Solution
1 | WITH CTE1 AS |
Q5. who are the customers who have not placed any order in February 2018?
Sample Data
1 | import datetime |
1 | DROP TABLE IF EXISTS ORDERS; |
Solution
1 | WITH CTE AS |