Retail analytics sample database
Install the PostgreSQL-compatible Retail Analytics dataset on the YugabyteDB distributed SQL database.
You can install and use the Retail Analytics sample database using:
- A local installation of YugabyteDB. To install YugabyteDB, refer to Quick Start.
- Using cloud shell or a client shell to connect to a cluster in Yugabyte Cloud. Refer to Connect to clusters in Yugabyte Cloud. To get started with Yugabyte Cloud, refer to Quick Start.
In either case, you use the YugabyteDB SQL shell (ysqlsh) CLI to interact with YugabyteDB using YSQL.
About the Retail Analytics database
The Retail Analytics dataset includes sample data in the following tables:
- Products: Product information
- Users: Customers who have bought products
- Orders: Orders made by customers
- Reviews: Product reviews
Install the Retail Analytics sample database
The Retail Analytics SQL scripts reside in the share
folder of your YugabyteDB or client shell installation. They can also be found in the sample
directory of the YugabyteDB GitHub repository. The following files will be used for this exercise:
schema.sql
— Creates the tables and constraintsorders.sql
— Loads product ordersproducts.sql
— Loads productsreviews.sql
— Loads product reviewsusers.sql
— Loads customer information
Follow the steps here to install the Retail Analytics sample database.
Open the YSQL shell
If you are using a local installation of YugabyteDB, run the ysqlsh
command from the yugabyte
root directory.
$ ./bin/ysqlsh
If you are connecting to Yugabyte Cloud, open the ysqlsh cloud shell, or run the YSQL connection string for your cluster from the yugabyte-client
bin directory.
Create a database
You can do this as shown below.
yugabyte=# CREATE DATABASE yb_demo;
yugabyte=# GRANT ALL ON DATABASE yb_demo to yugabyte;
yugabyte=# \c yb_demo;
Load data
First create the four tables necessary to store the data.
yb_demo=# \i share/schema.sql;
Now load the data into the tables.
\i share/products.sql;
\i share/users.sql;
\i share/orders.sql;
\i share/reviews.sql;
Explore the Retail Analytics database
How are users signing up for my site?
yb_demo=# SELECT DISTINCT(source) FROM users;
source
-----------
Facebook
Twitter
Organic
Affiliate
Google
(5 rows)
What is the most effective channel for user signups?
yb_demo=# SELECT source, count(*) AS num_user_signups
FROM users
GROUP BY source
ORDER BY num_user_signups DESC;
source | num_user_signups
-----------+------------------
Facebook | 512
Affiliate | 506
Google | 503
Twitter | 495
Organic | 484
(5 rows)
What are the most effective channels for product sales by revenue?
yb_demo=# SELECT source, ROUND(SUM(orders.total)) AS total_sales
FROM users, orders WHERE users.id=orders.user_id
GROUP BY source
ORDER BY total_sales DESC;
source | total_sales
-----------+-------------
Facebook | 333454
Google | 325184
Organic | 319637
Twitter | 319449
Affiliate | 297605
(5 rows)
What is the min, max and average price of products in the store?
yb_demo=# SELECT MIN(price), MAX(price), AVG(price) FROM products;
min | max | avg
------------------+------------------+------------------
15.6919436739704 | 98.8193368436819 | 55.7463996679207
(1 row)
What percentage of the total sales is from the Facebook channel?
You can do this as shown below.
yb_demo=# CREATE VIEW channel AS
(SELECT source, ROUND(SUM(orders.total)) AS total_sales
FROM users, orders
WHERE users.id=orders.user_id
GROUP BY source
ORDER BY total_sales DESC);
Now that the view is created, you can see it in our list of relations.
yb_demo=# \d
List of relations
Schema | Name | Type | Owner
--------+----------+-------+----------
public | channel | view | postgres
public | orders | table | postgres
public | products | table | postgres
public | reviews | table | postgres
public | users | table | postgres
(5 rows)
yb_demo=# SELECT source, total_sales * 100.0 / (SELECT SUM(total_sales) FROM channel) AS percent_sales
FROM channel WHERE source='Facebook';
source | percent_sales
----------+------------------
Facebook | 20.9018954710909
(1 row)