Dataset

https://www.kaggle.com/datasets/tunguz/online-retail

Column Description
InvoiceNo Invoice number. Nominal, a 6-digit integral number uniquely assigned to each transaction. If this code starts with the letter 'C', it indicates a cancellation.
StockCode Product (item) code. Nominal, a 5-digit integral number uniquely assigned to each distinct product.
Description Product (item) name. Nominal.
Quantity The quantities of each product (item) per transaction. Numeric.
InvoiceDate Invoice date and time. Numeric, the day and time when each transaction was generated.
UnitPrice Unit price. Numeric, product price per unit in pounds sterling.
CustomerID Customer number. Nominal, a 5-digit integral number uniquely assigned to each customer.
Country Country name. Nominal, the name of the country where each customer resides.
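
To make the column descriptions concrete, here is a minimal sketch of how one row could be represented in Python. The sample rows, the InvoiceLine class, and the parse_lines helper are illustrative assumptions, not part of the dataset or of this project. InvoiceDate is kept as raw text because its exact format depends on the downloaded file.

```python
import csv
import io
from dataclasses import dataclass

# Made-up sample rows that follow the column layout described above.
SAMPLE_CSV = """InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
536365,71053,WHITE METAL LANTERN,6,12/1/10 8:26,3.39,17850,United Kingdom
C536379,22222,EXAMPLE RETURNED ITEM,-1,12/1/10 9:41,27.50,14527,United Kingdom
"""


@dataclass
class InvoiceLine:
    """One product line of an invoice (hypothetical helper type)."""

    invoice_no: str    # kept as text because cancellations carry a letter prefix
    stock_code: str
    description: str
    quantity: int
    invoice_date: str  # raw text; parse it once the file's date format is known
    unit_price: float  # price per unit in pounds sterling
    customer_id: str   # kept as text so an empty value does not break parsing
    country: str

    @property
    def is_cancellation(self) -> bool:
        # Accept either case of the prefix letter.
        return self.invoice_no.upper().startswith("C")

    @property
    def line_total(self) -> float:
        return self.quantity * self.unit_price


def parse_lines(csv_text: str) -> list[InvoiceLine]:
    """Parse CSV text with the eight documented columns into InvoiceLine objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        InvoiceLine(
            invoice_no=row["InvoiceNo"],
            stock_code=row["StockCode"],
            description=row["Description"],
            quantity=int(row["Quantity"]),
            invoice_date=row["InvoiceDate"],
            unit_price=float(row["UnitPrice"]),
            customer_id=row["CustomerID"],
            country=row["Country"],
        )
        for row in reader
    ]


if __name__ == "__main__":
    for line in parse_lines(SAMPLE_CSV):
        print(line.invoice_no, line.is_cancellation, round(line.line_total, 2))
```

Keeping InvoiceNo and CustomerID as strings is a deliberate choice in this sketch: both are nominal identifiers, so no arithmetic on them is meaningful.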

Pipeline

[Image: Screenshot 2023-07-13 at 16.41.19.png]

Data modeling

[Image: Screenshot 2023-07-13 at 16.59.35.png]

Pipeline

Learn Airflow with this 80% off coupon: https://www.udemy.com/course/the-complete-hands-on-course-to-master-apache-airflow/?couponCode=AIRFLOWRETAIL

Prerequisites

Steps


IMPORTANT!

Open the Dockerfile and make sure it uses quay.io/astronomer/astro-runtime:8.8.0 (or Airflow 2.6.1). If it does not, switch to that version and restart Airflow (astro dev restart with the Astro CLI).
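
If you prefer to check the version with a script rather than by eye, the following is a minimal sketch. The helper names base_images and uses_expected_runtime are hypothetical and not part of the Astro CLI or Airflow. It only inspects FROM lines and assumes the image is written literally, not through a build argument.

```python
EXPECTED_IMAGE = "quay.io/astronomer/astro-runtime:8.8.0"


def base_images(dockerfile_text: str) -> list[str]:
    """Return the image named on each FROM line of a Dockerfile."""
    images = []
    for line in dockerfile_text.splitlines():
        tokens = line.split()
        if not tokens or tokens[0].upper() != "FROM":
            continue
        # Skip options such as --platform=... that may precede the image name.
        names = [tok for tok in tokens[1:] if not tok.startswith("--")]
        if names:
            images.append(names[0])
    return images


def uses_expected_runtime(dockerfile_text: str) -> bool:
    """True if any FROM line references the expected Astro Runtime image."""
    return EXPECTED_IMAGE in base_images(dockerfile_text)


if __name__ == "__main__":
    from pathlib import Path

    text = Path("Dockerfile").read_text()
    if uses_expected_runtime(text):
        print("Dockerfile uses the expected runtime image.")
    else:
        print("Found:", base_images(text), "- expected:", EXPECTED_IMAGE)
```

Run it from the project root, next to the Dockerfile. If it reports a different image, update the FROM line and run astro dev restart.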