• 20. 1. 2024
  • Ing. Jan Zedníček - Data Engineer & Controlling
  • 0

As a business intelligence developer experimenting with ChatGPT, I explore various scenarios that could assist me in my work. These scenarios certainly include assistance in writing code (such as in Python), reverse engineering code, detailed planning of the breakdown of an epic/story/project into JIRA, or perhaps financial analysis of a company? Yes, for these scenarios, I often use ChatGPT. Today, I would like to test how ChatGPT handles the design of a data warehouse. I would approach the application in a similar manner.

ChatGPT Data Warehouse or Database Model

My expectation would not be to receive a finished model, but if I were to use ChatGPT for this scenario, I would rather aim to save time. This means I expect some usable draft of the model, which I can then develop further. For example, in the case of a date dimension, I can have a script created to calculate attributes.

So, the assignment will be to create a simple data model in a star schema with 1 fact table and several dimension tables (facts and dimensions).

  • The fact table for us will be a table with orders
  • Dimensions for us will be
    • Order date
    • Product
    • Order status
    • Customer

We also want the model to be designed so that we use the slowly changing dimension type 2 approach (SCD2) – i.e., attribute historization.

Thus, the prompt looks like this:

chat-gpt-datovy-model

It answered something like this:

  • To design a data warehouse model with the specified specifications, we will follow the star schema approach. The star schema is a popular approach to data modeling used in data warehouses, where a central fact table is connected to various dimension tables. Given that you have specified the slowly changing dimension type 2 (SCD2) approach for dimension tables, we will include columns for tracking historical changes.

Here are the SQL scripts to create the tables:

CREATE TABLE F_Orders (
OrderID INT IDENTITY(1,1) PRIMARY KEY,
DateKey INT NOT NULL,
CustomerKey INT NOT NULL,
ProductKey INT NOT NULL,
OrderStatusKey INT NOT NULL,
Quantity INT,
TotalAmount DECIMAL(12, 2),
— Additional order-related columns
— Foreign keys will be defined after the creation of dimension tables
);

CREATE TABLE D_Date (
DateKey INT IDENTITY(1,1) PRIMARY KEY,
Date DATE NOT NULL,
Year SMALLINT,
Quarter SMALLINT,
Month SMALLINT,
Day SMALLINT,
WeekOfYear SMALLINT,
— Additional date-related columns
);

CREATE TABLE D_Customer (
CustomerKey INT IDENTITY(1,1) PRIMARY KEY,
CustomerID INT,
FirstName NVARCHAR(50),
LastName NVARCHAR(50),
Email NVARCHAR(100),
Phone NVARCHAR(20),
ValidFrom DATETIME,
ValidTo DATETIME,
IsCurrent BIT,
— Additional customer-related columns
);

CREATE TABLE D_Product (
ProductKey INT IDENTITY(1,1) PRIMARY KEY,
ProductID INT,
ProductName NVARCHAR(100),
Category NVARCHAR(50),
Price DECIMAL(10,2),
ValidFrom DATETIME,
ValidTo DATETIME,
IsCurrent BIT,
— Additional product-related columns
);

CREATE TABLE D_OrderStatus (
OrderStatusKey INT IDENTITY(1,1) PRIMARY KEY,
StatusDescription NVARCHAR(100)
— Additional status-related columns
);

OK, that doesn’t look too bad. The tables contain an artificial primary key. They also contain, as is proper and customary, a business key (from the primary system) – which we did not even define in the prompt. The fact table contains surrogate keys. The dimensions in the design contain columns through which we can work with slowly changing dimensions.

Let’s also have the procedure for calculating the date dimension printed:

chatgpt-napocet-datumove-dimenze

Based on this test, I think ChatGPT may have significant potential for tasks of this type.

Rate this post

Ing. Jan Zedníček - Data Engineer & Controlling

My name is Jan Zedníček and I have been working as a freelancer for many companies for more than 10 years. I used to work as a financial controller, analyst and manager at many different companies in field of banking and manufacturing. When I am not at work, I like playing volleyball, chess, doing a workout in the gym.

🔥 If you found this article helpful, please share it or mention me on your website

Leave a Reply

Your email address will not be published. Required fields are marked *