Category Archives: Dbt – Data Build Tool

dbt is an open-source data engineering tool for transforming data inside data warehouse/datamart environments. It also supports historization (tracking changes to a table over time) and orchestration, that is, managing dependencies between scripts.

Latest articles from this category:


Example of a dbt GitHub Project – Data Warehouse/Datamart
I have a public project available on GitHub. You can find the project and its description on the repository page: Sample Data Warehouse | Datamart (Data Engineering project). It is a dbt project designed to transform source system data into a star schema of fact and dimension tables. The SQL code can be found in the directory model/marts/sales.

Fabric | dbt – Configuration profiles.yml for SPN Authentication to SQL Endpoint

This article describes how to configure dbt's profiles.yml file to connect to the Microsoft Fabric SQL Endpoint using Service Principal (SPN) authentication (we covered SPN in the article Fabric | dbt – Azure Service Principal (SPN) and RBAC for dbt in Entra ID). The goal is to… Read More »

Fabric | dbt – Docker dbt and Azure Container Apps (CI/CD)

For the cloud-based Warehouse built on top of MS Fabric, we already have a prepared Lakehouse and DWH environment and, among other things, a configured dbt project. Now comes an important DataOps phase, where we need to think about two questions: From which environment (ideally serverless) will we batch-run the dbt project in the future? How to implement… Read More »

Fabric | dbt – Creating a Fabric Lakehouse/Data Warehouse and Configuration

Microsoft Fabric is a unified data platform that connects various artifacts for developing data solutions, analytics, and BI in a single integrated environment. One of its key components is the Fabric Lakehouse, which combines the advantages of a Data Lake (scalability, low storage cost) and a Data Warehouse (structured approach, SQL query support). This article describes… Read More »

Fabric | dbt – Architecture and the Role of dbt in the Medallion Architecture

Microsoft Fabric represents a unified SaaS platform that integrates components of the so-called modern data warehouse. Within a single platform, it is possible to handle storage through artifacts (Lakehouse/DWH), computing resources (Spark/Polaris), and tools for data flow orchestration. The Fabric architecture also provides tools for the transformation layer (e.g., Spark), which can be written and… Read More »

ETL | Dbt file structure and dbt_project.yml configuration

In Dbt (data build tool), files are organized in a logical structure defined by the file/folder layout and the configuration in dbt_project.yml. To navigate the project effectively as the codebase grows, it is a good idea to establish a system for organizing the files. File Structure of a Dbt Project: After initializing an empty Dbt project, the directory structure… Read More »

ETL | Dbt core with Snowflake – Configuration and dbt debug

Is Dbt compatible with Snowflake? Definitely! And if you combine it with an ETL/orchestration tool like Keboola (cloud) or Mage.ai (on-premises), you've got yourself a decent data solution. Nowadays, most ETL frameworks (at least the better ones) integrate with dbt. Local Configuration of Dbt and Snowflake: In this tutorial, we assume that dbt is installed… Read More »

ETL | Mage.ai – Dbt Installation (pip/conda) and project initialization

In the previous article, ETL | Mage.ai – Solid Alternative to Airflow – Intro and Installation, we introduced the ETL tool Mage.ai as a lighter alternative to Apache Airflow. We demonstrated how to get the framework up and running through the terminal and learned that after installation, it runs on localhost:6790/. I promised in… Read More »

ETL | Mage.ai Docker Installation – dbt-sqlserver – Dbt Debug Error, Fix

Today, I attempted to install Mage.ai via Docker as part of getting familiar with the tool. This is currently (as of 2024-01-26) the only scenario for running Dbt natively together with Mage.ai within pipelines. Of course, it is still possible to run a dbt model using custom Python code (if you use a pip/conda-installed Mage)… Read More »

ETL | Dbt debug – Configuration and testing of SQL Server database (profiles.yml) – Windows

The previous article, Dbt Installation (pip/conda) and project initialization, focused on installing dbt in the Mage.ai environment or independently, followed by the initialization of a project named mage_dbt. So, we have the mage-ai environment installed, into which we have installed dbt-sqlserver. We then tested that we can see the established file structure of the… Read More »