Fabric | dbt – Architecture and the Role of dbt in the Medallion Architecture

Microsoft Fabric 1 represents a unified SaaS platform that integrates components of the so-called modern data warehouse. Within a single platform, it is possible to handle storage through artifacts (Lakehouse/DWH), computing resources (Spark/Polaris), and tools for data flow orchestration. The Fabric architecture also provides tools for the transformation layer (e.g., Spark), which can be written and then orchestrated directly within Fabric itself.

Sometimes, however, we don’t want to get locked into a vendor-specific tool and prefer a modular solution. Critical parts such as transformations and source data should be kept externally (ADLS, dbt) to allow easier migration elsewhere (for example, if pricing becomes an issue). In this sense, dbt offers a number of advantages (not only because it’s open source), and within my series of articles about Fabric, I’ve chosen this tool specifically for the silver and gold medallions. Therefore, I’ll explain it a bit more in detail.

The Role of dbt in Fabric

dbt (Data Build Tool) serves as the transformation core in the ELT model (Extract, Load, Transform). Instead of moving data (which is the job of Data Factory), dbt focuses exclusively on the T – Transform part. dbt applies data engineering principles—versioning, modularization, and testing. This ensures the robustness of the entire transformation process and its easy maintenance.

In my project, we use dbt to manage data flow through the Medallion Architecture (Bronze, Silver, Gold). Bronze in Fabric is represented as a Lakehouse with shortcuts to Azure Data Lake Storage Gen 2 2 parquet files, followed by materialization into Delta tables. These Delta tables in the Lakehouse serve as our data source, and dbt then takes over for the logic and computation of the silver and gold layers.

I’ve previously written about dbt in the dbt category – there’s a basic introduction on how to set it up, configure connections, and more.

Medallion Architecture: Logical Data Separation

The Medallion pattern is a recommended design standard for Lakehouse systems. It ensures atomicity, consistency, isolation, and durability (ACID) of data at all levels—these are the fundamental prerequisites for a functional data solution. Each layer has a clearly defined responsibility and minimum data quality requirements.

Layer Purpose and Format Role of dbt and Materialization Key Quality Principles
BRONZE (Raw data) Raw data, Parquet / Delta dbt Source Immutability, Append-Only. Ensuring metadata and freshness.
SILVER (Validated, deduplicated) Cleaned, Delta tables View or Table / Incremental Deduplication, Standardization, SCD Type 2 (history tracking).
GOLD (Curated, modeled) Aggregated, Star Schema Table / Incremental Business Logic, KPIs, optimization for BI reporting.

Small note: these are just new words for something that has existed for a long time – for experienced data engineers, it’s nothing more than fancy buzzwords for a) Landing (Bronze), b) Staging (Silver), c) Semantic model/Data mart (Gold).

dbt as the Transformation Engine in Fabric

For efficient execution of dbt transformation models, it’s essential to understand the interaction between dbt and the Fabric compute environment.

Connectivity, Performance, and Security (SQL Endpoint)

  • dbt Adapter: To connect to Fabric, the dbt-fabric adapter (derived from dbt-spark) is used. dbt doesn’t connect directly to a Spark cluster but instead uses the Lakehouse SQL Endpoint to send SQL queries.
  • Polaris Engine: SQL queries generated by dbt are processed by the Polaris engine (a high-performance engine under Fabric). This architecture allows dbt to execute robust operations such as CREATE TABLE AS SELECT or complex SQL statements optimized for the Delta format.
  • SPN Authentication: As we’ll see in later parts, our production environment requires authentication using a Service Principal (SPN) instead of a username and password. This is critical for CI/CD pipelines and security. It will be configured in the dbt configuration files.

Modularization and DAG (Directed Acyclic Graph)

The core methodology and biggest strength of dbt is the modularization of transformations. Each transformation step within the Medallion Architecture is an individual SQL file (model).

  • Dependencies: Dependencies between models (e.g., a Silver model depending on a Bronze model, or interdependencies within Gold tables) are defined using the JINJA macro {{ ref(‘model_name’) }} as shown in the image below.
  • Building the DAG: dbt automatically analyzes these references and builds a DAG – a directed acyclic graph. When executed (dbt run), dbt ensures that models are run in the correct topological order. This removes the need to write complex data flow logic as in traditional ETL tools. This way, we can run the entire model and all dependencies with a single terminal command, and dependencies resolve automatically. We just need to monitor the number of cores allowed for dbt to use for parallelization on the target compute engine—configured in the settings.

dbg-dag

Managing Incrementality and History Tracking

One of the biggest technical benefits of dbt in Fabric OneLake is its efficient handling of data changes:

  • Incremental Models: In the Silver and Gold layers, dbt allows defining models as materialized=’incremental’. Here dbt uses the logic of the is_incremental() macro to limit reads from the source to only new or changed data. This drastically reduces compute time and Fabric Capacity (CU) consumption.
  • History Tracking (SCD Type 2): To track changes in key dimensions (e.g., customer, product), dbt uses the concept of Snapshots. The Snapshot functionality generates tables that record a validity interval (dbt_valid_from and dbt_valid_to) for each record, which forms the core of the SCD Type 2 implementation for dimension tables. For large fact tables, we’ll use a different approach, since the snapshot method copies entire source Delta tables, which we want to avoid.

Conclusion

The Medallion Architecture combined with the dbt methodology provides a robust foundation for our project working with the sample data AdventureWorks. By using Fabric as a powerful compute and storage backend and dbt as an agile transformation frontend tool, we ensure that our data is reliable, testable, and ready for business reporting.

Rate this post

Reference

  1. Microsoft documentation, Microsoft Fabric [online]. [cited 2025-10-21]. Available from WWW: https://www.microsoft.com/en-us/microsoft-fabric
  2. Microsoft documentation, Azure Data Lake Storage Gen2 [online]. [cited 2025-10-21]. Available from WWW: https://learn.microsoft.com/cs-cz/power-query/connectors/data-lake-storage
Category: Dbt - Data Build Tool Fabric

About Ing. Jan Zedníček - Data Engineer & Controlling

My name is Jan Zednicek, and I have been working as a freelance Data Engineer for roughly 10 years. During this time, I have been publishing case studies and technical guides on this website, targeting professionals, students, and enthusiasts interested in Data Engineering particularly on Microsoft technologies as well as corporate finance and reporting solutions. 🔥 If you found this article helpful, please share it or mention me on your website or Community forum

Leave a Reply

Your email address will not be published. Required fields are marked *