The previous article focused on installing dbt in the Mage.ai environment or independently, followed by the initialization of a project named mage_dbt – Dbt Installation (pip/conda) and project initialization. So, we have the mage-ai environment installed, into which we have installed dbt-sqlserver. We then tested that we can see the established file structure of the project. Today, we will look at how to perform configuration and setup of Dbt (including test using dbt debug command).
I tried the setup on OS Windows via Parallels (see the guide) and on a Macbook M1. Unfortunately, on the Macbook, I got stuck on the fact that I couldn’t install the correct ODBC drivers (17,18), and therefore my dbt debug (connection test) kept failing, and after about an hour of trying, I gave up. Everything was fine on Windows.
Dbt file structure
Before we dive into setting up dbt and connecting to the database, let’s look at how dbt is structured. We’ll use the Mage.ai webserver that we created in the first article.
We will focus only on the dbt directory. Other folders are related to mage.ai. We see that in the dbt folder, we have initialized the mage_dbt project and this project contains additional folders, especially:
- dbt_project.yml: This file contains the main configuration of your DBT project, including database connections, profile configurations, and other basic settings. It is a key file for managing the entire project.
- profiles.yml: The file contains configurations for connecting to different databases and environments. It is important for defining connections to databases that will be used in the project.
- models: This folder contains SQL files defining DBT models. Models are the core of our project and process data.
- seeds: This folder can contain input data or files that will be processed by your DBT models. It is a basic source of data for your project.
- others: not significant for now
Today, we are particularly interested in the profiles.yml and dbt_project.yml files, where we will set up our SQL Server database running on localhost. For now, we will not configure the development (dev) and production (prod) environments, I will leave that for another article.
Dbt debug – database setup (profiles.yml) and git
a) Open a terminal or command line, activate our environment (in my case, mage-ai), and enter the path to our dbt project
b) Now, let’s try running a command that tests the configuration of our environment. This is done via dbt debug (for more information, see the documentation). We see that the result is an error – but that’s okay because we have not configured the files yet.
Installation of git for dbt
From the error screenshot above, we see that one of the fail checks is git. So download and install git. After installation, go to the command line and type “git –version”. If the result is the git version, then everything is OK, but if the result is the error below, then we need to fix it and refer to the troubleshooting chapter:
(mage-ai) C:\Users\janzednicek\OneDrive\Prace\Jan_Zednicek\mage-docker\mage-ai\dbt\mage_dbt>git –version
‘git’ is not recognized as an internal or external command,
operable program or batch file.
Troubleshooting – Instalace git selhala – ‘git’ is not recognized as an internal or external command
If the command line does not return the git version, the problem is probably either in a bad installation or the environment does not know the path to the git executable.
In my case, the mage-ai environment does not know the path to git installed on C:\Program Files\Git. The solution is to add to the folder with the environment (in my case C:\Users\janzednicek\anaconda3\envs\mage-ai\etc\conda\activate.d) an activate.bat file that sets the path. So we create an “activate.bat” file and insert:
set "PATH=%PATH%;C:\Program Files\Git\bin"
. Then the console should return the git version.
If we run dbt debug, we see that we managed to eliminate one error
Configuration of profiles.yml and setting the path to the file
We still need to set the configuration files for the database.
Step 1 – location of the profiles.yml file – dbt expects after installation that the profiles.yml file will be stored in the location C:\Users(your username).dbt but you don’t have to have it like that
So at the first step, you have a choice
- either create a .dbt folder in that location, or
- set another exception in our activate.bat file from the previous step
I would like to have the profiles.yml file stored in the file with my mage.ai project. So I’m adding another line to activate.bat: set "DBT_PROFILES_DIR=C:\Users\janzednicek\OneDrive\Prace\Jan_Zednicek\mage-docker\mage-ai\dbt"
Step 2 – configuration of profiles.yml – for me, the target database will be SQL server and I will be logging in using SQL authentication, so I edit the file and fill it in as below.
If you are interested in what configuration options can be done through dbt, here is a link to the documentation. You will find there how to connect via windows auth or AAD, for example.
Step 3 – configuration of dbt_project.yml – there is not much to set up here, it is important to ensure that this file refers to the correct profile from the previous step. In my case, to mage_dbt.
Testing dbt after installation and the first launch of dbt run
I hope you were lucky and we all got to the point where the dbt debug command returns everything green:
When we look into the models folder, we have 2 test scripts that we could immediately run as part of dbt and thus test the whole solution
We start the Dbt project using the command “dbt run” – if we have development and production environments, we add the –target parameter to tell dbt which environment to run over. But we only have one environment, so it doesn’t matter.
Everything looks fine, and the data has arrived in the database.