In applications or complex enterprise information architectures, there is often a need to handle historical records and track changes in values for various reasons. Most of the time, it’s for security purposes; we need to monitor who is making changes and be able to correct errors, whether they are from users or database administrators. Another good reason for implementing CDC (Change Data Capture) is when we need to later integrate data from the application into a data warehouse or another application and need to identify the changed records for migration. With the Change Data Capture (CDC) feature, we can set up change tracking for a specific table in the database.

Setting up Change Data Capture (CDC) – Tracking Changes in a Table

We will configure CDC in the AdventureworksDW2016CTP3 database (a sample database from Microsoft) in a few steps. We want to track all changes in the DimCustomer table.

The first step is to enable the change data capture feature for the database:

(1.a) In the sys.databases system table, there is an indicator to check which databases have the CDC feature enabled/disabled. In this case, CDC is disabled everywhere.

change data capture nad databazi

(1.b) Enabling CDC for the database is done using the system procedure “sys.sp_cdc_enable_db“.

No, you see, I made my first MISTAKE. CDC does not work with the Express edition, which is an important piece of information I didn’t mention at the beginning. So, I need to switch to the Developer Edition :)))

This instance of SQL Server is the Express Edition (64-bit). Change data capture is only available in the Enterprise, Developer, Enterprise Evaluation, and Standard editions.

Error CDC

CDC will work with Developer Edition or higher editions, but not with Express Edition.

Zapnutí Change data capture (CDC) nad databází

(1.c) After enabling CDC for the database, a new schema “cdc” and several system tables for logging changes are created.

Change data capture systémové tabulky

Enabling Change Data Capture for a Table

(2.a) To track changes in the table, we enable change data capture using the system procedure named “sys.sp_cdc_enable_table”

USE AdventureworksDW2016CTP3
GO
EXEC sys.sp_cdc_enable_table
@source_schema = N‘dbo’, –schema name
@source_name   = N‘DimCustomer’, –table name
@role_name     = NULL
GO
Zapnutí Change data capture (CDC) nad tabulkou
(2.b) After enabling CDC for the DimCustomer table:
  • Two jobs are created to monitor changes.
  • A new table named dbo_DimCustomer_CT is created in the system tables, which will contain the complete history of changes in that table. This table includes a complete list of columns and several additional system columns used to work with the history of the table. The most important ones are:
    • __$start_lsn – Transaction identifier within which the change was made. If multiple changes occur within a transaction, the change records will all have the same __$start_lsn.
    • __$operation – The operation that occurred.
      • DELETE = 1
      • INSERT = 2
      • Value before UPDATE = 3
      • Value after UPDATE = 4

change data capture sledování změn

Extracting Information from Change Data Capture

We make some changes in the table where CDC is applied:

  • 3x UPDATE
  • 1x DELETE

Simulace změn v tabulce pod change data capture

How does the dbo_DimCustomer_CT table with change tracking look? (I’ve selected only the columns that I changed. You can see that we have a complete history of what happened. From the log, you can see that all UPDATE operations (first 6 records) were part of one transaction, and I performed the DELETE operation separately. For UPDATE operations, we have both the original and the new values in the log.

CDC záznam změn sloupců z logu

If you want a timestamp context, you can modify the script slightly. There is a mapping table in the system tables between __$start_lsn and the timestamp context.

CDC záznam z logu - časový kontext

The CDC feature has a lot of capabilities, more than I can describe in one article. Hopefully, I’ll continue this discussion sometime in the future.

Rate this post

Ing. Jan Zedníček - Data Engineer & Controlling

My name is Jan Zedníček and I have been working as a freelancer for many companies for more than 10 years. I used to work as a financial controller, analyst and manager at many different companies in field of banking and manufacturing. When I am not at work, I like playing volleyball, chess, doing a workout in the gym.

🔥 If you found this article helpful, please share it or mention me on your website

Leave a Reply

Your email address will not be published. Required fields are marked *