For generating IDs (primary keys) in application databases or surrogate keys in data warehouses, there are several approaches. GUIDs or numeric IDs are mostly used. GUIDs are mainly used in application databases and are generated, for example, using the NEWID() function.

This has its advantages and disadvantages. The biggest disadvantage is poor SQL query performance because generating GUIDs and, in general, table joins are slower compared to numerical data types. The solution can be the use of a sequence – ID generator.

If we want to generate numerical IDs, we have several ways to do it:

  • Creating a table column as IDENTITY(1,1)
  • Generating IDs using some function, e.g., ROW_NUMBER()
  • Generating IDs using SEQUENCE

Sequences are one of the ways to generate IDs in a database table. This article will focus on how to create and use sequences. We will also look at how sequences perform compared to other methods of generating numerical keys.

Syntax for Creating a SQL Sequence

When creating a sequence, you specify the name, data type, the number from which the sequence starts, increment, minimum/maximum sequence value, and optionally CACHE with a specified size (or NO CACHE).

CREATE SEQUENCE [dbo].[Sequence_Name] AS [INT] START WITH 1 INCREMENT BY 1 MINVALUE 1 MAXVALUE 100000 CACHE 1000

Calling a Sequence and Getting the Next Value

After creating a sequence, you can call it to retrieve sequential values from the sequence.

SELECT NEXT VALUE FOR [dbo].[Sequence_Name];

The first query will return 1 (the sequence starts from 1). The second SQL query would return 2 (1 has already been served, and the increment is 1), and so on. If we reach the MAXVALUE limit of 100000, we will get an error message: “The sequence object ‘Sequence_Name’ has reached its minimum or maximum value. Restart the sequence object to allow new values to be generated“.

Sequences can be called in combination with the FROM clause, where we can number existing records in a table. At the same time, we can consider sorting when generating (in this case, numbering will start from the lowest date).

SELECT   NEXT VALUE FOR [dbo].[Sequence_Name] OVER (ORDER BY [Datum]) AS [ID]   ,[Column] FROM [Tabulka]

Restarting a SQL Sequence

You can restart a sequence either entirely or set it back to a specific value.

Restarting the entire Sequence:

ALTER SEQUENCE [dbo].[Sequence_Name] RESTART;
SELECT NEXT VALUE FOR [dbo].[Sequence_Name];

The result will be 1

Setting a new starting value:

ALTER SEQUENCE [dbo].[Sequence_Name] RESTART WITH 151;
SELECT NEXT VALUE FOR [dbo].[Sequence_Name];

The result will be 151

Comparison of SEQUENCE, IDENTITY, and ROW_NUMBER(), CACHE Performance Problem

Below, you can see the performance of different methods of ID generation. I tested ID generation into a table containing 400,000 records (INT numbers), and the IDs were assigned after sorting by ORDER BY. The result is in seconds, and it is a very rough test. Results may vary depending on the size and complexity of the tables. However, it is sufficient for demonstrating what I want to show.

Insights:

  1. According to the results, generating IDs using ROW_NUMBER OVER (ORDER BY) is the fastest. However, I would like to note that this method may not perform well with large tables. On the other hand, it works well with very simple tables or queries where complex ORDER BY operations are not used.
  2. IDENTITY will generally perform well everywhere, especially if you don’t need to worry about key generation performance.
  3. With SEQUENCE, you can gain a lot, but also lose a lot. How SEQUENCE performs depends on the CACHE Size setting. It can be seen that Cache is mandatory when using sequences, especially when numbering a larger number of records within a single transaction.
  4. CACHE needs to be set with a Size argument (and the size needs to be appropriately determined) because if CACHE is created without it, the sequence behaves as if CACHE were not set at all.

In conclusion, SEQUENCE is a great way to generate keys when using CACHE, and it is necessary to set CACHE with an adequate Size. This is especially true when you want to generate keys for a large number of records at once.

5/5 - (1 vote)

Ing. Jan Zedníček - Data Engineer & Controlling

My name is Jan Zedníček and I have been working as a freelancer for many companies for more than 10 years. I used to work as a financial controller, analyst and manager at many different companies in field of banking and manufacturing. When I am not at work, I like playing volleyball, chess, doing a workout in the gym.

🔥 If you found this article helpful, please share it or mention me on your website

Leave a Reply

Your email address will not be published. Required fields are marked *