SQL Server is not just a platform for storing and working with data or databases. It is a suite of services that includes features such as:
- Reporting Services (SSRS) – report creation and storage, including a report server.
- Analysis Services (SSAS) – OLAP cubes.
- Master Data Services (MDS) – management of important data, such as various code lists.
- Data Quality Services (DQS) – includes tools for checking and correcting inconsistent data before storing it.
- Integration Services (SSIS) – a tool that allows the creation of data pumps for transferring data between systems/platforms using SQL Server Data Tools. These data pumps can be saved as packages and deployed on SQL Server, where they can be run automatically (e.g., via SQL Server Agent).
This article focuses on the last mentioned service, SSIS, and addresses the issue of how to set up an automated process that will download data from Excel to a SQL Server data warehouse every day.
What Is Connection Manager in Integration Services (SSIS)?
To move data from point A (e.g., a database on a server or an Excel file) to point B (the target SQL database), you first need to create a connection. This allows us to transfer data.
The connection manager area, which contains a list of defined connections to various sources or target destinations, is shown in the image below. Since this is a new project and SSIS package, no connections have been defined yet.
After right-clicking in the connection manager area, you need to select the type of data source to connect to.
The first five items in the menu include the most commonly used connection types, and selecting “New Connection” will display all possible connection types.
Typically, you will want to connect to a SQL Server database (or another platform) most frequently using OLE DB Connection, or to a text file containing data separated by a delimiter using File connection. Other options include connecting to Excel using Excel connection manager and other connection types (FTP, ODBC, SMTP, HTTP, WMI connection managers).
Example – Creating a Connection Manager for an Excel File and Regularly Downloading Data to a SQL Server Database
Let’s assume you have an Excel file located at C:\Source_Data\Source_data.xls. This file contains some data that you want to download into a SQL table every day because the source Excel file is updated multiple times a day. Before downloading data, you will always clear the target table using a SQL TRUNCATE TABLE statement. The Excel file looks like this.
First, create a table on SQL Server, for example:
CREATE TABLE Temp.dbo.Priklad (
Datum DATE NULL,
Hodnota INT NULL
2.a) From the menu above the connection managers area, select “New Connection” and then “EXCEL.”
2.b) Next, set the path to the source Excel file, and a new connection type will appear in the list.
The connection to the source file is now ready, and in a similar way, you will set up a connection to the target SQL Server database, “Temp.” This is done in the same manner, but this time you will select “OLE DB connection” as the connection type and configure the server and database.
Now it’s time to design the data flow. SSIS is highly modular and consists of individual steps. Since we want to clear the target table before loading data from Excel, the first step will be an “Execute SQL Task.” From the SSIS toolbox, select the corresponding task and drag it into the control flow area.
After dragging, you will see a red “X,” indicating that something is not configured correctly. This is because the task does not yet have a connection to the connection manager. You can configure it by editing the task and specifying the Connection Manager and the SQL statement to execute on the database, which in our case is
TRUNCATE TABLE Temp.dbo.Example
Click on the data flow task and configure it as shown below. From the SSIS toolbox, select “Excel Source” and “OLE DB Destination” as the source and destination, respectively. Configure the components to use the connection managers and connect them with arrows. You’re done.
When you execute the entire package, you should see the following:
…if you check the target table, you should find the data from the Excel file:
The data pump can be executed multiple times, as it always clears the data at the beginning. To run this prepared SSIS package automatically, you have two options:
- Save the package and the .dtsx file somewhere on the server.
- Deploy the entire package to the server’s catalog using the “deploy” command.
Then, create a job in SQL Server Agent and set the “Step Type” to “SQL Server Integration Services package” with the path to the .dtsx file. In the last step of the job, specify the frequency and time for execution, and you no longer need to worry about it.