In Keboola, components are the building blocks of a Data Flow (pipeline). They are the elements of the ETL/ELT process that connect to data sources or data destinations and let us transfer data between them. Thanks to democratization tools like Keboola, you do not need to know how to program: the components are already prepared in the application, and you only need to fill in your login credentials.
In the previous article, I introduced Keboola and showed what the application looks like – ETL | Keboola Free (Free) – Project Creation, Basics. Before we start building our first Flow in Keboola, let’s take a closer look at the components that make up a Flow.
Keboola Components – Introduction, Free Plan Limitations
Components in Keboola fall into three categories:
- Source components – They connect to the source system, read data based on component settings, and pass it further in the pipeline.
- Destination components – They take data from the previous block (e.g., source component or a transformation block) and store it in the destination system (e.g., data warehouse or storage).
- Application components – These are special types of components that perform various activities. These components are either created by Keboola or can be created by third parties.
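To make the three roles concrete, here is a minimal Python sketch of a flow. It is not Keboola's API: the function names (`read_orders`, `keep_large_orders`, `write_to_warehouse`, `run_flow`) and the sample rows are hypothetical and only illustrate how data passes from a source through an intermediate step to a destination.

```python
from typing import Callable

Row = dict[str, int]

# In-memory stand-in for a destination system (hypothetical).
warehouse: list[Row] = []


def read_orders() -> list[Row]:
    """Source role: read rows from the source system (hard-coded here)."""
    return [
        {"id": 1, "amount": 10},
        {"id": 2, "amount": 25},
        {"id": 3, "amount": 40},
    ]


def keep_large_orders(rows: list[Row]) -> list[Row]:
    """Intermediate step (e.g. a transformation): keep orders of 20 or more."""
    return [row for row in rows if row["amount"] >= 20]


def write_to_warehouse(rows: list[Row]) -> None:
    """Destination role: store the rows it receives in the destination system."""
    warehouse.extend(rows)


def run_flow(
    source: Callable[[], list[Row]],
    steps: list[Callable[[list[Row]], list[Row]]],
    destination: Callable[[list[Row]], None],
) -> None:
    """Pass data from the source through each step to the destination."""
    rows = source()
    for step in steps:
        rows = step(rows)
    destination(rows)


run_flow(read_orders, [keep_large_orders], write_to_warehouse)
print(warehouse)
```

Because each block only agrees on the shape of the rows it receives and returns, the same source or destination function could be reused in another flow, which mirrors how configured components are reused across Flows.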
You can find connectors in the “Components” section, which lists the components you have already created and configured. These components are used in Flows and can be reused across different Flows as needed. Keboola also provides documentation for its components.
The number of components available to paying customers is higher than in the free plan. In the Keboola free plan, there are over 200 different connectors available (as of February 6, 2024), which is sufficient for the vast majority of scenarios.
List of Components in Keboola, Categorization, and Count
When you click on “Add New Component” in Keboola’s component overview, you will enter an environment where you can view connectors and find the one you need.
The selection structure I see in the Free plan is as follows:
A) Filtering by component type
- Data Source (172)
- Data Destination (53)
- Application (36)
B) Filtering by category
- API (11)
- Accounting (5)
- Advertisement (17)
- Analytics (14)
- CRM (17)
- Data Visualization (8)
- Database (34)
- E-commerce (4)
- ERP (10)
- File Storage (23)
- Marketing (23)
- Monitoring (8)
- Project Management (5)
- Social (7)
It’s a bit disappointing that when you add up the number of components by category, it reaches 186 and not 200+ as declared on the pricing page for the free plan (as of February 6, 2024). Additionally, it’s essential to consider that source and destination components for the same system (e.g., SQL Server database) are counted as two. This means that the total number of systems (regardless of whether it’s a source or destination) is even lower than 186.
Still, the list does include connectors for the services most of us use today.
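The totals discussed above can be checked with a few lines of Python. The counts are copied from the two lists in this article (Free plan, as of February 6, 2024); the variable names are only illustrative.

```python
# Counts shown when filtering by category.
by_category = {
    "API": 11, "Accounting": 5, "Advertisement": 17, "Analytics": 14,
    "CRM": 17, "Data Visualization": 8, "Database": 34, "E-commerce": 4,
    "ERP": 10, "File Storage": 23, "Marketing": 23, "Monitoring": 8,
    "Project Management": 5, "Social": 7,
}

# Counts shown when filtering by component type.
by_type = {"Data Source": 172, "Data Destination": 53, "Application": 36}

category_total = sum(by_category.values())
type_total = sum(by_type.values())

print(category_total)  # 186
print(type_total)      # 261
```

The two filters give different totals, so they evidently do not count the same set of items in the same way; the listing itself does not say why.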
Keboola Components – Configuration and Authentication
To make a component work correctly within a Flow, it needs to be configured. This involves defining the identity and account under which the component connects to the underlying system. The configuration varies depending on the authentication method used by the specific system.
1) Standard login credentials – For example, in the case of SQL Server databases, the configuration looks like this:
We configure standard login credentials, and we should also establish a secure connection through an SSH tunnel, which requires additional settings (certificate, SSH keys).
2) Authorization for cloud services – In the case of cloud databases and storage (e.g., Google Drive) or similar services, we have several authorization options:
A) Instant (online authorization) – Authorization via a browser.
B) External authorization (via link) – An authorization link is generated via an authorization application and passed to the user who owns the account, who can then authorize access. The link is valid for a limited time (48 hours).
C) OAuth 2.0 via client ID and secret – In this method, we generate a client ID and client secret (password) pair in the application we want to connect to and enter it in the component configuration.
3) Interaction with Keboola Storage – Another configuration method is required when communicating within a Flow with Keboola storage. In this case, the component requires an access token created at the Keboola application level. Through this token, the component can access Keboola storage and read or write to a bucket (folder/container for files) as needed within the flow.
In the screenshot below, we have created a token in the project settings that has read access to the GA_janzednicekcz bucket containing data downloaded from my Google Analytics account.
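As a rough illustration of how such a token is used, the sketch below builds (but does not send) an HTTP request that would list the buckets the token can see. It rests on assumptions you should verify against the official Storage API documentation: that the token is passed in an `X-StorageApi-Token` header, that buckets are listed at `/v2/storage/buckets`, and that `https://connection.keboola.com` is the right base URL for your project (it differs between Keboola stacks). The function name and the token value are hypothetical.

```python
import urllib.request


def build_list_buckets_request(base_url: str, token: str) -> urllib.request.Request:
    """Build a GET request for the bucket listing without sending it.

    Assumptions (not confirmed by this article): the endpoint path and the
    X-StorageApi-Token header name.
    """
    url = f"{base_url.rstrip('/')}/v2/storage/buckets"
    return urllib.request.Request(
        url,
        headers={"X-StorageApi-Token": token},
        method="GET",
    )


# Placeholder values: never hard-code a real token in source code.
request = build_list_buckets_request("https://connection.keboola.com/", "my-secret-token")
print(request.get_method(), request.full_url)

# To actually call the API you would pass the request to
# urllib.request.urlopen(request) and parse the JSON response.
```

A token limited to read access on a single bucket, as in the example above, keeps the damage small if the token ever leaks.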
There are many components in Keboola, but their authentication methods are generally variations on the three described above.
Always remember that security takes precedence over speed, so try to choose the most secure authentication option, especially when dealing with sensitive data.