Category Archives: Keboola Tutorials and information

Keboola is a cloud service provided as SaaS – specifically, a data platform for data pipelines and storage. In the past, data extractions, transformations, and loads – abbreviated as ETL processes – were the domain of IT or BI specialists because they required knowledge of software architecture, programming, and the right tools. In Keboola, even a person without technical knowledge can create data flows.

Introduction to Keboola

Keboola is a cloud service, so you don’t need to install anything. The service operates on a Freemium model: you can use Keboola for free with certain limitations that won’t significantly restrict you for testing or smaller projects. The most important limits are (1) 120 minutes of computational time and (2) 250 GB of storage. Details can be found on the Keboola pricing page.

I recommend these resources as a starting point

Keboola Guides and Information on This Website

On this website, you will find several articles and guides about Keboola that dive into selected topics in detail. To try the tutorials hands-on, you only need a Keboola account. The AdventureWorks database (sample data) is available online – big thanks to sqlservercentral.com.

Comparison of Keboola Freemium with Fivetran Free

Among the better-known tools, Keboola is most comparable to Fivetran, a Challenger in the 2023 Gartner report. Both tools are good, and I would say that neither is significantly better than the other.

Overall Evaluation of Keboola vs. Fivetran

Below are the comparison criteria I considered. Overall it is roughly a tie, and the choice between the two tools depends on the current and future conditions in your data environment. Depending on what I expect from the tool, I would choose as follows (the detailed evaluation below elaborates on this):

Fivetran: If you have a data warehouse or data lake, on-premises or in the cloud, and you only need to get data from the primary systems into it, while you handle transformations and orchestration yourself, perhaps via dbt, Airflow, or similar.

Keboola: If you are looking for a full ETL/ELT tool and the Snowflake ecosystem rather than just a data pipeline. I would also choose Keboola if I expected better orchestration capabilities from the service, the ability to perform transformations directly in Keboola, the ability to code these transformations in an SQL/Python workspace, the ability to write custom components, and much more.

Detailed Evaluation of Keboola vs. Fivetran

There are many differences between Fivetran and Keboola; I will try to rate the most significant ones. This is my subjective opinion.

  • (Draw) Ease of use – Both platforms are user-friendly, and setting up data components is straightforward.
  • (Draw) Support – User support is, I would say, at a high level for both.
  • (Draw) Security – Both platforms offer secure connections and various authorization options for cloud services through built-in connectors (OAuth, SSH, certificates, tokens).
  • (Draw) Load options – Both platforms offer a wide range of options for identifying increments, choosing the storage mode (replace, increment, etc.), and setting table metadata.
  • (Loss) Pricing – Keboola charges by usage time, whereas Fivetran charges by MAR (monthly active rows).
    • In the Keboola Freemium model, you have 60 minutes available, which you use up relatively quickly because Keboola has quite significant processing overhead. The model is logical, though, because Keboola extensively uses Snowflake as a backend for each component (computing costs).
    • In contrast, Fivetran does not care how long a sync runs and charges you by rows – you get 500,000 rows/month for free. So for smaller projects you often get the ELT tool for free.
  • (Loss) Speed on small tables – In my testing, Keboola is slower at processing small tables, which, combined with time-based pricing and a larger number of small tables, may result in increased costs. I tested about 8 tables with a total of about 20,000 rows, and the flow from a SQL Server database to Keboola Storage took about 4-5 minutes. That seems like a lot to me. On the other hand, the relationship between the number of records and the runtime is certainly not linear, so I recommend testing it on your own scenario.
  • (Win) Keboola is an ETL/Data platform ecosystem coexisting with Snowflake and many other platforms.
    • Keboola, therefore, offers versatility for a broader range of users – tools for both laymen (non-technical users) and professionals (Python, API, R, etc.).
    • Fivetran works more like a pass-through pipe – it takes a data source and a destination, delivers the data, and that’s it. So it is somewhat limited in versatility, but on the other hand, it has mastered the ELT part, and everything is lightning-fast.
  • (Win) Orchestration features – Keboola has certain orchestration capabilities. I certainly don’t want to compare Keboola to tools like Airflow, Dagster, Mage.ai, and the like, but Keboola can handle most scenarios requiring flow orchestration at the application level. Moreover, within Keboola flows, we are not limited to components in the application; we can also do things like:
    • Call SQL Server procedures (typically we want to calculate the semantic layer after downloading raw data)
    • Refresh Power BI reports (we want to refresh reports after calculating the semantic layer and datasets)
    • Trigger something via API
    • and more
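
To make the pricing point from the comparison above more concrete, here is a minimal Python sketch of the free-tier arithmetic. It uses only the figures quoted in this article (free minutes, 500,000 free rows, 4-5 minutes per run, about 20,000 rows), not current price lists. The function names are hypothetical. The MAR check assumes that each distinct row counts once per month no matter how often it is synced; verify that against Fivetran's own documentation before relying on it.

```python
import math

# Figures quoted in the article; illustrative only, not current pricing.
KEBOOLA_FREE_MINUTES = 120   # the article also mentions 60; pass whichever applies
FIVETRAN_FREE_MAR = 500_000  # free monthly active rows
MINUTES_PER_RUN = 4.5        # midpoint of the measured 4-5 minutes per flow run
ROWS_IN_SOURCE = 20_000      # total rows across the ~8 tested tables


def keboola_free_runs(free_minutes: float, minutes_per_run: float) -> int:
    """Number of complete flow runs that fit into the free time budget."""
    return math.floor(free_minutes / minutes_per_run)


def fivetran_within_free_tier(distinct_rows_changed: int, free_mar: int) -> bool:
    """Assumption: a distinct row counts once per month, however often it syncs."""
    return distinct_rows_changed <= free_mar


runs = keboola_free_runs(KEBOOLA_FREE_MINUTES, MINUTES_PER_RUN)
print(f"Keboola: about {runs} runs fit into the free minutes")
print("Fivetran free tier sufficient:",
      fivetran_within_free_tier(ROWS_IN_SOURCE, FIVETRAN_FREE_MAR))
```

Under these assumptions, the time budget allows less than one run per day of the tested flow, while the same data volume stays well below the free row limit. Since the runtime does not scale linearly with row count, plug in the runtime measured on your own scenario.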

ETL | Keboola – Flow Transformation (Snowflake), Power BI Report Refresh

Last time, we did a deeper dive into how Keboola Storage works, how applications process it, and how it stores data during a flow. Today, I’ll show you how transformations work in Keboola. We’ll create a much more complex Flow that calculates data from a source, utilizes Keboola transformation and then updates a Power BI…

ETL | Keboola Free – Components, Types, Security

In Keboola, components are blocks that make up the Data Flow (pipeline). They are elements in the ETL/ELT process used to connect to data sources (source) or data destinations (destination). Components enable us to transfer data. Thanks to democratization tools like Keboola, there is no need to know how to program because Keboola components are…

ETL | Keboola – Data Flow Tutorial – from SQL Server to Google Drive

In the previous part titled ETL | Keboola Free – Creating a Project, Basics, we did a basic introduction to the application. We went through the process of creating a project and also briefly explored the Keboola structure and menu. Today, I’d like to show you how easy it is to create a Keboola flow.…

ETL | Keboola Free – Project Creation, Basics, Orientation

In the previous article, ETL | Keboola – Introduction, Pricing, Products – An Alternative to Fivetran, we briefly introduced Keboola. We know that Keboola offers a Freemium model, so we can try this tool for free. We just have to tolerate some limitations when it comes to usage (the limit of minutes is 120).…

ETL | Keboola – Intro, Pricing, Products – Alternative to Fivetran

As part of data democratization and the transformation towards data-driven management, more and more companies are adopting solutions that support and strengthen this culture. Data is made accessible to a wide range of users for analysis and reporting in tools like Power BI. At the ETL level, through which we perform data integrations into a…