data engineering apis

ELT Versus ETL in the Era of Cloud Databases

by Interzoid Team


Posted on August 30th, 2022


Normalize your data

ELT is emerging as a preferred approach in the era of Cloud Databases versus traditional ETL. Here's why.

Over the decades, the term "ETL" had become the standard method to move data from operational systems to a data warehouse or equivalent for analysis. First, an IT organization-led team would design a process to "extract" data from its sources, then apply various rules to "transform" the data into the shape it needed to be per the specifications of a Data Warehouse. This usually also involved a cleansing step of some kind. Then, the transformed data would be "loaded" into a warehouse. This of course usually had an IT bottleneck, with various backlogs and other efforts, making it often a difficult, time-consuming, but necessary process for an organization to be able to drive any kind of business intelligence from their data assets. Hence, the acronym "ETL" for Extract, Transform, and Load - in that order - became an industry standard approach for data-driven organizations.

Now in the Cloud Database era, 75% of databases live on the Cloud. With much less friction in data availability and far fewer costs in data storage, the trend has started to shift to an "ELT" process. In this methodology, the order of operations has changed. After data is "Extracted", it is immediately "Loaded" and available for analysis in a Cloud Data Warehouse. This gets accessibility to raw data available to users more quickly.

Once loaded, "Transformation" can begin within the Cloud Data Warehouse (or all-encompassing Data Lake, also popular). So transformed data will be readily available, albeit a little later, and also augmented with the raw data from which the transformed data originally evolved from. In the era of Data Science and Advanced Analytics, access to source data and data lineage trails can be very useful, as it can be difficult to predict prior to transformation which data will be the most applicable to a given use case.

In the above diagram, you can see the various data sources from which data might be extracted from, moving as a function of time through both an ETL and ELT process. ETL does provide more control over data from the IT organization, which is useful in areas where compliance and control factors are important. However, it is still a slower process that ultimately results in more concise and less detailed data in a Data Warehouse or Data Lake.

In the bottom "ELT" scenario of the diagram, the data is more quickly available for potential analysis. There is also much more data available that could be of use in Analytics scenarios, has data lineage characteristics built-in, and can still undergo transformation processes along the way, including cleansing, summarization, and other types of data management and organization. It's like having both the book summary that students often use in educational scenarios, along with the entire book itself for reading or reference. Using both together will likely provide the richer experience and have the most impact.

These are not mutually exclusive approaches within an organization, and there is a place for both and tradeoffs to explore, but the trend in the era of the Cloud database is clearly more data faster, including raw source data and its corresponding lineage. We will continue to see ELT increase in usage versus the traditional ETL approach.

Cloud Native Data Engineering: Solutions for Databases
Create new, better, higher-value data based on your existing data
More Info...
Free Data Engineering Trial Credits
Register for an Interzoid API account and receive free trial credits. See how your strategic data assets can be improved.
Automate API Integration into Cloud Databases
Run live data quality exception and enhancement reports on major Cloud Data Platforms direct from your browser.
More Info...
Example API Usage Code on Github
Sample Code for invoking APIs on Interzoid in multiple programming languages
Business Case: Cloud APIs and Cloud Databases
See the business case for API-driven data enhancement - directly within your important datasets
More Info...

All content (c) 2018-2022 Interzoid Incorporated. Questions? Contact support@interzoid.com

201 Spear Street, Suite 1100, San Francisco, CA 94105-6164

Interested in data cleansing services?

Terms of Service
Privacy Policy
Use the Interzoid Cloud Connect Data Platform and Start to Supercharge your Cloud Data now (Free Trials): connect.interzoid.com
API Integration Option Code Examples: www.github.com/interzoid