data engineering apis

ELT Versus ETL in the Era of Cloud Databases

by Interzoid Team


Posted on August 30th, 2022


Normalize your data

ELT is emerging as a preferred approach in the era of Cloud Databases versus traditional ETL. Here's why.

Over the decades, the term "ETL" had become the standard method to move data from operational systems to a data warehouse or equivalent for analysis. First, an IT organization-led team would design a process to "extract" data from its sources, then apply various rules to "transform" the data into the shape it needed to be per the specifications of a Data Warehouse. This usually also involved a cleansing step of some kind. Then, the transformed data would be "loaded" into a warehouse. This of course usually had an IT bottleneck, with various backlogs and other efforts, making it often a difficult, time-consuming, but necessary process for an organization to be able to drive any kind of business intelligence from their data assets. Hence, the acronym "ETL" for Extract, Transform, and Load - in that order - became an industry standard approach for data-driven organizations.

Now in the Cloud Database era, 75% of databases live on the Cloud. With much less friction in data availability and far fewer costs in data storage, the trend has started to shift to an "ELT" process. In this methodology, the order of operations has changed. After data is "Extracted", it is immediately "Loaded" and available for analysis in a Cloud Data Warehouse. This gets accessibility to raw data available to users more quickly.

Once loaded, "Transformation" can begin within the Cloud Data Warehouse (or all-encompassing Data Lake, also popular). So transformed data will be readily available, albeit a little later, and also augmented with the raw data from which the transformed data originally evolved from. In the era of Data Science and Advanced Analytics, access to source data and data lineage trails can be very useful, as it can be difficult to predict prior to transformation which data will be the most applicable to a given use case.

In the above diagram, you can see the various data sources from which data might be extracted from, moving as a function of time through both an ETL and ELT process. ETL does provide more control over data from the IT organization, which is useful in areas where compliance and control factors are important. However, it is still a slower process that ultimately results in more concise and less detailed data in a Data Warehouse or Data Lake.

In the bottom "ELT" scenario of the diagram, the data is more quickly available for potential analysis. There is also much more data available that could be of use in Analytics scenarios, has data lineage characteristics built-in, and can still undergo transformation processes along the way, including cleansing, summarization, and other types of data management and organization. It's like having both the book summary that students often use in educational scenarios, along with the entire book itself for reading or reference. Using both together will likely provide the richer experience and have the most impact.

These are not mutually exclusive approaches within an organization, and there is a place for both and tradeoffs to explore, but the trend in the era of the Cloud database is clearly more data faster, including raw source data and its corresponding lineage. We will continue to see ELT increase in usage versus the traditional ETL approach.

Getting Started with Interzoid
Three ways to achieve better, more usable, and higher value data with Interzoid
More...
Connect Directly to Cloud SQL Databases and Perform Data Quality Analysis
Achieve better, more consistent, more usable data
More...
Free Trial Credits
Register for an Interzoid API account and receive free trial credits. See how your strategic data assets can be improved.
Automate API Integration into Cloud Databases
Run live data quality exception and enhancement reports on major Cloud Data Platforms direct from your browser.
More...
Step-by-Step Tutorial for Data Matching
See quickly one example of how inconsistent data can be identified within databases and datasets with ease.
More...
Example API Usage Code on Github
Sample Code for invoking APIs on Interzoid in multiple programming languages
Business Case: Cloud APIs and Cloud Databases
See the business case for API-driven data enhancement - directly within your important datasets
More...
Documentation and Overview
See our documentation site.
More...
Product Newsletter
Receive Interzoid product and technology updates.
More...

All content (c) 2018-2023 Interzoid Incorporated. Questions? Contact support@interzoid.com

201 Spear Street, Suite 1100, San Francisco, CA 94105-6164

Interested in Data Cleansing Services?
Let us put our Machine Learning-based processes and data tools to work for you.

Start Here
Terms of Service
Privacy Policy

Use the Interzoid Cloud Connect Data Platform and Start to Supercharge your Cloud Data now.
Connect to your data and start running data analysis reports in minutes: connect.interzoid.com
API Integration Examples and SDKs: github.com/interzoid
Documentation and Overview: Docs site
Interzoid Product and Technology Newsletter: Subscribe