
Redirected Output of Data Analysis Processes for Data Workflow

by Interzoid Team


Posted on March 9th, 2023


Redirected Data Output

Generally, when you run a process, job, or executable from the command line, the output goes directly to the screen. However, in many cases it makes sense to save the output to a file. The saved results can then be stored, fed into another process, sent to someone else, or used as part of a scheduled job or business process.

This output redirection capability is what makes batch scripts possible. It also supports scheduled processing, ongoing quality control checks, and data pipelines in ELT/ETL scenarios.

Output is redirected with a greater-than symbol (>) followed by a filename (see below). Many of Interzoid's data analysis capabilities work with entire datasets, including Cloud SQL database tables and CSV files on the Cloud, and can be run with a single HTTP query, so redirecting their output to a file is especially useful.

For example, the following match process can be tested against our demo company name file (CSV source, no credits used). Just put the following URL in your browser address bar and hit enter. You will see a CSV file with a column of company names clustered and sorted by the algorithmically generated similarity keys:


    https://connect.interzoid.com/run?function=match&apikey=use-your-own-api-key-here&source=CSV&connection=https://dl.interzoid.com/csv/companies.csv&table=CSV&column=1&category=company&html=true
                

Running with 'Curl'

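You can also run this command from a Linux, Windows, or Macintosh command line using Curl. Curl (also known as cURL) is a command-line HTTP client tool that is generally available by default on most computers. On Linux and Mac, wrap the URL in single quotes; on Windows, you must use double quotes. The quotes are needed because the URL contains & characters, which the command shell would otherwise treat as special characters rather than as part of the URL: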

Linux & Mac

    $ curl 'https://connect.interzoid.com/run?function=match&apikey=use-your-own-api-key-here&source=CSV&connection=https://dl.interzoid.com/csv/companies.csv&table=CSV&column=1&category=company'
                
Windows

    > curl "https://connect.interzoid.com/run?function=match&apikey=use-your-own-api-key-here&source=CSV&connection=https://dl.interzoid.com/csv/companies.csv&table=CSV&column=1&category=company"
                
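Because the query string is long, it can help to assemble it from its parts inside a script. The following is a minimal sketch for Linux and Mac shells, not an official Interzoid tool: the endpoint and parameter names are copied from the example above, while the function name build_match_url and its argument order are assumptions made for illustration. Values are inserted as-is, so anything containing spaces or & characters would need URL encoding first.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Interzoid tooling).
# build_match_url APIKEY CSV_URL COLUMN CATEGORY
# Prints the match URL, using the parameter names from the example above.
# No URL encoding is applied to the arguments.
build_match_url() {
    printf 'https://connect.interzoid.com/run?function=match&apikey=%s&source=CSV&connection=%s&table=CSV&column=%s&category=%s\n' \
        "$1" "$2" "$3" "$4"
}
```

The result can then be passed to Curl in double quotes, for example:

    $ curl "$(build_match_url use-your-own-api-key-here https://dl.interzoid.com/csv/companies.csv 1 company)"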

Redirecting Output from the HTTP Query String

Output from these Curl commands can be redirected to a file for further processing using the greater-than symbol, which works the same way on Linux, Mac, and Windows. If the output file already exists, its contents are replaced.

Linux & Mac

    $ curl '[HTTP query string]' > output.csv
                
Windows

    > curl "[HTTP query string]" > output.csv
                
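One caveat with plain redirection is that the output file is created or emptied even when the command fails. The sketch below, for Linux and Mac shells, writes to a temporary file and only renames it to the final name if the command succeeds. The function name save_output and the temporary file naming are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: save_output FILE COMMAND [ARGS...]
# Runs COMMAND with its standard output redirected to a temporary file,
# then renames that file to FILE only if COMMAND exits with status 0.
# On failure the temporary file is removed and FILE is left untouched.
save_output() {
    out=$1
    shift
    tmp="$out.tmp.$$"
    if "$@" > "$tmp"; then
        mv "$tmp" "$out"
    else
        status=$?   # exit status of the failed command
        rm -f "$tmp"
        echo "command failed with status $status; $out left unchanged" >&2
        return "$status"
    fi
}
```

It could be used with Curl's --fail option, which makes Curl exit with a non-zero status on HTTP error responses:

    $ save_output output.csv curl --fail --silent --show-error '[HTTP query string]'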

HTTP query strings run by Curl with redirected output provide the foundation for some fairly sophisticated automated workflows and ongoing quality control processes, including those built directly into data pipelines.
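
As a rough illustration of such a workflow, a scheduled script might give each run its own dated output file and confirm that the file actually contains data before handing it to the next step. This is only a sketch for Linux and Mac shells; the function names dated_name and check_output and the "match" file prefix are assumptions, not part of any Interzoid tooling.

```shell
#!/bin/sh
# Hypothetical helpers for a scheduled job.

# dated_name PREFIX
# Prints PREFIX-YYYY-MM-DD.csv so that each day's run keeps its own file.
dated_name() {
    printf '%s-%s.csv\n' "$1" "$(date +%Y-%m-%d)"
}

# check_output FILE
# Fails if FILE is missing or empty; otherwise prints its line count.
check_output() {
    if [ ! -s "$1" ]; then
        echo "$1 is missing or empty" >&2
        return 1
    fi
    # Arithmetic expansion strips the padding some wc versions add.
    echo "$1: $(( $(wc -l < "$1") )) lines"
}
```

A script run by a scheduler such as cron could then combine these with the redirected Curl command:

    out=$(dated_name match)
    curl '[HTTP query string]' > "$out" && check_output "$out"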

Here are some other examples of full dataset data analysis and data quality capabilities that can be used in this way:

Email address validation of dataset columns from the command line.

Range checking of dataset columns from the command line.

Basic statistical analysis of dataset columns from the command line.

Identifying non-unique values in dataset columns from the command line.

Elementizing/tokenizing dataset columns from the command line.

Several more are available within our Interzoid Cloud Data Connect application.
