data engineering apis

Generating Company Match Reports in the Snowflake Data Cloud

by Interzoid Team


Posted on January 26th, 2021


Generating Company Match Reports in the Snowflake Data Cloud

In this walkthrough example, Interzoid’s Cloud Data Connect Product is used to generate a “match report” using Interzoid’s Cloud-based Company Matching API, which in turn uses algorithmically generated “similarity keys.” These keys will serve as the basis of a match to identify similar company names within a database. This is a significant capability in improving the usability and accuracy of data that exists within Snowflake.

If you do not have access to Snowflake, a free trial is available at: https://signup.snowflake.com, which will work fine for this walkthrough.

You can access Interzoid's Cloud Data Connect here.

If you need more usage credits for your API license key for more testing, contact support@interzoid.com and mention this blog.

The premise of matching data is that Interzoid’s Company Matching API generates “similarity keys” based on matching algorithms, machine learning, and data-specific knowledge bases. These keys help identify similar company or organization names within a single database or across multiple data sources.

The use of these similarity keys helps to overcome the issues and challenges caused by inconsistent alphanumeric data for representing company names within a dataset. The technique can provide significantly greater impact and value for Cloud data warehousing efforts in making the resident data cleaner, more consistent, more usable, and therefore more valuable. It can also help identify data quality challenges in underlying source data that feed a data warehouse that need to be addressed.

It is important and useful not only to identify cases of redundant data records in a single dataset, but data matching via similarity keys can also significantly enhance the ability to match across multiple datasets where data can be inconsistently represented. The ability to get higher match rates when combining data from multiple sources to create new datasets or enhance existing ones can also add significant value.

For example, the following similarity keys were generated for the company names below.

Similarity key generation example for a company name matching API

Using the Interzoid Company Matching API, likely matches will generate the same similarity key. You can see in the above example that similar company names generate the same similarity key. This makes them easy to identify as part of a data query.


How to connect to and create match reports within Snowflake:

There are four steps in this walkthrough:

Step #1: Load the sample data into Snowflake for this tutorial.

Step #2: Configure the data connection to Snowflake.

Step #3: Execute the match report.

Step #4: View the results.


Step #1 – Load the sample data into Snowflake

In this example, we will create a database within Snowflake and load a table from a simple CSV file. If you have not yet used Snowflake or do not have access to the Snowflake Cloud Data Warehouse, you can sign up for a free trial at: https://signup.snowflake.com

The sample CSV file we will use only has two columns, a company name and a description:

Download the example CSV file

Snowflake Company Match Report Example


Log in to your Snowflake account. First, create a new database (click “Create…”):

Snowflake Company Match Report Example


You can name the database whatever you want. However, for this tutorial, we will call it “INTERZOID”:

Snowflake Company Match Report Example


After you click “Finish”, then click on the newly created database from the list:

Snowflake Company Match Report Example


Now we will create a table for our data. From the Tables tab, click “Create” to create a new table:

Snowflake Company Match Report Example


We will create a new table called “COMPANIES”, and create two columns within that table, “COMPANY” and “DESCRIPTION” of type “string.” This matches the CSV file we are going to load into the table.

Snowflake Company Match Report Example


After the table is created, we will load the CSV file into the table. The default “COMPUTE_WH” warehouse is fine.

Snowflake Company Match Report Example


Next, browse to wherever you downloaded the CSV file via “Select Files…”

Snowflake Company Match Report Example


After you have selected your CSV file path, simply click the “Load” button.

Snowflake Company Match Report Example


Once completed, your table loading results will be shown. Click “OK”.

Snowflake Company Match Report Example


Step #2 - Configure the data connection to Snowflake

We are now ready to create the data connection so Interzoid Data Connect can access the data within our Snowflake table via the Cloud. We will use an ODBC (Open DataBase Connectivity) connection to access the Snowflake data from outside the Snowflake Web application. To access the data, we will need a driver.

CData provides an ODBC driver for Snowflake. There is a free trial available you can download for this tutorial. This ODBC connector will enable us to read the Snowflake data via the Cloud: https://www.cdata.com/drivers/snowflake/odbc/.

Download and install this connector. There is a trial version available which will work fine for this tutorial.

One complete, open the Windows ODBC Administrator tool. The 64-bit version of the Odbcad32.exe file is located in the %systemdrive%\Windows\System32 folder. You can also search for it by typing “ODBC” in the Windows Start Menu Search box.

Snowflake Company Match Report Example


Click on the “User DSN” tab. Click “Add” to add a new data source.

Snowflake Company Match Report Example


Highlight the CData ODBC driver for Snowflake and then click “Finish.”

Snowflake Company Match Report Example


You will then see the CData ODBC Driver for Snowflake configuration panel. We will set this up for our Snowflake table that we will generate our match report against.

Snowflake Company Match Report Example


To use the CData driver, we need to configure it. We first need to create a data source name. Call it “mysnowflakeodbc”. Next, we will use the default Auth Scheme. Under “Authentication” enter your UserID and Password for Snowflake. This will allow the connection to the Snowflake Data Cloud at runtime.

We will also need to set the Warehouse name, which will likely be the default “COMPUTE_WH”. If not, use whatever name you used for the warehouse within Snowflake.

Next, set your unique Snowflake instance URL that you were provided with when you set up your Snowflake environment. You get can the URL from your Snowflake account. We also need to set the name of the database. We used “INTERZOID” when we set up the database within Snowflake, so use that for this parameter.

Here is the configuration panel when complete:

Snowflake Company Match Report Example


When ready, click the “Test Connection” button. You should see the following:

Snowflake Company Match Report Example


We are now ready to setup Interzoid Data Connect to generate the match report using the Interzoid Company Match API and the data within the selected Snowflake table.


Step #3 - Execute the match report

To support Windows, Linux, and Darwin, as well as multi-step data pipelines, Interzoid Data Connect runs from the command line. Contact Interzoid at support@interzoid.com for access to the product, and we will provide you with the necessary API usage credits to run the walkthrough. Once installed, you simply run it from a Windows, Linux, MacOS, or Solaris command prompt with the following parameters (see documentation for more details):


function – name of Interzoid API function to call

target – native, odbc, or textfile

dsn – ODBC connection name if using ODBC

license – your Interzoid API license key

algo – matching algorithm used with the Company Match API

table – database table name within Snowflake

column – name of table column in database to use as basis for match algorithms



To execute, from the command line, enter the following:

$ [C:/installdir] izdc -function=companymatch -target=odbc -dsn="DSN=cdatasnowflake" -license=[YOUR INTERZOID API KEY] -algo=wide -table=COMPANIES -column=COMPANY

…and click [Enter]


Step #4: View the results

You can load the resultant output file into any viewing tool, such as Microsoft Excel or Notepad. You will see results like this, where sorted records that share the same Similarity Key are grouped together:

Snowflake Company Match Report Example


Where from here?

Now that matches have been identified, there are several potential courses of action to take, depending on business requirements.

For example, match report data can be imported into Snowflake for further analysis. Results can be saved/exported to a delimited file for review, importing to Snowflake, or for additional processing.

If this is a raw marketing list, redundant data could simply be eliminated to save outreach costs.

If this is a customer or prospect list, logic can be created to collapse redundant records into a single instance while maintaining important data from each of the records to maintain in the surviving record.

This walkthrough is only one example of integrating Interzoid APIs within Snowflake. There are also matching APIs that include specific matching algorithms for other types of data such as individual names (Bob = Robert, Johnson = Jonsen). There are data enrichment APIs available to add additional data elements via the Cloud to existing data that can enhance various data initiatives using Snowflake.

The possibilities are limitless.

For more information, contact support@interzoid.com or visit https://www.interzoid.com


Check out our New Cloud Data Connect Data Matching Wizard!
Identify inconsistent and duplicate data quickly and easily in data tables and files.
More...
Connect Directly to Cloud SQL Databases and Perform Data Quality Analysis
Achieve better, more consistent, more usable data.
More...
Launch Our Entire Data Quality Matching System on an AWS EC2 Instance
Deploy to the instance type of your choice in any AWS data center globally. Start analyzing data and identifying matches across many databases and file types in minutes.
More...
Free Usage Credits
Register for an Interzoid API account and receive free usage credits. Improve the value and usability of your strategic data assets now.
Automate API Integration into Cloud Databases
Run live data quality exception and enhancement reports on major Cloud Data Platforms direct from your browser.
More...
Check out our APIs and SDKs
Easily integrate better data everywhere.
More...
Example API Usage Code on Github
Sample Code for invoking APIs on Interzoid in multiple programming languages
Business Case: Cloud APIs and Cloud Databases
See the business case for API-driven data enhancement - directly within your important datasets
More...
Documentation and Overview
See our documentation site.
More...
Product Newsletter
Receive Interzoid product and technology updates.
More...

All content (c) 2018-2023 Interzoid Incorporated. Questions? Contact support@interzoid.com

201 Spear Street, Suite 1100, San Francisco, CA 94105-6164

Interested in Data Cleansing Services?
Let us put our Generative AI-enhanced data tools and processes to work for you.

Start Here
Terms of Service
Privacy Policy

Use the Interzoid Cloud Connect Data Platform and Start to Supercharge your Cloud Data now.
Connect to your data and start running data analysis reports in minutes: connect.interzoid.com
API Integration Examples and SDKs: github.com/interzoid
Documentation and Overview: Docs site
Interzoid Product and Technology Newsletter: Subscribe
Partnership Interest? Inquire