Generating a Company Name Match Report in Node.js

Duplicate and inconsistent company names cause significant downstream issues in CRM systems, sales operations, analytics pipelines, account hierarchies, identity graphs, and more. Slight variations like "IBM", "I.B.M.", "International Bus Machines", and "Intl. Business Machine Co" all refer to the same organization, yet traditional string comparison or fuzzy similarity algorithms often fail to recognize them as equivalent.

This blog entry demonstrates how a simple Node.js script can generate a high-quality match report by assigning a similarity key to each company name using:

  • Interzoid’s AI- and ML-powered company matching algorithms
  • Normalization and domain-specific knowledge bases
  • Advanced semantic processing far beyond Levenshtein or fuzzy matching libraries

The result is a fast, accurate clustering of duplicate or equivalent company names, which is the foundation for deduplication, account unification, and data cleansing across operational and analytical workflows.

Relevant GitHub example:
/company-name-matching/node-examples/generate-match-report.js

Requirements:
  • Node.js 14+
  • An Interzoid API key (register for an Interzoid API account to obtain one)
  • A text file of company names (one per line)

How This Works: AI-Powered Similarity Keys

Each company name is sent to the API: getcompanymatchadvanced.

Instead of returning a fuzzy score or an edit-distance calculation, the API returns a Similarity Key (SimKey)—a canonical representation of the normalized company name, generated using:

  • AI/ML linguistic models
  • Domain-trained normalization logic
  • Extensive knowledge bases of company naming variations

Text variations that the service recognizes as the same company share the same SimKey, so finding matches reduces to an exact comparison of keys. This is more reliable than fuzzy matching or Levenshtein distance, which operate on raw character sequences without any understanding of semantics or organization naming conventions.
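The request and response handling described above can be isolated into two small functions that are easy to test without a network connection. This is a minimal sketch: the endpoint, the license, company, and algorithm parameters, and the SimKey response field are taken from the script later in this post, while the function names buildMatchUrl and extractSimKey and the sample response bodies are hypothetical.

```javascript
// Sketch only: helper names are illustrative, not part of any Interzoid SDK.

// Build the request URL for one company name.
function buildMatchUrl(apiKey, companyName, algorithm = "model-v4-wide") {
  return (
    "https://api.interzoid.com/getcompanymatchadvanced" +
    `?license=${encodeURIComponent(apiKey)}` +
    // Encoding matters for names such as "AT&T", where a raw "&" would
    // otherwise be read as a parameter separator.
    `&company=${encodeURIComponent(companyName)}` +
    `&algorithm=${encodeURIComponent(algorithm)}`
  );
}

// Pull the SimKey out of a raw response body.
// Returns "" when the body is not JSON or has no string SimKey field.
function extractSimKey(responseBody) {
  try {
    const json = JSON.parse(responseBody);
    return typeof json.SimKey === "string" ? json.SimKey : "";
  } catch {
    return "";
  }
}

// Made-up response bodies, for illustration only.
console.log(buildMatchUrl("YOUR_API_KEY_HERE", "AT&T Inc"));
console.log(extractSimKey('{"SimKey":"abc123"}'));
console.log(extractSimKey("not json") === "");
```

Keeping these steps as pure functions means the HTTP call itself stays a thin wrapper around them.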

Variations correctly clustered include:

  • GE, Gen Electric, General Electric Co.
  • Bank of America, BOA, Bnk of America Corp.
  • Google, Google LLC, Google Incorporated

The logic is identical to Interzoid’s other match-report examples for:

  • Individual names (full-name matching)
  • Street addresses (address normalization + matching)

All follow the same match-report pattern: generate similarity keys → sort → cluster → output.
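Stripped of the API call, the sort and cluster steps of that pattern fit in a few lines. The sketch below assumes the similarity keys have already been obtained; the function name clusterBySimKey and the keys KEY_A, KEY_B, and KEY_C are placeholders for illustration, not real API output.

```javascript
// Sketch of the shared match-report pattern: sort by key, then group
// adjacent records that carry the same key.
function clusterBySimKey(records) {
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...records].sort((a, b) =>
    a.simKey < b.simKey ? -1 : a.simKey > b.simKey ? 1 : 0
  );

  const clusters = [];
  for (const rec of sorted) {
    const last = clusters[clusters.length - 1];
    if (last && last[0].simKey === rec.simKey) {
      last.push(rec); // same key as the previous record: extend the cluster
    } else {
      clusters.push([rec]); // new key: start a new cluster
    }
  }

  // A match report only lists groups with at least two members.
  return clusters.filter((c) => c.length >= 2);
}

// Placeholder keys, for illustration only.
const demo = [
  { input: "Google LLC", simKey: "KEY_B" },
  { input: "GE", simKey: "KEY_A" },
  { input: "Google", simKey: "KEY_B" },
  { input: "Acme Widgets", simKey: "KEY_C" },
  { input: "General Electric Co.", simKey: "KEY_A" },
];

for (const cluster of clusterBySimKey(demo)) {
  for (const r of cluster) console.log(`${r.input},${r.simKey}`);
  console.log(); // blank line between clusters
}
```

Because only the key function differs between entity types, the same grouping code would serve company names, individual names, and addresses alike.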

The Node.js Script

Below is the full Node.js example script from the Interzoid Platform repository. It reads a file of company names, computes similarity keys, sorts them, and prints clusters of duplicates.

// generate-match-report.js

const fs = require("fs");
const https = require("https");

// Replace this with your API key from https://www.interzoid.com/manage-api-account
const API_KEY = "YOUR_API_KEY_HERE";

// Input file containing one company name per line
const INPUT_FILE_NAME = "sample-input-file.txt";

/**
 * Calls Interzoid's getcompanymatchadvanced API for a single company name
 * and returns a Promise resolving to the similarity key (SimKey) as a string.
 * Returns an empty string on error or if SimKey is missing.
 */
function callCompanyMatchAPI(companyName) {
  return new Promise((resolve) => {
    // URL-encode the company name to safely embed it in the query string
    const companyParam = encodeURIComponent(companyName);

    const apiURL =
      "https://api.interzoid.com/getcompanymatchadvanced" +
      `?license=${API_KEY}` +
      `&company=${companyParam}` +
      "&algorithm=model-v4-wide";

    https
      .get(apiURL, (res) => {
        let data = "";

        res.on("data", (chunk) => {
          data += chunk;
        });

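        res.on("end", () => {
          // Treat any non-2xx HTTP status as a failure instead of parsing it
          if (res.statusCode < 200 || res.statusCode >= 300) {
            console.error(
              `HTTP ${res.statusCode} for "${companyName}": ${data}`
            );
            resolve("");
            return;
          }
          try {
            const json = JSON.parse(data);
            const simKey = json.SimKey || "";
            resolve(simKey);
          } catch (err) {
            console.error(
              `Error parsing JSON for "${companyName}": ${err.message}`
            );
            console.error("Raw response:", data);
            resolve("");
          }
        });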
      })
      .on("error", (err) => {
        console.error(`Error calling API for "${companyName}": ${err.message}`);
        resolve("");
      });
  });
}

async function main() {
  // Each record will hold the original input and its similarity key
  const records = [];

  // Read the input file contents
  let fileContents;
  try {
    fileContents = fs.readFileSync(INPUT_FILE_NAME, "utf8");
  } catch (err) {
    console.error("Error reading input file:", err.message);
    return;
  }

  // Split into lines and process each non-empty line
  const lines = fileContents.split(/\r?\n/);

  for (const line of lines) {
    const company = line.trim();

    // Skip blank lines
    if (!company) continue;

    const simKey = await callCompanyMatchAPI(company);

    // Skip if no SimKey returned
    if (!simKey) continue;

    records.push({ input: company, simKey });
  }

  if (records.length === 0) {
    console.log("No records with similarity keys found.");
    return;
  }

  //--------------------------------------------------------------------
  // Sort records strictly by simKey only so that all matching keys
  // are adjacent in the array. This makes it easy to find clusters.
  //--------------------------------------------------------------------
  records.sort((a, b) => a.simKey.localeCompare(b.simKey));

  //--------------------------------------------------------------------
  // Walk the sorted list and build clusters of records that share
  // the same simKey. Only print clusters of size >= 2.
  //--------------------------------------------------------------------
  let currentKey = null;
  let cluster = [];

  function printCluster(c) {
    if (c.length < 2) return; // Only print clusters with 2 or more
    for (const r of c) {
      console.log(`${r.input},${r.simKey}`);
    }
    console.log(); // blank line between clusters
  }

  for (const rec of records) {
    if (rec.simKey !== currentKey) {
      if (cluster.length > 0) printCluster(cluster);
      currentKey = rec.simKey;
      cluster = [rec];
    } else {
      cluster.push(rec);
    }
  }

  // Final cluster
  if (cluster.length > 0) printCluster(cluster);
}

main().catch((err) => {
  console.error("Unexpected error:", err);
});

Interpreting the Output

The output is a set of clusters:

GE,7p3fj92x8as2
Gen Electric,7p3fj92x8as2
General Electric Co,7p3fj92x8as2

IBM,08xs81snnq2l
International Bus Machines,08xs81snnq2l

Each group represents names that refer to the same organization. The script only outputs clusters with at least two entries, making it ideal for deduplication and review workflows.
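For review workflows it can be useful to read a saved report back into a data structure. The following is a rough sketch under two assumptions that go beyond the original script: similarity keys contain no commas (true of the sample keys shown above), and company names may contain commas, so each line is split on its last comma. The function name parseMatchReport is hypothetical.

```javascript
// Sketch: turn the script's text output back into an array of clusters.
// Each cluster is an array of { input, simKey } objects.
function parseMatchReport(reportText) {
  const clusters = [];
  let current = [];

  for (const rawLine of reportText.split(/\r?\n/)) {
    const line = rawLine.trim();

    // A blank line closes the cluster that is currently being built.
    if (!line) {
      if (current.length > 0) {
        clusters.push(current);
        current = [];
      }
      continue;
    }

    // Split on the LAST comma so names like "Acme, Inc." stay intact.
    const idx = line.lastIndexOf(",");
    if (idx === -1) continue; // not a "name,key" line; skip it
    current.push({ input: line.slice(0, idx), simKey: line.slice(idx + 1) });
  }

  // Handle a final cluster that is not followed by a blank line.
  if (current.length > 0) clusters.push(current);
  return clusters;
}

const sampleReport = [
  "GE,7p3fj92x8as2",
  "Gen Electric,7p3fj92x8as2",
  "General Electric Co,7p3fj92x8as2",
  "",
  "IBM,08xs81snnq2l",
  "International Bus Machines,08xs81snnq2l",
  "",
].join("\n");

for (const cluster of parseMatchReport(sampleReport)) {
  console.log(`${cluster[0].simKey}: ${cluster.map((r) => r.input).join(" | ")}`);
}
```

From here, each cluster can be handed to whatever merge or review step the deduplication workflow uses.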

Why This Beats Fuzzy Matching / Levenshtein Distance

Classic fuzzy matching approaches struggle with:

  • Acronyms vs full names (GE vs General Electric)
  • Semantic equivalence (“IBM” vs “International Business Machines”)
  • Corporate suffix noise (Inc, LLC, Ltd., Corp.)
  • Missing or extra tokens
  • Cross-language variations
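The acronym case is easy to reproduce. The sketch below is a textbook dynamic-programming Levenshtein implementation written for this post (not code taken from any particular fuzzy matching library); it shows that edit distance rates "GE" as far closer to the unrelated "GM" than to "General Electric".

```javascript
// Classic Levenshtein edit distance (insert, delete, substitute each cost 1),
// computed row by row to keep memory proportional to the shorter loop.
function levenshtein(a, b) {
  // prev[j] = distance between the first i-1 chars of a and the first j of b
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);

  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // turning i chars of a into "" takes i deletions
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // delete from a
        curr[j - 1] + 1, // insert into a
        prev[j - 1] + cost // substitute, or keep a matching char
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

console.log(levenshtein("GE", "General Electric")); // 14
console.log(levenshtein("GE", "GM")); // 1
console.log(levenshtein("IBM", "I.B.M.")); // 3
```

Any threshold loose enough to accept a distance of 14 on a 16-character name would also accept a large number of unrelated pairs, which is why a key-based approach handles these cases differently.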

Interzoid’s matching engine is AI- and ML-powered, built on:

  • Normalization corpora
  • Specialized lexical knowledge bases
  • Semantic models trained on organizational naming behavior
  • Advanced string feature extraction

It models the intent behind organization names, not just their surface-level characters. As a result, it can match names that raw string distance treats as unrelated, such as an acronym and its full form.

With only a few lines of Node.js, you can generate match reports that identify duplicate or variant company names, including the acronym and abbreviation cases that edit-distance-based fuzzy matching libraries tend to miss. The same Interzoid-powered workflow is available for individual name matching and street address matching, giving you a unified way to clean and deduplicate all major entity types.

Try the example script, use the sample file in the GitHub repo, and integrate similarity-key-based clustering into your operational and analytics pipelines to dramatically improve data quality and ROI.
