Crossbeam Matching Engine

An explanation of how Crossbeam's matching engine determines matches at the person and company levels.

Written by Bob Moore
Updated over a week ago

Crossbeam finds overlaps in the data sets of disparate companies, which leads to an obvious problem: How do we decide when two records are a match? That's where our matching algorithm comes in. 

The guiding force behind Crossbeam's matching algorithm is the concept of confidence. Because a match often results in data being shared, Crossbeam requires an extremely high level of confidence in order to consider two records as a "match."

In other words, false positives (when two unrelated records are incorrectly declared a match) are far worse than false negatives (when two matching records are incorrectly declared a non-match), and our algorithm is weighted accordingly.

Customers cannot customize or modify the Crossbeam matching algorithm.

We compare multiple properties on any given record to develop a confidence score, but we place an extremely strong emphasis on properties that are unique to a given person or company. Let's explore a few data points that are important to matches.
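To make the idea of a confidence score concrete, here is a minimal sketch in Python. It is not Crossbeam's actual algorithm: the property names, the weights, the threshold, and the way evidence is combined (a simple noisy-OR) are all assumptions chosen for illustration. It only shows how unique properties can dominate the score while a high threshold keeps false positives rare.

```python
# Hypothetical weights: how strongly agreement on each property suggests a match.
# Unique properties (domain, email) are weighted far above non-unique ones (name).
WEIGHTS = {"domain": 0.95, "email": 0.95, "name": 0.30}

# Hypothetical threshold, set high so that false positives stay rare.
MATCH_THRESHOLD = 0.90


def confidence(agreeing_properties):
    """Combine evidence with a noisy-OR: 1 - product of (1 - weight)."""
    remaining_doubt = 1.0
    for prop in agreeing_properties:
        remaining_doubt *= 1.0 - WEIGHTS.get(prop, 0.0)
    return 1.0 - remaining_doubt


def is_match(agreeing_properties):
    """Declare a match only when the combined confidence clears the threshold."""
    return confidence(agreeing_properties) >= MATCH_THRESHOLD
```

With these assumed numbers, agreement on a domain alone clears the threshold, while agreement on a name alone does not.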

Domain Names

Domain names are a source of high-confidence company matches, as no two companies can have the same domain. We run domain names through a standardization process so that inconsistencies in formatting don't cause matching records to be missed. We also keep track of a growing number of cases where multiple domains are owned by the same company so that indirect matches can be made.
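As an illustration of what domain standardization can look like, the Python sketch below reduces a website or domain string to a bare lowercase hostname and then resolves known aliases. The specific rules (dropping the scheme, path, port, and a leading "www.") and the DOMAIN_ALIASES table are assumptions for the example, not a description of Crossbeam's actual process.

```python
from urllib.parse import urlsplit

# Hypothetical alias table: secondary domains mapped to a company's primary domain.
DOMAIN_ALIASES = {"example.org": "example.com"}


def normalize_domain(raw):
    """Reduce a website or domain string to a bare, lowercase hostname."""
    value = raw.strip().lower()
    # urlsplit only recognizes a hostname when a scheme or "//" precedes it.
    if "://" not in value:
        value = "//" + value
    host = urlsplit(value).hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    return host.rstrip(".")


def canonical_domain(raw):
    """Normalize a domain, then map known aliases to the primary domain."""
    domain = normalize_domain(raw)
    return DOMAIN_ALIASES.get(domain, domain)
```

Under these rules, "https://WWW.Example.com/about" and "example.com" compare as equal, and "example.org" resolves to the same company through the alias table.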

Email Address

Email addresses are a source of high-confidence person matches, as no two people can have the same email address. These addresses are run through a cleansing and standardization process similar to the one used for domains. Emails also have the added benefit of helping with company match resolution, as we can often determine that two companies match because they have matching people. When certain quality conditions are met, we can also use the domain name of contacts as a matching property for companies.
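The sketch below shows, in Python, one way email cleansing and the contact-domain idea could work. The quality conditions themselves are not spelled out here, so the rule used in the example (ignore addresses at shared free-mail providers) is an assumption, as are the function names and the FREE_MAIL_DOMAINS set. Lowercasing the whole address is also a simplification.

```python
# Hypothetical quality condition: domains shared by many unrelated people
# say nothing about which company a contact belongs to.
FREE_MAIL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com"}


def normalize_email(raw):
    """Trim surrounding whitespace and lowercase the address (a simplification)."""
    return raw.strip().lower()


def company_domain_from_email(raw):
    """Return the contact's domain if it is usable as a company property, else None."""
    local, separator, domain = normalize_email(raw).rpartition("@")
    if not separator or not local or not domain:
        return None  # not a well-formed address
    if domain in FREE_MAIL_DOMAINS:
        return None  # fails the assumed quality condition
    return domain
```

For example, a contact at "Ada@Example.com" would contribute "example.com" as a candidate company domain, while a contact at a free-mail provider would contribute nothing.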

Real-World ("Meatspace") Names

Names alone, whether identical or (worse yet) merely similar, are a poor source of matches. Without a secondary characteristic to validate the match, a simple name-based comparison typically does not provide the confidence we need to make a match determination. Crossbeam's matching algorithm is able to do name-based matching because it uses additional dimensions to corroborate the name.
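The following Python sketch illustrates the principle that a name only counts when something else backs it up. The record shape, the name-comparison rule, and the choice of a shared contact email as the corroborating dimension are all assumptions for the example. The additional dimensions Crossbeam actually uses are not specified here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CompanyRecord:
    """Hypothetical record shape used only for this illustration."""
    name: str
    contact_emails: frozenset = frozenset()


def name_key(name):
    """Compare names loosely: ignore case, punctuation, and spacing."""
    return re.sub(r"[^a-z0-9]+", "", name.casefold())


def is_name_match(a, b):
    """Accept a name-based match only when a second dimension corroborates it."""
    if name_key(a.name) != name_key(b.name):
        return False
    # Assumed corroboration: the two records share at least one contact email.
    return bool(a.contact_emails & b.contact_emails)
```

Under this rule, "Acme, Inc." and "ACME Inc" match only if the two records also share a contact. The same two names with no shared contact are treated as a non-match.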

Matching Engine

Our matching engine gets smarter every month, as the amount of training data and special situations we see increases. As such, you may see occasional minor shifts in the match rates between data sets. This is normal and is always associated with an increase in the quality of the matching methodology.

FAQ

Can you match using DUNS numbers?

Crossbeam cannot create a match using DUNS numbers.

Can you match using other dimensions?

Not currently. However, our matching engine keeps getting more intelligent as we add more matching dimensions.
