Crossbeam finds overlaps in the data sets of disparate companies, which leads to an obvious problem: How do we decide when two records are a match? That's where our matching algorithm comes in.

The guiding force behind Crossbeam's matching algorithm is the concept of confidence. Because a match often results in data being shared, Crossbeam requires an extremely high level of confidence in order to consider two records as a "match."

In other words, false positives (when two unrelated records are incorrectly declared a match) are far worse than false negatives (when two matching records are incorrectly declared a non-match), and our algorithm is weighted accordingly.

Customers cannot customize or modify the Crossbeam matching algorithm.

We compare multiple properties on any given record to develop a confidence score, but we place an extremely strong emphasis on properties that are unique to a given person or company. Let's explore a few data points that are important to matches.

Domain Names

Domain names are a source of high-confidence company matches, as no two companies can have the same domain. We run domain names through a standardization process so that inconsistencies in formatting don't lead to incorrect match decisions. We also track a growing set of cases where multiple domains are owned by the same company, so that indirect matches can be made. Things to note about domain names:

Crossbeam will strip out anything after the top-level domain (TLD), e.g. google.com will match google.com/en
Subdomains are not stripped out and will not match the main domain alone, e.g. flights.google.com will not match google.com
Capitalization and slashes do not matter, e.g. GoOgle.com will match google.com
TLD differences (.com vs .net) will be treated as separate accounts, e.g. google.com will not match google.net
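
The rules above can be sketched in a few lines of Python. This is only an illustration of the listed behavior, not Crossbeam's actual implementation; the function names normalize_domain and domains_match are hypothetical.

```python
from urllib.parse import urlsplit


def normalize_domain(raw: str) -> str:
    """Hypothetical normalizer that mirrors the rules listed above."""
    value = raw.strip().lower()  # capitalization does not matter
    # urlsplit only recognizes the host when "//" is present,
    # so add it for bare values such as "google.com/en".
    if "//" not in value:
        value = "//" + value
    host = urlsplit(value).hostname or ""  # drops the path, e.g. "/en"
    return host.rstrip(".")


def domains_match(a: str, b: str) -> bool:
    """Two domains match only if their normalized forms are identical."""
    na, nb = normalize_domain(a), normalize_domain(b)
    return na != "" and na == nb


print(domains_match("google.com", "google.com/en"))       # path is ignored
print(domains_match("GoOgle.com", "google.com"))          # case is ignored
print(domains_match("flights.google.com", "google.com"))  # subdomain is kept
print(domains_match("google.com", "google.net"))          # TLD must agree
```

Because the subdomain and the TLD are both left in the normalized value, the last two comparisons do not match, in line with the notes above.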

Email Address

Email addresses are a source of high-confidence person matches, as no two people can have the same email address. These addresses are run through a cleansing and standardization process similar to the one used for domains. Emails have the added benefit of helping with company match resolution, as we can often determine that two companies match because they have matching people. When certain quality conditions are met, we can also use the domain name of contacts as a matching property for companies.
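
As a rough sketch of the idea, the Python below standardizes an email address and derives a candidate company domain from it. Everything here is an assumption for illustration: the function names are hypothetical, and the "quality condition" shown (ignoring a small list of free mail providers) is only one plausible example, not Crossbeam's documented rule.

```python
from typing import Optional

# Illustrative only: a real list of free mail providers would be far longer.
FREE_MAIL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com"}


def normalize_email(raw: str) -> Optional[str]:
    """Trim whitespace and lowercase; return None if the value has no usable '@'."""
    value = raw.strip().lower()
    local, sep, domain = value.rpartition("@")
    if not sep or not local or not domain:
        return None
    return f"{local}@{domain}"


def company_domain_from_email(raw: str) -> Optional[str]:
    """Return the contact's domain as a company signal, or None if unusable."""
    email = normalize_email(raw)
    if email is None:
        return None
    domain = email.rpartition("@")[2]
    # Assumed quality condition: a free mail domain says nothing about the employer.
    if domain in FREE_MAIL_DOMAINS:
        return None
    return domain


print(normalize_email("  Jane.Doe@Example.COM "))
print(company_domain_from_email("jane@acme.com"))
print(company_domain_from_email("jane@gmail.com"))
```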

Real-World ("Meatspace") Names

Names alone, whether identical or (worse yet) merely similar, are a poor source of matches. Without a secondary characteristic to validate the match, a simple name-based comparison typically does not provide the confidence we need to make a match determination. Crossbeam's matching algorithm is able to do name-based matching only because it also uses additional dimensions.
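
To make the point concrete, here is a minimal sketch in which a name match only counts when a second signal corroborates it. The CompanyRecord type, the choice of shared contact emails as the corroborating signal, and the function names are all hypothetical; the real algorithm's dimensions and weighting are not described here.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class CompanyRecord:
    """Hypothetical record shape used only for this example."""
    name: str
    contact_emails: FrozenSet[str] = frozenset()


def normalize_name(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(name.lower().split())


def name_match_with_corroboration(a: CompanyRecord, b: CompanyRecord) -> bool:
    """Accept a name match only if the records also share a contact email."""
    if normalize_name(a.name) != normalize_name(b.name):
        return False
    shared = {e.lower() for e in a.contact_emails} & {e.lower() for e in b.contact_emails}
    return bool(shared)  # the name alone is never enough


acme_a = CompanyRecord("Acme  Corp", frozenset({"jane@acme.com"}))
acme_b = CompanyRecord("acme corp", frozenset({"Jane@acme.com"}))
other = CompanyRecord("Acme Corp", frozenset({"bob@acme-plumbing.example"}))

print(name_match_with_corroboration(acme_a, acme_b))  # same name, shared contact
print(name_match_with_corroboration(acme_a, other))   # same name, nothing shared
```

Erring toward "no match" when corroboration is missing reflects the stated preference for false negatives over false positives.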

Matching Engine

Our matching engine gets smarter every month as the volume of training data and the number of special situations we see increase. As such, you may see occasional minor shifts in the match rates between data sets. This is normal and is always associated with an increase in the quality of the matching methodology.

FAQ

Can you match using DUNS numbers?

Crossbeam cannot create a match using DUNS numbers.

Can you match using other dimensions?

Not currently, no, but our matching engine is always getting more intelligent as we add more matching dimensions.
