Matching Showcase
Matching across two or more lists/databases.
Want to check for duplicates between two or more lists?
You can use match methods to find duplicated entries that exist in two or more lists that you are comparing. See below to learn how Kleber uses fuzzy logic and phonetics to find even the tricky duplicated entries!
Overview
Comparing two or more lists usually means identifying the duplicates, either so that you can exclude them from contact or so that you can remove them from a combined database to produce one clean dataset. For example, you may have bought a marketing list and need to know whether any of your clients already appear in it. You could be merging two databases from separate sources (e.g. your sales and admin databases). Or you may need to check whether a client is on a ‘Do not contact’ list to ensure you meet direct marketing rules. Duplicates aren’t necessarily easy to find in your dataset because of:
- the same person entered multiple times at the same address with slight spelling mistakes in the name, business name or address
- the same person entered multiple times at different addresses
- the same person entered multiple times at different businesses
- multiple people entered at the same address (which matters if you only want to send one item to each household or business).
This is where Kleber can help! It uses fuzzy logic, phonetics and intelligent weighting to identify as many duplicates as possible.
How was this demo created?
The image above shows a simplified example of what is really occurring when you use a Kleber Match method.
The method chosen for this example is DataTools.Match.BusinessNameAndAddress.Au.CreateKeys, but the process is the same for all match methods.
First, enter the details required (in this case the business name and address) and run the method. Kleber will generate unique match keys to identify the different elements of the business name and address.
The Kleber CreateKeys methods generate identical keys in three steps: first, they identify the elements of the supplied data using advanced parsing; next, they process each element according to its meaning; finally, they combine the processed elements to create the final match key.
Creating match keys this way means that identical keys are created even if the data is provided in different fields, in a different order, formatted differently or misspelt. With identical keys, finding matches is as easy as finding where the keys match exactly. No “like” matching is required because all the “fuzziness” has already been taken into account when the key is generated.
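To make the idea concrete, here is a minimal Python sketch of the general technique: normalise the input, then compare keys for exact equality. It is not Kleber's algorithm and does not call Kleber. The function names toy_match_key and find_cross_list_matches and the abbreviation table are hypothetical, and this toy key only copes with differences in case, punctuation, common abbreviations and field order, not misspellings.

```python
import re

# Hypothetical abbreviation table for illustration only;
# a real parser would recognise far more variants.
ABBREVIATIONS = {
    "street": "st",
    "road": "rd",
    "avenue": "ave",
    "limited": "ltd",
    "proprietary": "pty",
    "and": "&",
}


def toy_match_key(*fields):
    """Build a crude match key from any number of input fields."""
    text = " ".join(fields).lower()
    # Keep letters, digits and '&'; everything else becomes a space.
    text = re.sub(r"[^a-z0-9&]+", " ", text)
    # Map each word to its abbreviated form where one is known.
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in text.split()]
    # Sorting the unique tokens makes the key independent of field order.
    return "|".join(sorted(set(tokens)))


def find_cross_list_matches(list_a, list_b):
    """Return (record_a, record_b) pairs whose toy keys are identical."""
    by_key = {}
    for rec in list_a:
        by_key.setdefault(toy_match_key(*rec), []).append(rec)
    # An exact dictionary lookup replaces any "like" comparison.
    return [(a, b) for b in list_b for a in by_key.get(toy_match_key(*b), [])]
```

With this sketch, ("Smith and Sons Pty Limited", "12 High Street", "Sydney NSW 2000") and ("12 HIGH ST., SYDNEY, NSW 2000", "Smith & Sons Pty. Ltd.") produce the same key even though the fields arrive in a different order and format, so a plain equality check finds the match.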
Specific example of match keys
If we look specifically at the two entries below you can see the actual input given and output provided using Kleber.
The DataTools.Match.BusinessNameAndAddress.Au.CreateKeys method used in this example creates nine different keys, depending on how closely or loosely matched you need the business name and/or address to be.
You could use all or any of the keys provided to find a match.
The keys could have been generated when you created the record, when you changed a record, or as part of a batch exercise to find all matches. We suggest that once you create the keys, you save them in your database alongside your data to speed up any future matching requirements.
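As a rough illustration of storing keys with your data, the Python sketch below keeps a key in an indexed column and finds cross-list matches with an ordinary equality join. The table names, column names and the key values ("KEY-A" and so on) are placeholders invented for this example. They are not real Kleber output, and the schema is only one possible layout.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two lists and stored match keys."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT, match_key TEXT);
        CREATE TABLE marketing_list (id INTEGER PRIMARY KEY, name TEXT, match_key TEXT);
        -- Indexing the key column keeps later exact-match lookups fast.
        CREATE INDEX idx_clients_key ON clients (match_key);
        CREATE INDEX idx_marketing_key ON marketing_list (match_key);
    """)
    # Placeholder keys: in practice these would be saved when the record
    # is created, changed, or processed in a batch run.
    conn.executemany(
        "INSERT INTO clients (name, match_key) VALUES (?, ?)",
        [("Smith & Sons Pty Ltd", "KEY-A"), ("Harbour Cafe", "KEY-B")],
    )
    conn.executemany(
        "INSERT INTO marketing_list (name, match_key) VALUES (?, ?)",
        [("Smith and Sons", "KEY-A"), ("Northside Plumbing", "KEY-C")],
    )
    conn.commit()
    return conn


def clients_on_marketing_list(conn):
    """Return (client name, list name) pairs that share a stored key."""
    rows = conn.execute("""
        SELECT c.name, m.name
        FROM clients AS c
        JOIN marketing_list AS m ON m.match_key = c.match_key
        ORDER BY c.id, m.id
    """)
    return rows.fetchall()
```

Because the keys are already stored, the join needs no fuzzy comparison at query time. Here only the two "Smith" records share a key, so they are the only pair returned.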
What are the differences between Tight, Standard and Loose Keys?
Match keys are created in three varieties: Tight, Standard and Loose.
Tight keys allow for little difference between matches. Tight keys are useful for matching where no user interaction is available.
Loose keys will identify many more matches, some of which may be questionable, but they help identify the last few percent of matches that are difficult to find. Loose keys should never be used without user interaction to verify matches.
Standard keys are a good balance between Tight and Loose keys and, depending on the outcome required, may be used without user interaction.
It’s best to test the keys on large quantities of data and verify the results to see which key, or which combination of keys, best suits your business requirements.
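The trade-off can be explored with a small experiment. The Python sketch below defines three toy business-name keys of increasing looseness and lists the pairs each one matches. These functions are illustrative assumptions only: they are not how Kleber builds its Tight, Standard and Loose keys, and the noise-word list and the crude consonant "skeleton" are invented for the example.

```python
import re
from itertools import combinations

# Hypothetical words ignored by the looser toy keys.
NOISE_WORDS = {"pty", "ltd", "limited", "the", "co"}


def _tokens(name):
    """Lowercase the name and split it on anything that is not a letter or digit."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).split()


def tight_key(name):
    """Ignore only case, punctuation and spacing."""
    return " ".join(_tokens(name))


def standard_key(name):
    """Additionally ignore common company suffixes and filler words."""
    return " ".join(t for t in _tokens(name) if t not in NOISE_WORDS)


def _skeleton(word):
    """Keep the first character, drop later vowels, collapse repeated letters."""
    rest = re.sub(r"[aeiouy]", "", word[1:])
    return re.sub(r"(.)\1+", r"\1", word[0] + rest)


def loose_key(name):
    """Additionally reduce each word to a crude consonant skeleton."""
    return " ".join(_skeleton(t) for t in _tokens(name) if t not in NOISE_WORDS)


def matched_pairs(names, key_func):
    """Return every pair of names that share the same key."""
    return [(a, b) for a, b in combinations(names, 2) if key_func(a) == key_func(b)]
```

For the names "Miller Plumbing Pty Ltd", "MILLER PLUMBING PTY. LTD.", "Miller Plumbing", "Millar Plumbing" and "Muller Plumbing", the tight toy key pairs only the first two, the standard toy key also pairs them with "Miller Plumbing", and the loose toy key pairs all five. "Muller Plumbing" may well be a different business, which is the kind of questionable match that calls for a person to verify the results.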
Click here for a more detailed explanation of the match keys.
License considerations
The match methods are proprietary to DataTools. Users should read the DataTools terms and conditions to make sure their intended use complies. They can be found here.
About Kleber
Kleber is a software platform from DataTools that contains a number of methods for working with data in different ways, including data verification, capture, parsing, repair, matching, geocoding, enhancement and more. These methods can be used in isolation or in combination to create complete solutions.
We’ve also partnered with a number of different companies to deal with various types of data – such as addresses (national and international), email addresses, phone numbers, geocodes etc.
To learn more about Kleber, click here.