Online seminar: Learn how to master data deduplication in your system and manage health records consistently.
Join RLIG at the Joint Statistical Meetings (JSM) conference on Thursday, August 11, from 7:00 to 8:00 AM in room CC-210.
David Beauchemin will discuss how duplicate detection can be used with AI and NLP techniques to extract information from external data sources using text distance metrics (e.g. Jaro) and classification algorithms. He will use a case study in insurance to demonstrate the proposed approach.
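For readers unfamiliar with the Jaro metric mentioned above, here is a minimal pure-Python sketch of the standard Jaro similarity. It is an illustration only, not the implementation used in the talk; the function name `jaro` is chosen here for the example.

```python
def jaro(s1: str, s2: str) -> float:
    """Return the Jaro similarity of two strings, between 0.0 and 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are equal and no farther
    # apart than this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Walk the matched characters in order and count those out of sequence;
    # the transposition count is half that number.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2

    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


print(round(jaro("MARTHA", "MARHTA"), 4))  # 0.9444
```

A score near 1 suggests the two strings are likely variants of the same value; the threshold for treating a pair as a duplicate is application-specific.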
In this talk, Mr. Resnick will give an overview of the Fellegi-Sunter approach, explaining how candidate pairs are evaluated under it. He will also cover extensions and modifications to it.
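As a rough illustration of how a candidate pair can be scored under the Fellegi-Sunter model, the sketch below sums per-field log-likelihood ratios, assuming conditional independence between fields. The field names and the m- and u-probabilities are hypothetical values chosen for the example, and the talk may present the method differently.

```python
import math

# Hypothetical parameters (not taken from the talk):
# m[f] = P(field f agrees | the two records are a true match)
# u[f] = P(field f agrees | the two records are not a match)
m = {"surname": 0.9, "birth_year": 0.8}
u = {"surname": 0.1, "birth_year": 0.2}


def match_weight(agreements: dict, m: dict, u: dict) -> float:
    """Sum the log2 likelihood ratios over fields for one candidate pair."""
    total = 0.0
    for field, agrees in agreements.items():
        if agrees:
            total += math.log2(m[field] / u[field])
        else:
            total += math.log2((1 - m[field]) / (1 - u[field]))
    return total


# A pair agreeing on both fields gets a positive weight; a pair
# disagreeing on both gets a negative one.
both_agree = match_weight({"surname": True, "birth_year": True}, m, u)
both_differ = match_weight({"surname": False, "birth_year": False}, m, u)
print(both_agree > 0, both_differ < 0)  # True True
```

In practice, pairs whose weight exceeds an upper threshold are declared links, those below a lower threshold are declared non-links, and the remainder are sent to clerical review.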
Roee Gutman will present work that views record linkage as a missing data problem, and he will describe Bayesian procedures that utilize data features frequently encountered in public health applications.
Data Analysis after Record Linkage: sources of error, consequences, and possible solutions, Dr. Martin Slawski, Department of Statistics, Volgenau School of Engineering, George Mason University
Deepparse: a state-of-the-art Python library for parsing multinational street addresses using deep learning.
For February’s Linkage Seminar, Amy O’Hara will be joined by Thais Menezes, a PhD student at the SFI Centre for Research Training in Foundations of Data Science, University College Dublin. Thais’ work incorporates household structure into record linkage matching in the Ireland census database to make matching individuals easier and more accurate.
The Duke Graduate and Professional Student Government (GPSG) Community Pantry is a student-operated food pantry serving the student community at Duke University. In this post, I describe the record linkage system used at the Pantry to identify individual customers and obtain their order history. The system uses a Python module for deterministic record linkage, along with model evaluation techniques, which I describe in detail.
Welcome to the blog of ASA's Record Linkage Interest Group.