Computer Science Colloquia
Tuesday, December 15, 2015
Advisor: Kevin Skadron
Attending Faculty: Worthy Martin (Chair); Gabriel Robins, Yanjun Qi
2:00 PM, Rice Hall, Rm. 242
PhD Qualifying Exam Presentation
Entity Resolution Acceleration using Micron's Automata Processor
Entity Resolution (ER), the process of finding identical entities across different databases, is critical to many information integration applications. As database sizes explode in the big-data era, recognizing identical entities across multiple databases, while allowing for variations among records, becomes computationally expensive. Profiling results show that approximate matching is the primary bottleneck. Micron's Automata Processor (AP), an efficient and scalable semiconductor architecture for parallel automata processing, provides a new opportunity for hardware acceleration of ER. We propose an AP-accelerated ER solution that targets this bottleneck, the fuzzy matching of similar but potentially inexactly matched names, and we use a real-world application to illustrate its effectiveness. Results show 121x to 4200x speedups for matching one record, with better accuracy (9.2% more correct pairs and 43% lower generalized merge distance cost) than the existing CPU method. The proposed method runs even faster with an improved algorithm.
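
To make the fuzzy-matching bottleneck concrete, here is a minimal CPU-side sketch of approximate name matching using Levenshtein edit distance. It is an illustration only, not the method presented in the talk: the function names `edit_distance` and `fuzzy_match`, the `max_dist` threshold, and the sample names are all assumptions introduced here, and the abstract does not specify which similarity measure the CPU baseline or the AP design uses.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def fuzzy_match(name, candidates, max_dist=2):
    """Return the candidates within max_dist edits of name (case-insensitive).

    max_dist is a hypothetical threshold chosen for illustration.
    """
    key = name.lower()
    return [c for c in candidates if edit_distance(key, c.lower()) <= max_dist]


if __name__ == "__main__":
    # Hypothetical records, not data from the talk.
    records = ["John Smith", "Jon Smyth", "Jane Doe"]
    print(fuzzy_match("Jon Smith", records, max_dist=1))
```

Each query in this sketch is compared against every candidate one at a time, at a cost proportional to the product of the two string lengths, which suggests why approximate matching can come to dominate run time as databases grow.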