Proceedings of the 4th World Conference on Management and Economics
Address Management in the Logistics Industry: Address Singularization with Text Similarity Algorithms
Ahmet Ustaoğlu, Hasan Güney, Ahmet Yesevi Türker, Tuğçe Elçi, Deniz Kantar
In logistics and supply chain management, addresses are fundamental elements of the management systems that facilitate the delivery of goods from origin to destination. The logistics sector aims to establish a simple, easily understood, and accurate address repository. The expansion of logistics operations constantly leads to the creation of new addresses, but having too many addresses that reference the same geographical location can negatively affect route optimization. In addition, wrong addresses can incur substantial costs because shipments must be re-delivered. In this research, we present the architecture and implementation of an address management algorithm developed by the Borusan Logistics AI team, aimed at eliminating redundant and ambiguous address entries. The approach comprises two parts: the singularization of addresses and the quality control of addresses. The singularization part consists of deduplicating addresses and identifying ambiguous or unclear ones. In this part, a Grid Search algorithm is developed to detect duplicated and ambiguous addresses. The control part then selects the most effective algorithms for limiting pollution of the address pool. In this part, similarity scores from the Jaro-Winkler, Hamming, Levenshtein Distance, FuzzyWuzzy, String Score, and Sequence Matching algorithms are calculated and compared. Finally, to complete the smart decision algorithm, the best-performing algorithms are combined and used to calculate the final similarity score of each address. As a result, an end-to-end address management system was built that not only deduplicates address entries but also controls them, ensuring the integrity of the address pool.
Keywords: Address Similarity, Data Deduplication, Address Management, Grid Search, String Similarity Algorithms
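To make the idea of a combined similarity score concrete, the following is a minimal sketch and not the authors' implementation. It blends two of the measures named in the abstract, Levenshtein distance and sequence matching, using only the Python standard library. The normalization step, the equal weights, the 0.85 threshold, and all function names are illustrative assumptions; Jaro-Winkler, Hamming, FuzzyWuzzy, and String Score are omitted to keep the example self-contained.

```python
from difflib import SequenceMatcher


def normalize(address: str) -> str:
    """Lowercase and collapse whitespace (a hypothetical cleaning step)."""
    return " ".join(address.lower().split())


def levenshtein_similarity(a: str, b: str) -> float:
    """Return 1 - (edit distance / longer length); 1.0 means identical."""
    if not a and not b:
        return 1.0
    # Classic dynamic-programming edit distance, keeping one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b))


def sequence_similarity(a: str, b: str) -> float:
    """Similarity ratio from difflib's SequenceMatcher, in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def combined_score(a: str, b: str, weights=(0.5, 0.5)) -> float:
    """Weighted blend of the two measures; the weights are assumed, not from the paper."""
    a, b = normalize(a), normalize(b)
    w_lev, w_seq = weights
    return w_lev * levenshtein_similarity(a, b) + w_seq * sequence_similarity(a, b)


def is_duplicate(a: str, b: str, threshold: float = 0.85) -> bool:
    """Flag a candidate pair as a duplicate; the threshold is an assumed value."""
    return combined_score(a, b) >= threshold
```

In such a scheme, a new entry would be compared against existing entries in the address pool and flagged for merging or review when `is_duplicate` returns true. In practice the weights and threshold would have to be tuned on labeled address pairs.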