Using Fuzzy Logic Technique to Eliminate the Duplicates in Large Database


  • Mortadha M. Hamad College of Computer, University of Anbar, Ramadi, Iraq
  • Alaa Abdulqahar Jihad College of Computer, Anbar University, Ramadi, Iraq



Duplicate, data quality, data set, fuzzy logic


Duplicate records are a widespread problem in many databases. Considerable effort has been devoted to eliminating duplicates from data sets, because this is an important part of data cleaning. This paper focuses on discovering and removing duplicates using a fuzzy logic technique.

