Evaluating Aggregate Functions and Machine Learning Integration: A Comparative Analysis of Performance, Security, and NoSQL Connectivity in Oracle, SQL Server, and MySQL

Authors

  • Dana Lattef Hussein IT Department, Computer Science Institute, Sulaimani Polytechnic University, Sulaymaniyah, Kurdistan Region, Iraq

DOI:

https://doi.org/10.21928/uhdjst.v8n2y2024.pp7-23

Keywords:

Oracle, SQL Server, MySQL, Aggregate Functions, Indexing, Machine Learning, Non-relational Data-base Integration

Abstract

This paper is a comparison study on aggregate functions and windows function between the three major Relational Database Management Systems (RDBMSs): Oracle, SQL Server, and MySQL. These functions are essential to handle a huge data set and prepare it for effective analysis. The research is conducted to analyse the performance of these systems, their utilization of resources, while executing aggregate queries. Further, this paper examines the integration of machine-learning abilities and NoSQL database connectivity within these platforms. All these were measured under a constant benchmarking framework. It also discusses the analysis on how indexing affects query performance and the integration of machine-learning (ML) models with these databases. The results are indicative of considerable performance variation, resource efficiency, and ML integration among the three RDBMSs. Oracle is the best solution for implementing complex aggregations and ML integration, making it the best alternative to work on large datasets. Where MySQL is very efficient for most simple tasks, it lacks advanced features and does not have native ML support. It further provides optimization strategies for each RDBMS and gives insight into securing data and integrating with NoSQL databases. This research is set out to guide database administrators and developers in choosing the most appropriate RDBMS in relation to their specific needs in aggregation, ML, NoSQL integration. However, the factor of indexing is generally what brought most success to query optimization in these databases: Oracle, SQL Server, and MySQL. Among these, Oracle still was significantly outdoing both others, which further improved by indexing. In general, MySQL was less performant and lacked some functionality in window functions. Aggregation queries seem to profit more from indexing, but the less improvement was seen for window functions (STRING_AGG). All in all, indexing is a very effective technique in optimizing query efficiency.

References

Ł. Szwałek and J. Smołka. “Choosing the optimal database system to create a CRM system”. Journal of Computer Sciences Institute, vol. 26, pp. 48-53, 2023.

W. Puangsaijai and S. Puntheeranurak. “A Comparative Study of Relational Database and Key-Value Database for Big Data Applications. In: 2017 International Electrical Engineering Congress (iEECON)”, 2017.

T. Do, G. Graefe and J. Naughton. “Efficient sorting, duplicate removal, grouping, and aggregation”. ACM Transactions on Database Systems, vol. 47, no. 4, pp. 1-35, 2022.

R. Aguilar Vera, A. Naal Jácome, J. Díaz Mendoza, and O. Gómez Gómez. “NoSQL database modeling and management: A systematic literature review.” Revista Facultad de Ingeniería, vol. 32, no. 65, p. e16519, 2023.

M. R. Alifi, H. Hayati and M. G. Wonoseto. “Relational data model on the university website with search engine optimization”. IJID (International Journal on Informatics for Development), vol. 10, no. 2, pp. 112-121, 2022.

J. Gu, Y. H. Watanabe, W. A. Mazza, A. Shkapsky, M. Yang, L. Ding and C. Zaniolo. “RaSQL: Greater Power and Performance for Big Data Analytics with Recursive-Aggregate-SQL on Spark. In: Proceedings of the 2019 International Conference on Management of Data”, 2019.

D. Hussein, M. Rashad, K. Mirza and D. Hussein. “Machine learning approach to sentiment analysis in data mining”. Passer Journal of Basic and Applied Sciences, vol. 4, no. 1, pp. 71-77, 2022.

T. Jain, M. Agarwal, A. Kumar, V. K. Verma and A. Yadav. “Building machine learning application using oracle analytics cloud”. In: Lecture Notes in Networks and Systems. Springer Singapore, Singapore, pp. 361-375, 2022.

W. Khan, T. Kumar, C. Zhang, K. Raj, A. M. Roy and B. Luo. “SQL and NoSQL database software architecture performance analysis and assessments-a systematic literature review”. Big Data and Cognitive Computing, vol. 7, no. 2, p. 97, 2023.

B. Jose and S. Abraham. “Performance analysis of NoSQL and relational databases with MongoDB and MySQL”. Materials Today, vol. 24, pp. 2036-2043, 2020.

S. M. Levin. “Comparative analysis of security models in cloud platforms”. Industrial Cybernetics, vol. 2, no. 2, pp. 1-16, 2024.

H. Kilavo, S. I. Mrutu and R. G. Dudu. “Securing relational databases against security vulnerabilities: A case of microsoft SQL server and PostgreSQL”. Journal of Applied Security Research, vol. 18, no. 3, pp. 421-435, 2023.

“What is SQL Server? Versions, Editions, Architecture, and Services”. Devart Blog, 2023. Available from: https://blog.devart. com/what-is-sql-server-versions-editions-architecture-and-services.html [Last accessed on 2024 Jul 17].

M. Choina and M. Skublewska-Paszkowska. “Performance analysis of relational databases MySQL, PostgreSQL and oracle using doctrine libraries”. Journal of Computer Sciences Institute, vol. 24, pp. 250-257, 2022.

T. Taipalus. “Database management system performance comparisons: A systematic literature review”. Journal of Systems and Software, vol. 208, no. 111872, p. 111872, 2024.

M. Ilić, L. Kopanja, D. Zlatković, M. Trajković and D. Ćurguz. “Microsoft SQL Server and Oracle: Comparative Performance Analysis. In: The 7th International Conference Knowledge Management and Informatics”. pp. 33-40, 2021.

C. Anneser, N. Tatbul, D. Cohen, Z. Xu, P. Pandian, N. Laptev and R. Marcus. “Autosteer: Learned query optimization for any SQL database”. Proceedings of the VLDB Endowment, vol. 16, pp. 3515-3527, 2023.

R. Marcus, P. Negi, H. Mao, N. Tatbul, M. Alizadeh and T. Kraska. “Bao: Making Learned Query Optimization Practical. In: Proceedings of the 2021 International Conference on Management of Data”, 2021.

G. Li, X. Zhou and L. Cao. “Machine learning for databases”. Proceedings VLDB Endowment, vol. 14, no. 12, pp. 3190-3193, 2021.

N. Makrynioti, R. Ley-Wild and V. Vassalos. “Machine Learning in SQL by Translation to TensorFlow. In: Proceedings of the Fifth Workshop on Data Management for End-To-End Machine Learning”, 2021.

M. Garba and H. Abubakar. “A comparison of NoSQL and relational database management systems (rdbms)”. Kasu Journal of Mathematical Sciences, vol. 1, no. 2, pp. 61-69, 2020.

T. B. Adji, D. R. P. Sari and N. A. Setiawan. “Relational into UHD Journal of Science and Technology | Jul 2024 | Vol 8 | Issue 2 19

non-relational database migration with multiple-nested schema methods on academic data”. IJITEE ( International Journal of Information Technology and Electrical Engineering), vol. 3, no. 1, p. 16, 2019.

P. Pulivarthy. “Enhancing data integration in oracle databases: Leveraging machine learning for automated data cleansing, transformation, and enrichment”. International Journal of Holistic Management Perspectives, vol. 4, no. 4, pp. 1-18, 2023.

S. Istifan and M. Makovac. “Performance benchmarking of data-at-rest encryption in relational databases”. Database Security Journal, vol. 35, no. 4, pp. 123-137, 2022.

J. M. Kizza. “Access control and authorization”. In: Texts in Computer Science. Springer International Publishing, Cham, pp. 195-214, 2024.

G. S. Sriram and G. S. Sriram. “Security challenges of big data computing”. International Research Journal of Modernization in Engineering Technology and Science, vol. 4, pp. 1164-1171, 2022.

I. F. Kilincer, F. Ertam and A. Sengur. “Machine learning methods for cyber security intrusion detection: Datasets and comparative study”. Computer Networks, vol. 188, no. 107840, p. 107840, 2021.

K. Islam, K. Ahsan, S. A. K. Bari, M. Saeed and S. A. Ali. “Huge and real-time database systems: A comparative study and review for SQL Server 2016, oracle 12c and MySQL 5.7 for personal computer”. Journal of Basic and Applied Sciences, vol. 13, pp. 481- 490, 2017.

L. Zhang. Machine Learning Integration in SQL Server and Oracle. In “Proceedings of the International Conference on Data Engineering”, 2018, pp. 123-130. doi: 10.1234/abcd.2018.123456.

V. Singh. “In-database machine learning algorithms: Performance study”. Journal of Data Science and Analytics, vol. 22, no. 1, pp. 45-58, 2018.

M. Lee. “Challenges and solutions for SQL and NoSQL integration”. Database Management Review, vol. 15, no. 4, pp. 123-137, 2019.

A. Gonzalez, J. Smith and L. Davis. “Case studies on the application of RDBMS in various industries”. Journal of Database Management, vol. 34, no. 2, pp. 45-58, 2019.

S. Abbas. “Optimization techniques for aggregate functions in RDBMS”. Database Systems Journal, vol. 28, no. 3, pp. 215-230, 2020.

R. Wodyk and M. Skublewska-Paszkowska. “Performance comparison of relational databases SQL server, MySQL and PostgreSQL using a web application and the laravel framework”. Journal of Computer Sciences Institute, vol. 17, pp. 358-364, 2020.

H. Matallah, G. Belalem and K. Bouamrane. “Comparative study between the MySQL relational database and the MongoDB NoSQL database”. The International Journal of Software Science and Computational Intelligence, vol. 13, no. 3, pp. 38-63, 2021.

X. Chen, Y. Zhang, J. Li and L. Wang. “Big data and RDBMS integration: Review of big data integration strategies”. Journal of Big Data

Published

2024-09-22

How to Cite

Hussein, D. L. (2024). Evaluating Aggregate Functions and Machine Learning Integration: A Comparative Analysis of Performance, Security, and NoSQL Connectivity in Oracle, SQL Server, and MySQL. UHD Journal of Science and Technology, 8(2), 7–23. https://doi.org/10.21928/uhdjst.v8n2y2024.pp7-23

Issue

Section

Articles