2018 International Congress on Big Data, Deep Learning and Fighting Cyber Terrorism, IBIGDELFT 2018, Ankara, Turkey, 3-4 December 2018, pp. 45-50
Financial data analysis, which serves as both a road map for the future and a mirror of the present, is of vital importance for many institutions. It is therefore common to apply statistical analysis to financial data. Data size matters greatly in such analysis: as the size and variety of the data increase, one can achieve a more accurate analysis outcome. However, growth in data size also brings disadvantages, such as performance loss when processing large-scale data. These disadvantages affect both query performance and the various functions used in data analysis. It is therefore necessary to compare data storage platforms in order to identify those that deliver the highest performance for the queries and statistical functions used in financial data analysis on large-scale financial data sets. To this end, the first step of this study compared query performance on relational and NoSQL storage environments, and compared query performance on single-node and two-node in-memory NoSQL data storage environments. For testing, MSSQL was selected as the SQL database system and Hazelcast as the distributed in-memory NoSQL database system. The run times of the query and statistical functions were measured on these platforms for different data sizes. The map-reduce programming model was used to examine the ability of in-memory NoSQL data storage platforms to manage and manipulate data. Performance tests on single and multiple nodes show that in-memory NoSQL platforms perform very well compared to relational database systems. In addition, in-memory NoSQL storage platforms were found to provide higher performance gains when the map-reduce programming model is used.
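As an illustration of how a statistical function can be expressed in the map-reduce model mentioned above, the following is a minimal sketch in plain Python. It does not use Hazelcast or reproduce the study's implementation; the function names, the two-partition layout, and the sample prices are all hypothetical. Each partition (standing in for a node) is mapped to a small summary, and the summaries are reduced into a global mean and population standard deviation.

```python
from functools import reduce


def map_partition(values):
    """Map step: summarize one partition as (count, sum, sum of squares)."""
    return (len(values), sum(values), sum(v * v for v in values))


def combine(a, b):
    """Reduce step: merge two partition summaries element-wise."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def mean_and_std(partitions):
    """Compute the global mean and population standard deviation."""
    n, s, sq = reduce(combine, map(map_partition, partitions), (0, 0.0, 0.0))
    mean = s / n
    # Clamp at zero to guard against tiny negative values from rounding.
    variance = max(sq / n - mean * mean, 0.0)
    return mean, variance ** 0.5


# Hypothetical closing prices split across two "nodes".
node_a = [10.0, 12.0, 11.0]
node_b = [13.0, 14.0]
mean, std = mean_and_std([node_a, node_b])
```

Because only the three-number summaries cross partition boundaries, the raw data never has to be gathered in one place, which is the property that makes this style of computation suit a multi-node store. The sum-of-squares formula is used here for brevity; it can lose precision on large or tightly clustered values.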