Big data stream analysis: a systematic literature review

Kolajo, Taiwo and Daramola, Olawande and Adebiyi, A. A. (2019) Big data stream analysis: a systematic literature review. Journal of Big Data, 6.

Recently, big data streams have become ubiquitous because many applications generate huge amounts of data at great velocity. This has made it difficult to apply existing data mining tools, technologies, methods, and techniques directly to big data streams, owing to the inherent dynamic characteristics of big data. This paper presents a systematic review of big data stream analysis that employs a rigorous and methodical approach to examine trends in big data stream tools and technologies, as well as the methods and techniques used to analyse big data streams. It provides a global view of big data stream tools and technologies and compares them. Three major databases, Scopus, ScienceDirect and EBSCO, which index journals and conferences promoted by entities such as IEEE, ACM, SpringerLink, and Elsevier, were explored as data sources. Out of the initial 2295 papers that resulted from the first search string, 47 papers were found to be relevant to our research questions after applying the inclusion and exclusion criteria. The study found that scalability, privacy and load balancing issues, as well as empirical analysis of big data streams and technologies, are still open for further research efforts. We also found that although significant research efforts have been directed to real-time analysis of big data streams, not much attention has been given to the preprocessing stage of big data streams. Only a few big data streaming tools and technologies can do all of the batch, streaming, and iterative jobs; there seems to be no big data tool or technology that offers all the key features currently required, and a standard benchmark dataset for big data streaming analytics has not been widely adopted.
In conclusion, it is recommended that research efforts be geared towards developing scalable frameworks and algorithms that accommodate the data stream computing mode, effective resource allocation strategies, and parallelization issues, in order to cope with the ever-growing size and complexity of data.

Item Type: Article
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Date Deposited: 23 Sep 2019 09:20
Last Modified: 23 Sep 2019 09:20
