[DSC 4.0] New Data Science Framework for Analysing and Mining Big Data - Charith Silva

Published: 21 March 2025
on channel: Data Science Conference

With the development of advanced remote sensing and communication technology, new sources of data have emerged in many industries, such as finance, marketing, transport and utilities. These new types of datasets are received continuously and at very high speed. Researchers in academia and industry have made many efforts to increase the value of big data and to make meaningful use of it through data science. Having a good process for data mining and machine learning, along with clear guidelines, is always a plus for any data science project. It also helps to focus the required time and resources early in the process to get a clear idea of the business problem to be solved.

Hence, a framework is proposed to support the data science project lifecycle and bridge the gap between business needs and technical realities.

The main motivation for building this new framework is to address big data analysis challenges and reduce the complexity of any big data related data science project. Recent improvements in technology demand real-time data processing, analytics and visualization to gain the competitive advantage of real-time decision making. A careful examination and analysis of the related literature reveals a variety of issues in Big Data processing and analysis. Therefore, this research presents a new Big Data analytics and processing framework for data acquisition, data fusion, data storing, managing, processing, analysing, visualising and modelling. Often the purpose of data analysis is not only to identify patterns but also to build models, if possible by gaining an understanding of the underlying process.

We believe that without a proper coordination and structuring framework there is likely to be much overlap and duplication amongst project phases, which can cause confusion around the responsibilities of each project participant. A common mistake made in big data projects is rushing into data collection and data analysis, which leaves too little time for planning the amount of work involved in the project, understanding business requirements, or even defining the business problem properly.

Big data is available all around us in various formats, shapes and sizes. Understanding the relevance of each of these data sets to the business problem is key to the project's success. Also, big data has multiple layers of hidden complexity that are not visible by simple inspection. A poorly planned project can ruin the entire project and its findings in any organization. If the project does not clearly identify the appropriate level of complexity and granularity, then the chances are high that an erroneous result set will occur that distorts the expected analytical outputs.


This talk was presented by Mr. Charith Silva, Data Scientist at University of Salford-Manchester, during Data Science Conference 4.0, as part of the 4th Industrial Revolution track.

You can find the presentation for this talk at the following link:
https://www.slideshare.net/Insitute_o...

More info about Data Science Conference:
Website: http://datasciconference.com
Instagram: /datasciconf
Facebook: /datasciconference
Twitter: /datasciconf
Flickr: https://www.flickr.com/photos/data-sc...

To watch more new videos about data science, subscribe to our YouTube channel.