Data preprocessing is a crucial step in machine learning because it directly affects the accuracy of the model. Raw data often contains noise and missing values, is incomplete, or arrives in a format that machine learning models cannot use directly. If we train on questionable, dirty data, what will the final result be, and can the resulting decisions be trusted? That is why we preprocess the data: the goal is to obtain more meaningful data that can be trusted.
Cvetanka is an efficient database developer with extensive knowledge of high-availability SQL Server solutions. She is an adaptable professional with a background in workflow processes, creating database objects, and overseeing security tasks. She has expertise in ETL and data warehousing, including data management.
This talk was presented during Data Science Conference Europe 2020 as a part of the Data & AI Research track.