Data Pre-processing in R: Handling Missing Data

Published: 28 September 2024
Channel: Data Professor

In this video, I will show you how to handle missing data in your own data science project. This is the first video in a multi-part series on data pre-processing in R.

🌟 Buy me a coffee: https://www.buymeacoffee.com/dataprof...

⭕ Timeline
0:33 First part in Data pre-processing series
1:11 DHFR dataset
2:41 Outline of this episode
4:08 Open up RStudio or RStudio.cloud
4:15 Let's start
4:21 1. Load in the dataset
4:59 2. Check for missing data
5:48 3. Let's make the data dirty!
5:58 The custom function na.gen()
8:38 4. Check for missing data
9:08 What does is.na(dhfr) look like?
10:18 Let's look at rows containing NA
11:29 Let's find the NA in the data
12:45 5. Handling the missing data
12:54 5.1 Simply delete data samples containing NA
13:30 5.2 Perform imputation
16:59 Preview of next episode of this series (on Data pre-processing)
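
If you want to try the steps in the timeline before watching, here is a minimal sketch in base R. It is not the code from the video: the toy data frame stands in for the DHFR dataset, inject_na() is a hypothetical stand-in for the custom na.gen() function, and mean imputation is just one common choice for step 5.2 that I am assuming here.

```r
# Toy data frame standing in for the DHFR dataset (values are made up).
set.seed(42)
toy <- data.frame(
  Y  = factor(rep(c("active", "inactive"), each = 5)),
  x1 = rnorm(10),
  x2 = runif(10),
  x3 = rnorm(10, mean = 5)
)

# Hypothetical helper (not the video's na.gen()): set n randomly chosen
# cells of a data frame to NA, never picking the same cell twice.
inject_na <- function(data, n) {
  cells <- sample(nrow(data) * ncol(data), n)   # linear cell indices
  rows  <- (cells - 1) %% nrow(data) + 1        # row of each cell
  cols  <- (cells - 1) %/% nrow(data) + 1       # column of each cell
  for (k in seq_len(n)) data[rows[k], cols[k]] <- NA
  data
}

# 3. Make the data dirty (numeric columns only, so the class label stays intact)
num_cols <- c("x1", "x2", "x3")
dirty <- toy
dirty[num_cols] <- inject_na(toy[num_cols], 6)

# 4. Check for missing data
sum(is.na(dirty))                    # total number of NA cells (6 here)
colSums(is.na(dirty))                # NA count per column
dirty[!complete.cases(dirty), ]      # rows containing at least one NA
which(is.na(dirty), arr.ind = TRUE)  # row/column position of each NA

# 5.1 Simply delete data samples (rows) containing NA
clean_deleted <- na.omit(dirty)

# 5.2 Perform imputation: replace each NA with the mean of its column
# (assumed strategy; the median works the same way with median())
clean_imputed <- dirty
for (col in num_cols) {
  missing <- is.na(clean_imputed[[col]])
  clean_imputed[[col]][missing] <- mean(clean_imputed[[col]], na.rm = TRUE)
}
```

Deleting rows is the simplest option but throws away samples, while imputation keeps every row at the cost of filling in estimated values.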

The idea for this video was suggested in a comment by Marco Festugato

📎DATA: https://raw.githubusercontent.com/dat...
📎CODE: https://github.com/dataprofessor/code...
📎SLIDES: https://github.com/dataprofessor/slid...

⭕ Playlist:
Check out our other videos in the following playlists.
✅ Data Science 101: https://bit.ly/dataprofessor-ds101
✅ Data Science YouTuber Podcast: https://bit.ly/datascience-youtuber-p...
✅ Data Science Virtual Internship: https://bit.ly/dataprofessor-internship
✅ Bioinformatics: http://bit.ly/dataprofessor-bioinform...
✅ Data Science Toolbox: https://bit.ly/dataprofessor-datascie...
✅ Streamlit (Web App in Python): https://bit.ly/dataprofessor-streamlit
✅ Shiny (Web App in R): https://bit.ly/dataprofessor-shiny
✅ Google Colab Tips and Tricks: https://bit.ly/dataprofessor-google-c...
✅ Pandas Tips and Tricks: https://bit.ly/dataprofessor-pandas
✅ Python Data Science Project: https://bit.ly/dataprofessor-python-ds
✅ R Data Science Project: https://bit.ly/dataprofessor-r-ds

⭕ Subscribe:
If you're new here, it would mean the world to me if you would consider subscribing to this channel.
✅ Subscribe: https://www.youtube.com/dataprofessor...

⭕ Recommended Tools:
Kite is a FREE AI-powered coding assistant that will help you code faster and smarter. The Kite plugin integrates with all the top editors and IDEs to give you smart completions and documentation while you’re typing. I've been using Kite and I love it!
✅ Check out Kite: https://www.kite.com/get-kite/?utm_me...

⭕ Recommended Books:
✅ Hands-On Machine Learning with Scikit-Learn : https://amzn.to/3hTKuTt
✅ Data Science from Scratch : https://amzn.to/3fO0JiZ
✅ Python Data Science Handbook : https://amzn.to/37Tvf8n
✅ R for Data Science : https://amzn.to/2YCPcgW
✅ Artificial Intelligence: The Insights You Need from Harvard Business Review: https://amzn.to/33jTdcv
✅ AI Superpowers: China, Silicon Valley, and the New World Order: https://amzn.to/3nghGrd

⭕ Stock photos, graphics and videos used on this channel:
✅ https://1.envato.market/c/2346717/628...

⭕ Follow us:
✅ Medium: http://bit.ly/chanin-medium
✅ Facebook: / dataprofessor
✅ Website: http://dataprofessor.org/ (Under construction)
✅ Twitter: / thedataprof
✅ Instagram: / data.professor
✅ LinkedIn: / chanin-nantasenamat
✅ GitHub 1: https://github.com/dataprofessor/
✅ GitHub 2: https://github.com/chaninlab/

⭕ Disclaimer:
Recommended books and tools are affiliate links that give me a portion of sales at no cost to you, which helps improve this channel's content.

#dataprofessor #machinelearning #datapreprocessing #preprocessing #missingdata #datamissing #cleandata #datacleaning #cleaningdata #preprocessingdata #datascienceproject #learnr #rprogramming #learnrprogramming #datascience #datamining #bigdata #datascienceworkshop #dataminingworkshop #dataminingtutorial #datasciencetutorial #ai #artificialintelligence #r