Why Cloud Computing is Critical for a Data Scientist

Published: 01 October 2024
on channel: 365 Data Science

👉 Sign up for Our Complete Data Science Training with 57% OFF: https://bit.ly/3sJATc9
👉 Download Our Free Data Science Career Guide: https://bit.ly/47Eh6d5

In this Introduction to Probability video, we’ll talk about the Student’s T Distribution and its characteristics.

For starters, we use the lower-case letter “t” to denote a Student’s T distribution, followed by a single parameter in parentheses, called “degrees of freedom”.

As we mentioned in the last video, it is a small-sample-size approximation of a Normal distribution. In instances where we would assume a Normal distribution were it not for the limited number of observations, we use the Student’s T distribution.

For instance, the average lap times across an entire Formula 1 season follow a Normal distribution, but the lap times for the first lap of the Monaco Grand Prix would follow a Student’s T distribution.

Now, the curve of the Student’s T distribution is also bell-shaped and symmetric. However, it has fatter tails to accommodate the occurrence of values far away from the mean. That is because if such a value appears in our limited data, it represents a bigger part of the total.
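To make the “fatter tails” point concrete, here is a minimal sketch that compares the density of a Student’s T distribution with that of a standard Normal distribution at a point three units from the mean. It uses the standard textbook density formulas and only Python’s standard library. The function names and the chosen degrees of freedom are illustrative assumptions, not something taken from the video.

```python
import math


def normal_pdf(x):
    """Density of the standard Normal distribution (mean 0, variance 1) at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def student_t_pdf(x, df):
    """Density of the Student's T distribution with df degrees of freedom at x."""
    # Work on the log scale so large df values do not overflow the gamma function.
    log_coef = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


# Illustrative degrees of freedom: the fewer there are, the heavier the tail.
for df in (2, 5, 30):
    ratio = student_t_pdf(3.0, df) / normal_pdf(3.0)
    print(f"df={df}: t density at x=3 is {ratio:.1f}x the Normal density")
```

With few degrees of freedom the ratio is well above 1, and it shrinks toward 1 as the degrees of freedom grow, which is the sense in which the Student’s T distribution approaches the Normal one.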

Another key difference between the Student’s T distribution and the Normal one is that apart from the mean and variance, we must also define the degrees of freedom for the distribution…

Why is cloud computing critical for data scientists? If small companies want to level the playing field, cloud computing is critical for their data science teams.

To understand the advantages cloud computing provides when it comes to data science, let’s imagine a world with as much data as we have today, but without servers. In such an unfortunate scenario, firms would need databases that run locally, right?

So, every time you, as a data scientist, want to run a new analysis or refresh an existing algorithm, you’d have to transfer information from the central database to your machine and then operate locally. This unfortunate world would have several main drawbacks...

For example, manual intervention would be necessary to retrieve data. Your machine would become a single point of failure for the analyses you have worked on locally. Processing speed would be capped by the computing power of your computer. Chances are you would be able to work with only a limited amount of data, given the limited computing resources at your disposal. Moreover, under this setup, you wouldn’t be able to leverage real-time data to build recommender systems or any other machine learning algorithms that require ‘live’ data.

Doesn’t sound like the perfect scenario, does it? Well, that’s why we invented servers. And then these servers had drawbacks of their own.

Fortunately, we now have clouds. They overshadow local servers in almost every conceivable aspect. In fact, data scientists should be focused on developing great algorithms, testing hypotheses, and taking advantage of all available data, without having to wait hours to see the results of their tests and certainly without having to worry about how much memory they have left on their computer. And yes, sometimes data scientists do end up waiting long hours for an algorithm to train, but with a cloud, they have the option to pay more and get the job done faster. That’s yet another advantage of cloud computing over servers.
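As a rough illustration of the “pay more and get the job done faster” trade-off, the sketch below estimates training time and cost when a job is spread across several cloud machines. It assumes a simple Amdahl’s-law-style model in which only part of the job can run in parallel. The function name, the hourly rate, and the 90% parallel fraction are all hypothetical and do not reflect any provider’s pricing.

```python
def estimate_training(base_hours, workers, hourly_rate, parallel_fraction=0.9):
    """Return (hours, cost) for a job that takes base_hours on one machine.

    Assumption: parallel_fraction of the work splits evenly across workers,
    and the remainder runs serially while all workers are still billed.
    """
    serial_hours = base_hours * (1 - parallel_fraction)
    parallel_hours = base_hours * parallel_fraction / workers
    hours = serial_hours + parallel_hours
    cost = hours * workers * hourly_rate
    return hours, cost


# Hypothetical numbers: a 10-hour job at 2.0 currency units per machine-hour.
for workers in (1, 3, 9):
    hours, cost = estimate_training(10, workers, hourly_rate=2.0)
    print(f"{workers} machine(s): about {hours:.1f} h, cost about {cost:.1f}")
```

Under these assumptions, adding machines shortens the wait but raises the total bill, because the serial part of the job does not speed up.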

► Consider hitting the SUBSCRIBE button if you LIKE the content: https://www.youtube.com/c/365DataScie...

► VISIT our website: https://bit.ly/365ds

🤝 Connect with us on LinkedIn: / 365datascience

365 Data Science is an online educational career website that offers the incredible opportunity to find your way into the data science world no matter your previous knowledge and experience. We have prepared numerous courses that suit the needs of aspiring BI analysts, Data analysts and Data scientists.

We at 365 Data Science are committed educators who believe that curiosity should not be hindered by inability to access good learning resources. This is why we focus all our efforts on creating high-quality educational content which anyone can access online.

Check out our Data Science Career guides:    • How to Become a... (Data and AI Caree...  

#CloudComputing #DataScientist #DataScience