Rootstrap Blog

Data Cleaning

Data never arrives in perfect shape. There is always missing information, inconsistent formats, or content that is useless for your analysis. The process of data cleaning consists of correcting and transforming values, standardizing formats, fixing encoding, removing unnecessary information, splitting columns and extracting relevant


Understanding Basic Statistics for Machine Learning Models – Part 3

In this article, you can find explanations of statistical concepts such as statistical hypothesis testing, which is used to answer questions about sample data and validate assumptions. In addition, a list of concepts regarding sampling distributions is provided. Finally, we discuss the relationship between variance and bias. Statistical hypothesis testing States


Understanding Basic Statistics for Machine Learning Models – Part 2

In this article, you will find basic information about distributions. You are expected to have some knowledge of random variables and probability concepts such as variance, covariance, and expected value. You can find that information in Understanding Basic Statistics for Machine Learning Models – Part 1. What is a


Understanding Basic Statistics for Machine Learning Models – Part 1

If you want to understand machine learning algorithms, it is very important to have the basic statistical knowledge needed to see what is behind them. Understanding how an algorithm operates lets you configure the model according to what you need, as well as explain with more confidence the results obtained


Skills For Data Scientists

Being a data scientist requires a mix of skills that anyone can develop. You only need patience, time, and a willingness to undergo a process of trial and error. You need to understand businesses and be able to adapt to different situations according to each business's needs. Another important skill


What is Data Science

Why should we care about Data Science? Nowadays, more and more data is being generated by smartphones, social media, healthcare, banks, stores, online services, governments, sensors, etc. Every piece of information is saved ‘just in case’. As a result, the available data cannot be processed by human brains alone; we need algorithms and
