As machine learning developers, we always need to deal with ETL processing (Extract, Transform, Load) to get data ready for our model. Airflow can help us build ETL pipelines, and visualize the results for each of the tasks in a centralized way. In this blog post, we look at some
If you want to understand machine learning algorithms, it is very important to understand basic statistics and what is behind them. Understanding how the algorithm operates gives you the option of configuring the model according to what you need, as well as explaining with more confidence the results obtained from
In statistics and ML, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone.
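As a rough illustration of that idea, the sketch below combines three base classifiers by majority vote. The labels and the three prediction lists are hypothetical values made up for this example, not results from any real model, and the gain shown depends on the base models making their mistakes on different samples.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label that most base models predicted for one sample."""
    return Counter(votes).most_common(1)[0][0]


def accuracy(predicted, actual):
    """Fraction of samples where the prediction matches the true label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical ground truth and base-model outputs (illustrative only).
y_true = [1, 0, 1, 1, 0, 1]
preds_a = [0, 1, 1, 1, 0, 1]  # wrong on samples 0 and 1
preds_b = [1, 0, 0, 0, 0, 1]  # wrong on samples 2 and 3
preds_c = [1, 0, 1, 1, 1, 0]  # wrong on samples 4 and 5

# Combine the three models sample by sample.
ensemble_preds = [majority_vote(v) for v in zip(preds_a, preds_b, preds_c)]

for name, preds in [("a", preds_a), ("b", preds_b), ("c", preds_c),
                    ("ensemble", ensemble_preds)]:
    print(name, round(accuracy(preds, y_true), 2))
```

Because each hypothetical model errs on a different pair of samples, the other two always outvote it. When base models share the same mistakes, voting gives little or no improvement.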
Simulation emulates real situations so that a system can be studied. It can be applied to any kind of system: banks, supermarkets, biological systems, traffic, etc. It can also be used to generate data with certain characteristics, either to complete missing information, to generate data
Whenever you want to analyze the performance of machine learning algorithms, you need to study the root cause of the error. Concepts like bias and variance help you understand this cause and give you insights on how to improve your model. What is Bias error? Bias error corresponds
Machine learning is one of the hottest topics nowadays. People talk about machine learning as if it were magic. Organizations are racing to integrate machine learning into their functions. Everyone talks about it, but not many people know what it really is. It is just math and statistics plus
Data never arrives perfect. There is always missing information, inconsistent formats, or content that is useless for your analysis. The process of data cleaning consists of the correction and transformation of the values, standardizing all the formats, fixing encoding, removing unnecessary information, splitting columns and extracting relevant
We are currently experiencing the negative impacts of the novel Coronavirus. This virus has quickly changed our lives and has left many of us feeling confused and fearful of what’s to come. As engineers and data scientists, we want to help make sense of the overwhelming amount of data
In this article, you can find explanations of statistical concepts such as statistical hypothesis testing, which is used for answering questions about sample data and validating assumptions. In addition, it provides a list of concepts regarding sampling distributions. Finally, we discuss the relationship between variance and bias. Statistical hypothesis testing States
In this article, you will find basic information about distributions. It assumes that you have some knowledge of random variables and probability concepts such as variance, covariance, and expected value. You can find that information in Understanding Basic Statistics for Machine Learning Models – Part 1. What is a