Those of us who work with data are all too familiar with how difficult it can be to fully understand what we are presented with. Whatever its complexity, data can be hard to digest and make sense of. Data visualization is an effective technique that can
In data science, theory does not always match practice. When working with data, it is not uncommon to face several complex problems. Fortunately, you are not alone: blogs, Slack channels, and other useful resources can come to the rescue. Plenty of problems
As machine learning developers, we regularly deal with ETL (Extract, Transform, Load) processing to get data ready for our models. Airflow can help us build ETL pipelines and visualize the results of each task in a centralized way. In this blog post, we look at some
If you want to understand machine learning algorithms, it is important to understand basic statistics and the principles behind them. Knowing how an algorithm operates lets you configure the model according to what you need and explain with more confidence the results obtained from
In statistics and ML, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone.
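
As a rough illustration of that idea, the sketch below combines three base models by majority vote. The labels and per-model predictions are hypothetical toy data, chosen so that each model errs on a different sample; nothing here is taken from the original post, and majority voting is only one of several ways to build an ensemble.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label that most base models predicted for one sample."""
    return Counter(votes).most_common(1)[0][0]


def accuracy(predicted, truth):
    """Fraction of samples where the prediction matches the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Hypothetical toy data: true labels and the predictions of three base models.
# Each model is wrong on exactly one (different) sample.
truth = [1, 1, 1, 0, 0, 0]
base_predictions = {
    "model_a": [1, 1, 0, 0, 0, 0],
    "model_b": [1, 0, 1, 0, 0, 0],
    "model_c": [0, 1, 1, 0, 0, 0],
}

# Combine the models sample by sample: zip(*...) groups the three votes
# cast for each sample, and majority_vote picks the winning label.
ensemble = [majority_vote(votes) for votes in zip(*base_predictions.values())]

for name, preds in base_predictions.items():
    print(name, accuracy(preds, truth))
print("ensemble", accuracy(ensemble, truth))
```

Because the base models make their mistakes on different samples, the vote corrects each individual error. When the models tend to fail on the same samples, voting helps much less.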
Simulation emulates real-world situations so that a system can be studied. It can be applied to any kind of system: banks, supermarkets, biological systems, traffic, etc. It can also be used to generate data with certain characteristics, either to complete missing information, to generate data
Whenever you want to analyze the performance of machine learning algorithms, you need to study the root cause of the error. Concepts like bias and variance help you understand this cause and give you insight into how to improve your model. What is bias error? Bias error corresponds
Machine learning is one of the hottest topics nowadays. People talk about machine learning as if it were magic. Organizations are racing to integrate machine learning into their functions. Everyone talks about it, but few people know what it really is. It is just math and statistics plus
Data never arrives in perfect shape. There is always missing information, inconsistent formatting, or content that is useless for your analysis. Data cleaning consists of correcting and transforming values, standardizing all the formats, fixing encoding, removing unnecessary information, splitting columns and extracting relevant
We are currently experiencing the negative impacts of the novel coronavirus. This virus has quickly changed our lives and has left many of us feeling confused and fearful of what’s to come. As engineers and data scientists, we want to help make sense of the overwhelming amount of data