In this article you can find explanations of statistical concepts such as the statistical hypothesis test, which is used for answering questions about sample data and validating assumptions. In addition, it provides a list of concepts regarding sampling distributions. Finally, we discuss the relationship between variance and bias. Statistical hypothesis testing States
Category: Machine Learning
In this article, you will find basic information about distributions. You are expected to have some knowledge of random variables and probability concepts such as variance, covariance, and expected value. You can find that information in Understanding Basic Statistics for Machine Learning Models – Part 1. What is a
If you want to understand machine learning algorithms, it is very important to have the basic statistical knowledge to understand what is behind them. Understanding how an algorithm operates lets you configure the model according to what you need, as well as explain with more confidence the results obtained
Being a data scientist requires a mix of skills that anyone can develop. You only need patience, time, and a willingness to undergo a process of trial and error. You need to understand businesses and be able to adapt to different situations according to the business's needs. Another important skill
Oftentimes, we are surprised by the accuracy of recommendations on what to buy on Amazon, watch on Netflix, or listen to on Spotify. We feel that somehow these companies know how our brains work and are monetizing this magical guessing game. They have a deep foundation in behavioral sciences, and our job
My team recently faced a brand new challenge: developing a way to classify job positions written in natural language by lots of different people. It sounds simple, but there are a few factors that made this problem hard to solve. Job positions can be ambiguous depending on language usage and
We can make predictions with machine learning by generalizing our data’s pertinent characteristics. Summarizing diverse datasets provides insight that can help produce more relevant generalizations.
Data predictions provide probabilities of future outcomes by mining and analyzing existing data, also called training data. Effective prediction is a mix of engineering, statistics, and intuition. Summarization can help by shaping this intuition. In the generalization phase, we test the model built from our training data against new data, called test data, to determine whether it is good enough to be used in real life. These two processes simplify large multidimensional datasets, so machine learning predictions can be applied to them. This article describes how summarization leads to generalization and then prediction through a real estate example.
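As a rough illustration of the summarization, generalization, and prediction flow described above, here is a minimal Python sketch. It is not the article's own example: the listings are synthetic, and the names (`make_listing`, `predict`), the price formula, and the 80/20 split are all assumptions made up for this sketch.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible


def make_listing():
    """Return one hypothetical (area_m2, price) pair; the numbers are invented."""
    area = random.uniform(40, 200)
    price = 1500 * area + 20000 + random.gauss(0, 15000)
    return area, price


listings = [make_listing() for _ in range(200)]

# Hold out 20% of the listings as test data; the rest is training data.
random.shuffle(listings)
split = int(0.8 * len(listings))
train, test = listings[:split], listings[split:]

# Summarization: reduce the training data to a few descriptive numbers.
train_areas = [area for area, _ in train]
train_prices = [price for _, price in train]
mean_area = mean(train_areas)
mean_price = mean(train_prices)

# Fit a straight line price = intercept + slope * area by least squares.
sxx = sum((area - mean_area) ** 2 for area in train_areas)
sxy = sum((area - mean_area) * (price - mean_price) for area, price in train)
slope = sxy / sxx
intercept = mean_price - slope * mean_area


def predict(area):
    """Predict a price from the area using the fitted line."""
    return intercept + slope * area


# Generalization: compare errors on test data the model has never seen.
# The baseline always guesses the mean training price.
baseline_error = mean(abs(mean_price - price) for _, price in test)
test_error = mean(abs(predict(area) - price) for area, price in test)

print(f"mean training price: {mean_price:,.0f}")
print(f"baseline test error: {baseline_error:,.0f}")
print(f"model test error:    {test_error:,.0f}")
```

If the model's error on the held-out listings is clearly lower than the baseline's, that suggests the relationship learned from the training data carries over to new data. A real project would use actual listings and more than one feature.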
Why we should choose representative samples with error in mind when we build data visualizations. A brief overview of uncertain bar charts and uncertain ranked lists.
The type of data samples that populate our visualizations can add uncertainty to our results. Some common data displays like bar and pie charts work better than others for making that uncertainty understandable. This article explores how to understand our data samples and create the most suitable graphs for visualizing what they represent.
In general, the goals of data science are to understand data and generate predictive models that help us make better decisions. For a more thorough overview of data visualization, see “Data visualization and The Truthful Art.”
Healthcare AI explainability might be the most important dilemma of this century. This article explains why it will define medical outcomes for future generations.
Today’s AI algorithms provide medical recommendations by analyzing big data, but they can’t always give a reason for their conclusions other than the patterns they detect. Even though these AI-recommended solutions can’t be explained in terms of human understanding, many such treatments might improve the quality of patients’ lives and even save lives. This article discusses the controversial topic of medical explainability from a viewpoint that supports applying technological advancements to healthcare.
It’s no secret that the AI revolution has begun. I’m not the only one who believes that AI is making significant changes to our world. These quotes from some of the best-known leaders in science and technology point in the same direction
How to be prepared for the change that will transform the business landscape forever.
Worldwide access to vast amounts of data has changed the business landscape. Competitive marketing depends on knowing how to manage, process, and analyze that data. This article describes the path organizations need to take from collecting data to maximizing its use.
Today’s organizations are undergoing a challenging transformation of their technical systems. The static software platforms that might once have stored and processed a business's data are no longer sustainable in the current web environment. Enterprises need cutting-edge technology to collect big data in real time, analyze that data, and then get the information they need to stay competitive in today’s marketplace.