Rootstrap Blog

Category: Data Demystified


Data Demystified — Data Quality

Explaining conceptually what it really means, and why it matters.

This article outlines a mental framework for organizing our work around Data Quality. In terms of the well-known DIKW Pyramid, data quality is the enabler that lets us turn raw data into information.

In this piece, we’ll go over a few common scenarios, review some theory, and finally outline some advice for anyone facing this increasingly common issue.

The amount of data being generated every second is almost impossible to comprehend. Current estimates say that 294 billion emails and 65 billion WhatsApp messages are sent every single day, and all of it leaves a data trail. The World Economic Forum estimates that the digital universe will reach 44 zettabytes by 2020. To give you an idea of what that means, take a look at the byte prefixes and remember that each one is 1000 times the previous one: kilo, mega, giga, tera, peta, exa, zetta.
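
To make that scale a bit more concrete, here is a minimal Python sketch of the prefix arithmetic. It assumes decimal (SI) prefixes, where each step is a factor of 1000; the helper name `bytes_in` is made up for illustration.

```python
# Decimal (SI) byte prefixes; each one is 1000 times the previous one.
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta"]


def bytes_in(prefix: str, amount: int = 1) -> int:
    """Return how many bytes are in `amount` units of the given prefix.

    Hypothetical helper: kilo maps to 1000**1, zetta to 1000**7.
    """
    power = PREFIXES.index(prefix) + 1
    return amount * 1000 ** power


# The 44 zettabytes figure quoted above, expressed in plain bytes.
digital_universe = bytes_in("zetta", 44)
print(f"{digital_universe:,} bytes")

# The same amount expressed in terabytes, a unit closer to everyday hard drives.
terabytes = digital_universe // bytes_in("tera")
print(f"{terabytes:,} terabytes")
```

In other words, 44 zettabytes is 44 followed by 21 zeros in bytes, or 44 billion one-terabyte drives.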


Data Demystified — DIKW model

Understanding the big picture first will set the stage for success in this journey.

Data is one of the biggest new trends in both tech and business in general. Data “experts” are quickly becoming some of the best-paid individuals in the industry, and every single company wants to surf the wave of data capabilities.

Data is becoming a fundamental way of understanding the world around us. We can think of data science as an epistemology, a way of knowing. We can also think of it as a way of approaching problems and solving them.

But as with any new trend, we have to ask ourselves: what do all these buzzwords actually mean?

What is a data scientist? In short, a person who is better at statistics than any software engineer and better at software engineering than any statistician.
