Data Quality: What Every Business Professional Should Know
The term “data quality” is used extensively by technical professionals working in the data, AI, and information technology industries, but it can be difficult to understand what these professionals mean by it. This matters because businesspeople are increasingly involved in using data, and they need to know they can trust it. To do so, they rely to some extent on technical professionals telling them that the data is of “good quality”. But frustratingly, “good quality” data may not always be usable in a particular business context.
Also, the term “data quality” can refer to many different specific things, which adds to the confusion that businesspeople can feel when they hear about it from the technical folks.
Let’s take a high-level look at what “data quality” can mean.
The Scope of Data Quality
The first thing we need to realize is that when technical folks talk about data quality, the scope of what they are thinking about is limited to:
(a) Data inside computerized environments
(b) Data in environments controlled by the Information Technology (IT) department
(c) Structured data, meaning data represented in rows and columns, like database tables.
It may sound like a wide scope, but it is not. There are plenty of issues with collecting data before it even gets into a computerized system. For instance, we are quite familiar with the confusing Unemployment Rate and Jobs numbers reported by the US government every month. The methodologies used to gather this information are questionable, so the quality of the “raw data” produced is also questionable.
When it comes to activities going on outside of computerized environments, technical folks consider them none of their business, even though the data is affected. These activities are not part of the data supply chain they are involved in.
Not all data is managed in systems controlled by IT. There is plenty of End User Computing (EUC) in every organization, most visibly in Excel spreadsheets. This is generally outside of what technical professionals see as their responsibilities, so they are not involved in any aspect of its quality. Such data may make its way into IT-controlled environments, and after that point (but not before it) technical staff will address quality, but only based on how the data is processed in the IT-controlled environments.
Then we have structured data. Quality in unstructured data, like text, images, audio, and video, has traditionally not been addressed from a technical perspective. So while structured data may have its data quality needs taken care of, unstructured data is left alone, which is a problem since AI is a heavy user of unstructured data.
The Main Activities of Data Quality
“Data Quality” is not just a state of data. The term also refers to a whole subdiscipline of data management. However, this subdiscipline itself consists – confusingly – of wildly different activities which receive wildly different attention from the technical community.
These activities are:
(a) Data Issue Prevention
(b) Data Issue Detection
(c) Data Issue Management
(d) Data Issue Remediation (also known as Data Change Management)
Data Issue Prevention
Data Issue Prevention is of no particular concern to technical staff. If asked about it, they will usually say that it is taken care of by edit validations built into the data entry screens of whatever system the data is being entered into. As such it is nothing special, apart from being an important task in systems development. There is little recognition that people and processes should be examined to determine whether they are causing bad-quality data. This goes back to “data quality” from the technical perspective being about what happens inside computers.
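To make this concrete, the sketch below shows the kind of edit validation technical staff have in mind: field-level checks applied as a record is entered. The field names and rules are invented for illustration, not taken from any particular system.

```python
import re
from datetime import date

def validate_customer_record(record: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the record passes."""
    errors = []
    # Required field check.
    if not record.get("customer_id"):
        errors.append("customer_id is required")
    # Simple format check on the email address.
    email = record.get("email", "")
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email):
        errors.append(f"email looks invalid: {email!r}")
    # Plausibility check: a signup date cannot be in the future.
    signup = record.get("signup_date")
    if isinstance(signup, date) and signup > date.today():
        errors.append("signup_date cannot be in the future")
    return errors

# A record with a bad email is rejected at entry time.
print(validate_customer_record(
    {"customer_id": "C-1001", "email": "not-an-email", "signup_date": date(2024, 5, 1)}
))
```

Note what such checks do not do: they say nothing about whether the person or process supplying the data had the right information in the first place.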
Data Issue Detection
By contrast, Data Issue Detection receives a huge amount of attention from technical staff. This seems to be because a lot of software tools have been built to detect poor-quality data. Technical people tend to focus on technology as being the answer to problems, and these tools are very impressive. The tools also require technical people to run them, so technical staff are highly incentivized to be able to use them.
“Data Quality” tools, or “Data Observability” tools as they are increasingly called today, use two approaches. One is for the tool by itself to try to identify bad data. This usually requires a human to confirm it is bad. The other approach is for specific rules provided by the business to be run by the software to detect data issues. Very often, when technical people speak of “data quality” they are really thinking only of these tools.
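As a rough sketch of these two approaches, assuming nothing about any specific product, here is what business-rule checks and automated anomaly flagging might look like in Python with pandas. The column names, rules, and thresholds are invented for illustration.

```python
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5, 6, 7, 8],
    "amount":   [25.0, 30.0, 27.5, -5.0, 28.0, 24.0, 29.0, 4000.0],
    "country":  ["US", "US", "CA", "US", "ZZ", "CA", "US", "US"],
})

# Approach 1: rules supplied by the business, run mechanically by the tool.
rule_violations = orders[
    (orders["amount"] <= 0) | (~orders["country"].isin(["US", "CA", "MX"]))
]

# Approach 2: the tool flags statistical outliers on its own; a human still
# has to confirm whether the flagged data is genuinely bad.
mean, std = orders["amount"].mean(), orders["amount"].std()
anomalies = orders[(orders["amount"] - mean).abs() > 2 * std]

print("Rule violations:\n", rule_violations)
print("Anomalies for human review:\n", anomalies)
```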
Data Issue Management
Data Issue Management is what happens after a data quality issue is detected. The issue has to be understood and a resolution proposed for it. Some data quality issues are purely technical, and it is up to technical staff to fix them. But many, especially issues with data content, require the business to be involved. When this happens, the activities of Data Issue Management become more people- and process-oriented, and technical folks become more passive, waiting to be asked for information or told to do something.
However, technical staff may get heavily involved if ticketing software is in play. This will happen if data issues are reported via a Help Desk and tickets are opened. From the technicians’ perspective, it all becomes about the administration and management of the tickets at this point, rather than the substance of solving the data issue. Businesspeople can find this very frustrating.
Overall, technical staff do not usually think of Data Issue Management as part of “data quality”.
Data Issue Remediation
Data Issue Remediation is applying the resolution recommended via Data Issue Management. While some fixes can be technical in nature, others are more oriented to people and processes, and technical staff are less involved. Where a technical change is required, it flows through the standard IT procedures used for general changes in IT environments. These activities are not considered “data quality” by technical staff.
One alarming tendency is for technical staff to implement workarounds in technical environments like systems and databases to get around a data issue. The root cause is not addressed. But for technical staff this seems natural if the root cause lies outside any computerized environment they support; it is the only thing they can do. The result is a gradual accumulation of technical fixes whose reasons nobody can remember, and which impede both technical and business change.
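The pattern looks something like the sketch below, with invented branch codes and “fixes”. Each patch makes sense on the day it is written; years later, nobody remembers why it is there.

```python
def load_transaction(record: dict) -> dict:
    """Apply accumulated workarounds while loading a transaction record."""
    # Workaround from years ago: branch "0047" reports amounts in cents,
    # not dollars. The root cause is in that branch's entry process,
    # which is outside any system IT controls.
    if record.get("branch") == "0047":
        record["amount"] = record["amount"] / 100
    # Another workaround: a blank region is assumed to mean "EMEA" for
    # legacy feeds. Whether that assumption still holds is anyone's guess.
    if not record.get("region"):
        record["region"] = "EMEA"
    return record

print(load_transaction({"branch": "0047", "amount": 129900, "region": ""}))
```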
Data Fitness for Use
A final area where business and technical views about “data quality” diverge is fitness for use. Businesspeople have all kinds of use cases for data, and they need data that is suited to each particular use case. So it is natural to ask if the data is of the right quality for the use case. The same data may work well for one use case but not for another. Where the data does not work well, the business often sees it as a “data quality” issue. This can be puzzling to technical staff who may point out that no data quality issues have been detected in the data involved. Once again, the business and technical concepts of “data quality” are not the same.
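A small sketch may make the point clearer. The fields and the 90% completeness threshold below are invented; the idea is simply that the same records satisfy one use case’s requirements and fail another’s.

```python
customers = [
    {"id": 1, "revenue": 120.0, "email": "a@example.com"},
    {"id": 2, "revenue": 85.0,  "email": None},
    {"id": 3, "revenue": 42.0,  "email": None},
]

def fit_for_revenue_report(rows) -> bool:
    # Use case 1: a revenue total only needs a revenue figure on every row.
    return all(r["revenue"] is not None for r in rows)

def fit_for_email_campaign(rows) -> bool:
    # Use case 2: an outreach campaign needs an email on, say, 90% of rows.
    with_email = sum(1 for r in rows if r["email"])
    return with_email / len(rows) >= 0.9

print(fit_for_revenue_report(customers))   # True  - fine for this use case
print(fit_for_email_campaign(customers))   # False - a "quality" problem here
```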
There is a lot more that could be said on this topic, but hopefully this conveys some of the boundaries that technical staff have concerning data quality, and why communication between business and technical people about “data quality” can be difficult.