<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Anchoring Data]]></title><description><![CDATA[This publication is dedicated to providing relevant information on data. Let's stay grounded in the ever-changing world of data.
]]></description><link>https://anchoringdata.com</link><image><url>https://substackcdn.com/image/fetch/$s_!WExa!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F736d30ff-6b2c-4ee5-8797-749f5226e152_312x312.png</url><title>Anchoring Data</title><link>https://anchoringdata.com</link></image><generator>Substack</generator><lastBuildDate>Sat, 04 Apr 2026 10:27:06 GMT</lastBuildDate><atom:link href="https://anchoringdata.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Malcolm Chisholm]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[malcolmchisholm2@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[malcolmchisholm2@substack.com]]></itunes:email><itunes:name><![CDATA[Malcolm Chisholm]]></itunes:name></itunes:owner><itunes:author><![CDATA[Malcolm Chisholm]]></itunes:author><googleplay:owner><![CDATA[malcolmchisholm2@substack.com]]></googleplay:owner><googleplay:email><![CDATA[malcolmchisholm2@substack.com]]></googleplay:email><googleplay:author><![CDATA[Malcolm Chisholm]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Your AI Has Instructions. Who’s Governing Them? 
(Part 2) ]]></title><description><![CDATA[This article is a conclusion to the one published last week that introduced the need for governance of the instructions given to AI.]]></description><link>https://anchoringdata.com/p/your-ai-has-instructions-whos-governing-3aa</link><guid isPermaLink="false">https://anchoringdata.com/p/your-ai-has-instructions-whos-governing-3aa</guid><dc:creator><![CDATA[Malcolm Chisholm]]></dc:creator><pubDate>Fri, 06 Mar 2026 12:03:05 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!p8Vy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4658cb5-ecbc-49be-b376-2af8cbb95ebd_1200x675.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>See <strong><a href="https://anchoringdata.com/p/your-ai-has-instructions-whos-governing">Your AI Has Instructions. Who&#8217;s Governing Them? </a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!p8Vy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4658cb5-ecbc-49be-b376-2af8cbb95ebd_1200x675.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!p8Vy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4658cb5-ecbc-49be-b376-2af8cbb95ebd_1200x675.png 424w, https://substackcdn.com/image/fetch/$s_!p8Vy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4658cb5-ecbc-49be-b376-2af8cbb95ebd_1200x675.png 848w, 
https://substackcdn.com/image/fetch/$s_!p8Vy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4658cb5-ecbc-49be-b376-2af8cbb95ebd_1200x675.png 1272w, https://substackcdn.com/image/fetch/$s_!p8Vy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4658cb5-ecbc-49be-b376-2af8cbb95ebd_1200x675.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!p8Vy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4658cb5-ecbc-49be-b376-2af8cbb95ebd_1200x675.png" width="604" height="339.75" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a4658cb5-ecbc-49be-b376-2af8cbb95ebd_1200x675.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:675,&quot;width&quot;:1200,&quot;resizeWidth&quot;:604,&quot;bytes&quot;:229237,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/190053304?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4658cb5-ecbc-49be-b376-2af8cbb95ebd_1200x675.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!p8Vy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4658cb5-ecbc-49be-b376-2af8cbb95ebd_1200x675.png 424w, 
https://substackcdn.com/image/fetch/$s_!p8Vy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4658cb5-ecbc-49be-b376-2af8cbb95ebd_1200x675.png 848w, https://substackcdn.com/image/fetch/$s_!p8Vy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4658cb5-ecbc-49be-b376-2af8cbb95ebd_1200x675.png 1272w, https://substackcdn.com/image/fetch/$s_!p8Vy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4658cb5-ecbc-49be-b376-2af8cbb95ebd_1200x675.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For a long time, AI Governance has focused on tools and organizational roles. These are important, but the &#8220;micro&#8221; level of what is actually driving AI solutions to behave the way they do has received relatively little attention. That has to change: system prompts that control AI behavior must be fully governed, not treated as technical artifacts left to technical staff to sort out.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Anchoring Data! This is a publication dedicated to shining a light on knowledge about what data is and how it is managed. <em>Please subscribe and join me in this data journey.</em></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h4>Why This Matters Now</h4><p>Three forces are converging to make system prompt governance urgent:&nbsp;</p><p><strong>Scale: </strong>Organizations are moving from a handful of AI experiments to dozens or hundreds of AI-powered systems. At that scale, ad hoc management breaks down completely, and small risks become significant.</p><p><strong>Regulation:</strong> The EU AI Act, emerging U.S. state legislation, and industry-specific regulations are increasingly requiring organizations to demonstrate oversight and accountability for AI system behavior. 
&#8220;We didn&#8217;t know what was in the prompt&#8221; will not be an acceptable defense.</p><p><strong>Complexity: </strong>Modern AI systems don&#8217;t just respond to users; they use tools, access databases, trigger workflows, pay for goods and services, and make decisions. The system prompt governing such an agent is not a paragraph of text; it&#8217;s an operational control document. Treating it casually is like treating the operating manual for a nuclear reactor casually.</p><h4>What You Can Do This Week</h4><p>If you&#8217;re a data governance professional, an AI leader, or anyone responsible for AI systems in your organization, here are four concrete steps:</p><ol><li><p>Understand. Learn what system prompts are and why they are so important. Provide awareness and education on them. Develop principles for their governance.</p></li><li><p>Take inventory. How many system prompts does your organization have in production? Who wrote them? When were they last reviewed? Most organizations cannot answer these questions, and that alone will be alarming when exposed. It will provide more than enough justification for investment in governance.</p></li><li><p>Pick your highest-risk prompt. Choose the one governing the AI system that touches the most customers, handles the most sensitive data, or operates in the most regulated context. Assess it against the seven pillars introduced in Part 1. The gaps you find will be instructive (at a minimum).</p></li><li><p>Start the conversation. Share this article with your data governance council, your AI/ML team, or your risk management colleagues. The most important step is getting the right people in the room and recognizing that system prompts are enterprise assets that deserve even more governance rigor than we apply to data.</p></li></ol><h4>Where We Go from Here</h4><p>A lot more thought needs to go into system prompt governance. 
There seems to be a spectrum of different contexts that need to be woven into system prompts. At the highest level every organization needs a Constitution that outlines its core principles for AI that cannot be sidestepped. These are absolutes. Then there are convictions that operate in particular contexts, and then requirements for what an AI application is to do, and finally preferences about how it might operate, but preferences that allow some leeway. It is more complex than this, but such a schema is a useful way of thinking about how to govern system prompts.</p><p>I&#8217;ll be writing more about System Prompt Governance in upcoming editions of&nbsp;<strong>Anchoring Data</strong>,&nbsp;including detailed breakdowns of each pillar, real-world assessment templates, and practical implementation guides. If this resonates with you, I&#8217;d love to hear your perspective &#8212; where does your organization stand on prompt governance?</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/p/your-ai-has-instructions-whos-governing-3aa/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://anchoringdata.com/p/your-ai-has-instructions-whos-governing-3aa/comments"><span>Leave a comment</span></a></p><p>If you&#8217;d like to learn more about my background and what led me to write this publication, I invite you to click below to learn more about my journey.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/p/the-authors-data-journey?r=4733dz&amp;utm_campaign=post&amp;utm_medium=web&amp;triedRedirect=true&quot;,&quot;text&quot;:&quot;About The Author&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" 
href="https://anchoringdata.com/p/the-authors-data-journey?r=4733dz&amp;utm_campaign=post&amp;utm_medium=web&amp;triedRedirect=true"><span>About The Author</span></a></p><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[Your AI Has Instructions. Who’s Governing Them? (Part 1)]]></title><description><![CDATA[Why System Prompt Governance Is the Next Frontier &#8212; and Why Data Governance Professionals Should Lead It]]></description><link>https://anchoringdata.com/p/your-ai-has-instructions-whos-governing</link><guid isPermaLink="false">https://anchoringdata.com/p/your-ai-has-instructions-whos-governing</guid><dc:creator><![CDATA[Malcolm Chisholm]]></dc:creator><pubDate>Thu, 26 Feb 2026 12:02:42 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!jfuM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7431f54b-3f1e-49ef-8996-b0f6186c105a_1800x1074.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jfuM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7431f54b-3f1e-49ef-8996-b0f6186c105a_1800x1074.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jfuM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7431f54b-3f1e-49ef-8996-b0f6186c105a_1800x1074.png 424w, https://substackcdn.com/image/fetch/$s_!jfuM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7431f54b-3f1e-49ef-8996-b0f6186c105a_1800x1074.png 848w, 
https://substackcdn.com/image/fetch/$s_!jfuM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7431f54b-3f1e-49ef-8996-b0f6186c105a_1800x1074.png 1272w, https://substackcdn.com/image/fetch/$s_!jfuM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7431f54b-3f1e-49ef-8996-b0f6186c105a_1800x1074.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jfuM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7431f54b-3f1e-49ef-8996-b0f6186c105a_1800x1074.png" width="636" height="379.59065934065933" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7431f54b-3f1e-49ef-8996-b0f6186c105a_1800x1074.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f54cd8ff-fe14-48a9-95f7-6ad92d3fedc0_1800x1074.jpeg&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:869,&quot;width&quot;:1456,&quot;resizeWidth&quot;:636,&quot;bytes&quot;:155034,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/189201410?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff54cd8ff-fe14-48a9-95f7-6ad92d3fedc0_1800x1074.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jfuM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7431f54b-3f1e-49ef-8996-b0f6186c105a_1800x1074.png 424w, 
https://substackcdn.com/image/fetch/$s_!jfuM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7431f54b-3f1e-49ef-8996-b0f6186c105a_1800x1074.png 848w, https://substackcdn.com/image/fetch/$s_!jfuM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7431f54b-3f1e-49ef-8996-b0f6186c105a_1800x1074.png 1272w, https://substackcdn.com/image/fetch/$s_!jfuM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7431f54b-3f1e-49ef-8996-b0f6186c105a_1800x1074.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Every AI system you interact with &#8212; every chatbot, every copilot, every autonomous agent &#8212; operates under a set of instructions called a system prompt. This prompt tells the AI what it is, what it should do, how it should behave, what it must avoid, and how it should handle situations it wasn&#8217;t explicitly designed for.</p><p>System prompts are, in effect, executable policy documents for an AI system. They constrain the behavior of the AI system, just as real policies constrain business behavior. They are also the bridge between what an organization wants its AI to do and what the AI actually does.</p><p>And right now, almost nobody is governing them.</p><h4>More Than &#8220;Yet Another Ungoverned Asset&#8221;</h4><p>If you&#8217;ve spent time in data governance, this situation will feel eerily familiar.</p><p>Twenty years ago, most organizations had data everywhere (in databases, spreadsheets, reports, and applications), and almost no one was asking fundamental questions: Who owns this data? What does this data element actually mean? How do we know it is accurate? What happens when it changes? Who approved this definition?</p><p>Back then, everyone knew data was important. But it was treated more as a byproduct of automation than as an asset requiring proactive management.</p><p>That is where we are with system prompts today. Overwhelmingly, they are being written by developers or other technical staff, deployed into production, and left to run with no formal ownership, no lifecycle management, no quality standards, no version control discipline, and no systematic handling of edge cases. 
Most organizations probably do not even know how many system prompts they have in production, let alone whether those prompts are consistent with each other or aligned with business objectives, or use cases, or enterprise AI policies.</p><p>This is system prompt sprawl, and it carries real consequences.</p><h4>What Can Go Wrong (and Already Has)</h4><p>The risks of ungoverned system prompts are not theoretical. 
Some examples:</p><p>&#8226; AI systems making unauthorized promises to customers because the prompt did not explicitly prohibit commitments.</p><p>&#8226; Chatbots providing regulated advice (financial, medical, legal) without appropriate disclaimers, because nobody with compliance expertise reviewed the prompt.</p><p>&#8226; Prompt injection attacks that manipulated AI systems into revealing internal instructions or bypassing safety controls, because security boundaries were not included.</p><p>&#8226; Inconsistent customer experiences across channels because different engineering teams wrote different prompts for similar use cases with no coordination.</p><p>&#8226; Users being insulted by AI systems because system prompts lacked proper conversational guardrails.</p><p>&#8226; AI systems that &#8220;hallucinated&#8221; confidently about topics they should have refused to address, because the prompt never defined what was out of scope.</p><p>Each of these failures traces back to the same root cause: the system prompt was not governed.</p><h4>What Does &#8220;Governing&#8221; a System Prompt Actually Mean?</h4><p>This is the critical question, and it&#8217;s where I think many people&#8217;s thinking stops too early.</p><p>When people hear &#8220;prompt governance,&#8221; they tend to think of security &#8212; preventing jailbreaks and injection attacks. Security is important, but, just as in Data Governance, it is only one dimension. Genuine prompt governance is much broader. I&#8217;ve developed what I&#8217;m calling the <em>System Prompt Governance Framework (SPGF)</em>, organized around seven pillars:</p><p><strong>1. Definition and Specification: </strong>Before writing a single word of prompt text, the author needs a clear articulation of what the prompt must accomplish, for whom, and why. 
This means a formal specification document &#8212; much like a business requirements document &#8212; that captures purpose, target users, functional requirements, success criteria, and critically, precise definitions of key terms. When your prompt says the AI should be &#8220;helpful&#8221; or &#8220;professional&#8221; or &#8220;brief,&#8221; what do those words actually mean in operational terms? Data governance professionals have been fighting this battle around business glossaries for years. The same discipline applies here.</p><p><strong>2. Use Case Alignment: </strong>Every prompt exists to serve a specific use case. The prompt&#8217;s instructions, tone, capabilities, and constraints must be precisely calibrated to the context &#8212; not generically written and deployed across different situations. A customer service chatbot for a bank has fundamentally different requirements than an internal knowledge assistant for an engineering team.</p><p><strong>3. Guardrails and Boundaries: </strong>This encompasses content boundaries (what the AI must not discuss), behavioral boundaries (actions it must not take), regulatory compliance, ethical constraints, and security defenses. Guardrails are not about making the AI less useful &#8211; they are about defining the operating envelope within which the AI can be maximally useful without creating risk.</p><p><strong>4. Edge Case and Exception Handling: </strong>Most prompts are written for the happy path. What happens when a user asks about something outside the AI&#8217;s scope? When input is ambiguous? When someone deliberately tries to manipulate the system? When an external tool the AI relies on fails? When a user is in emotional distress? These situations require explicit instructions, not improvisation by the AI.</p><p><strong>5. Lifecycle Management: </strong>System prompts are living documents. 
They need version control, change management, testing protocols, deployment processes, monitoring for behavioral drift, and eventually, retirement. A prompt that was appropriate six months ago may no longer be aligned with current business requirements, regulatory changes, or the capabilities of the underlying model.</p><p><strong>6. Roles and Accountability:</strong> Who owns each prompt? Who is responsible for its quality? Who performs the lifecycle management? Who reviews changes? Who audits the portfolio? Without named accountability, governance is just documentation that nobody reads.</p><p><strong>7. Enterprise Level:</strong> While there is plenty of governance needed at the individual prompt level, there is also an enterprise level that needs to be put in place. An overall policy for system prompt governance is needed. Education is required, so staff do not confuse system prompt governance with prompt engineering. And inconsistencies between different system prompts cannot be allowed &#8211; remember, they are like mini-policies. This requires centralized management of a portfolio or catalog of system prompts.</p><h4>Data Governance Principles Apply Directly</h4><p>Here&#8217;s what excites me most about this space: we don&#8217;t need to invent governance from scratch. The principles that data governance professionals have refined over decades map almost perfectly to system prompt governance. 
Let&#8217;s compare them:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eac2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ff0fc8-472f-4287-bf10-bc0b77802ce9_1226x568.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eac2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ff0fc8-472f-4287-bf10-bc0b77802ce9_1226x568.png 424w, https://substackcdn.com/image/fetch/$s_!eac2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ff0fc8-472f-4287-bf10-bc0b77802ce9_1226x568.png 848w, https://substackcdn.com/image/fetch/$s_!eac2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ff0fc8-472f-4287-bf10-bc0b77802ce9_1226x568.png 1272w, https://substackcdn.com/image/fetch/$s_!eac2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ff0fc8-472f-4287-bf10-bc0b77802ce9_1226x568.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eac2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ff0fc8-472f-4287-bf10-bc0b77802ce9_1226x568.png" width="618" height="286.31647634584016" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b1ff0fc8-472f-4287-bf10-bc0b77802ce9_1226x568.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:568,&quot;width&quot;:1226,&quot;resizeWidth&quot;:618,&quot;bytes&quot;:68211,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/189201410?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ff0fc8-472f-4287-bf10-bc0b77802ce9_1226x568.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eac2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ff0fc8-472f-4287-bf10-bc0b77802ce9_1226x568.png 424w, https://substackcdn.com/image/fetch/$s_!eac2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ff0fc8-472f-4287-bf10-bc0b77802ce9_1226x568.png 848w, https://substackcdn.com/image/fetch/$s_!eac2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ff0fc8-472f-4287-bf10-bc0b77802ce9_1226x568.png 1272w, https://substackcdn.com/image/fetch/$s_!eac2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ff0fc8-472f-4287-bf10-bc0b77802ce9_1226x568.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The domain of system prompts is new, but much of what is required at least rhymes with what data governance has done in the past. Yes, some aspects are genuinely new, but we at least have a start.</p><h4>A Maturity Model for Where You Stand</h4><p>I&#8217;ve structured a five-level maturity model for prompt governance:</p><p><em>Level 1 &#8212; Ad Hoc:</em> Prompts are written by individual developers with no standards or documentation. This is where most organizations are today.</p><p><em>Level 2 &#8212; Aware: </em>Basic documentation exists for some prompts. People recognize that prompt quality matters, but processes are informal. Some prompts are shared for best practices.</p><p><em>Level 3 &#8212; Defined: </em>Formal governance processes are established. All seven pillars are addressed. Roles are assigned. 
Lifecycle management is operational.</p><p><em>Level 4 &#8212; Managed:</em> Prompt quality is measured quantitatively. Testing is systematic. The prompt portfolio is managed for cross-prompt consistency.</p><p><em>Level 5 &#8212; Optimizing: </em>Continuous improvement driven by data. Prompt governance is embedded in organizational culture and contributes measurably to AI performance.</p><p>The enterprise level of prompt governance is fully established. Most organizations would honestly assess themselves at Level 1, with some pockets of Level 2. The gap between where they are and where they need to be represents both a risk and an opportunity.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/p/your-ai-has-instructions-whos-governing?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://anchoringdata.com/p/your-ai-has-instructions-whos-governing?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>If you&#8217;d like to learn more about my background and what led me to write this publication, I invite you to click below to learn more about my journey.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/p/the-authors-data-journey?r=4733dz&amp;utm_campaign=post&amp;utm_medium=web&amp;triedRedirect=true&quot;,&quot;text&quot;:&quot;About The Author&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://anchoringdata.com/p/the-authors-data-journey?r=4733dz&amp;utm_campaign=post&amp;utm_medium=web&amp;triedRedirect=true"><span>About The Author</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[A Business Understanding of Data 
Policies]]></title><description><![CDATA[Policies are the laws of an organization. They are not detailed rules specifying how to do things, but provide a framework of what must be done.]]></description><link>https://anchoringdata.com/p/a-business-understanding-of-data-42e</link><guid isPermaLink="false">https://anchoringdata.com/p/a-business-understanding-of-data-42e</guid><dc:creator><![CDATA[Malcolm Chisholm]]></dc:creator><pubDate>Sat, 07 Feb 2026 13:30:12 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!W8Z-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a30157f-0dd1-4aff-ae00-7359ffa7c83d_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!W8Z-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a30157f-0dd1-4aff-ae00-7359ffa7c83d_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!W8Z-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a30157f-0dd1-4aff-ae00-7359ffa7c83d_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!W8Z-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a30157f-0dd1-4aff-ae00-7359ffa7c83d_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!W8Z-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a30157f-0dd1-4aff-ae00-7359ffa7c83d_1536x1024.png 1272w, 
https://substackcdn.com/image/fetch/$s_!W8Z-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a30157f-0dd1-4aff-ae00-7359ffa7c83d_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!W8Z-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a30157f-0dd1-4aff-ae00-7359ffa7c83d_1536x1024.png" width="546" height="364.125" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0a30157f-0dd1-4aff-ae00-7359ffa7c83d_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:546,&quot;bytes&quot;:2773363,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/187142361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a30157f-0dd1-4aff-ae00-7359ffa7c83d_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!W8Z-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a30157f-0dd1-4aff-ae00-7359ffa7c83d_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!W8Z-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a30157f-0dd1-4aff-ae00-7359ffa7c83d_1536x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!W8Z-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a30157f-0dd1-4aff-ae00-7359ffa7c83d_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!W8Z-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a30157f-0dd1-4aff-ae00-7359ffa7c83d_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Data has been important for decades, and this is accelerating in the age of AI because data is the fuel that drives 
AI.</p><p>While a lot of heads would nod in agreement, the reality is that data is not managed as well as it should be. We can easily find examples of poor data quality, duplicate copies of data that incrementally add to storage costs, data that is not correctly understood and interpreted, personal data that is mishandled, and so on.</p><p>This is where policies around data come in. Without data policies, staff manage data only if they realize they need to, and in ad hoc ways that vary widely. Perhaps some of these practices are good, but often they are not, and many times required practices are simply absent. This is not to blame anyone; it is just the way things have evolved.</p><h4>Technology is Not the Answer</h4><p>Yet many organizations simply do not develop data policies, or develop only those absolutely needed for compliance.</p><p>In part this seems to be due to an expectation that the technology housing and processing data will deal with all these management needs. But humans interact with data frequently. We get to decide what data goes into computerized systems, what the data means, how it is input, how it is processed, how it needs to be secured, what we do with the outputs of these systems, how data is eventually disposed of, and many other tasks. In other words, there is a great deal of primarily human behavior around data that is only indirectly related to technology.</p><p>Perhaps these behaviors are sufficient for our private lives, but in the organizations we work for it is a different matter. All modern enterprises are highly reliant on data, and staff in one part of the organization are often reliant on data from another part. This complexity increases the need for good data management by all staff interacting with the data.</p><p>It is the human aspect that drives the need for policies. Data policies seek to improve human behaviors around data in organizations so that a reasonable level of good data management practices is reached. 
This cannot be done with technology because the human interactions are never going to be replaced by automation. People are always going to interact with data.</p><h4>The Weakness of Centralized Data Governance</h4><p>Another feature of data is that it is everywhere in every organization. This is recognized by executives who have invested resources and authority in setting up centralized Data Governance offices and the like.</p><p>But such resources do not scale to the problem. A Data Governance office will likely have a handful of full-time employees, ranging to a few dozen in a very large organization. Compare this to thousands or tens of thousands of employees in these organizations who are interacting with data, and who need guidance to avoid creating &#8220;data messes&#8221;.</p><h4>What Is a Policy?</h4><p>This is where data policies come in. Data policies can drive the best behaviors of staff when they interact with data. Policies are the laws of an organization. They are not detailed rules specifying how to do things, but provide a framework of what must be done. We can define a policy as:</p><p><em>A high-level imperative that controls business behavior. It supports one or more principles. A policy specifies what to do, but not how to do it. A policy is enforceable and enforced.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Anchoring Data! This is a publication dedicated to shining a light on knowledge about what data is and how it is managed. 
<em>Please subscribe and join me in this data journey.</em></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h4>Breaking Down the Policy Definition</h4><p>My definition of policy contains several key points:</p><p>1. A policy is an imperative &#8211; a command &#8211; that tells people to do something: This does not mean it is written in harsh language, but it must be followed.</p><p>2. A policy controls business behavior: That is, enterprise staff are going to have to do something to be in compliance with the policy. Automation may be involved, but the human element is primary.</p><p>3. A policy is aligned to principles: Principles are logically prior to policies. If we have a policy that does not align to any of our stated principles, then we have de facto principles that we are not articulating, but upon which the policy is based. If we have principles that are not related to any policies, then we may suspect that we are not really serious about these principles (although some principles may genuinely have no policies related to them).</p><p>4. A policy must not tell people how to do something: It cannot anticipate every situation and circumstance. Instead, it must specify what must be done and leave it up to the readers to figure out how to implement the policy. It is possible to specify courses of action, but these are practices, standards, and procedures.</p><p>5. A policy is enforceable: It is not a theoretical document that is put somewhere for people to read if they happen to be interested. It is not a set of suggestions. 
Moreover, there is a mechanism by which the policy can be enforced, and this mechanism is put in place before the policy is released to the enterprise.</p><p>6. A policy is enforced: That is, the mechanism that could be used for enforcement actually is used for enforcement. This point is often under-appreciated. Enforcement requires action. People who just want to write a policy and be done with it have to face up to the fact that they will participate in enforcement. Enforcement does not usually mean punishment (though it sometimes may). Rather, it is detecting out-of-compliance conditions and then working with the areas where these are found to fix the problem. To some extent it is more like providing support &#8211; except the context is a mandatory one. Any support requires resources, and the Data Governance units issuing data policies must ensure they have the necessary resources to support their policies.</p><h4>What A Data Policy is Not</h4><p>A data policy cannot be any of the following:</p><ul><li><p><strong>Educational Material:</strong> A policy does not seek to educate, that is, explain the concepts involved. Of course, data governance and data management are complex areas and some kind of education may be required for staff to understand a data policy. But this education must not be part of the data policy. There is nothing imperative in education and it will confuse the reader if it is included in the policy.</p></li><li><p><strong>Training:</strong> This too cannot be part of a data policy. It will be too much like a procedure, and will appear to prescribe how to implement the policy.</p></li><li><p><strong>Guidelines:</strong> These are optional advice. Policies are not optional.</p></li><li><p><strong>Best Practices:</strong> These are lessons from outside the enterprise that have been documented as being successful. Despite this, they are often not specific enough for a particular organization. 
Some people like to try to identify them and adopt them so they do not have to do much thinking. But they are not policies.</p></li></ul><p>Very confusingly, the word &#8220;policy&#8221; is used in a completely different way within IT in the areas that deal with access control. Here &#8220;policy&#8221; means a low-level, specific rule for permitting access. As we have seen, policies are not detailed rules. The unfortunate result is that when IT professionals and the business discuss data policies they typically mean very different things and the conversations can be incredibly confusing.</p><h4>Data Policies and the Business</h4><p>Data policies seek to drive good behaviors around data in organizations where there may be very few Data Governance professionals. This does not mean businesspeople should be passive when it comes to data policies. Businesspeople can and should suggest where there are needs for data policies. These needs may not be obvious to a central Data Governance organization. Also, businesspeople should push back if a data policy is unreasonable or simply impossible to implement. Of course, this must be done only in more extreme circumstances, and not merely because a policy is inconvenient in some way. Businesspeople should also ask to be consulted and informed about proposed new data policies or changes to existing policies. 
In short, the business should become as engaged as possible in the management of the data policies which will ultimately influence the way the business works.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share Anchoring Data&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://anchoringdata.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share Anchoring Data</span></a></p><p>If you&#8217;d like to learn more about my background and what led me to write this publication, I invite you to click below to learn more about my journey. </p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/p/the-authors-data-journey?r=4733dz&amp;utm_campaign=post&amp;utm_medium=web&amp;triedRedirect=true&quot;,&quot;text&quot;:&quot;About The Author&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://anchoringdata.com/p/the-authors-data-journey?r=4733dz&amp;utm_campaign=post&amp;utm_medium=web&amp;triedRedirect=true"><span>About The Author</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ubeG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff80bad06-c934-4f9b-92ef-d414b4827df2_3780x1890.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ubeG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff80bad06-c934-4f9b-92ef-d414b4827df2_3780x1890.png 424w, 
https://substackcdn.com/image/fetch/$s_!ubeG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff80bad06-c934-4f9b-92ef-d414b4827df2_3780x1890.png 848w, https://substackcdn.com/image/fetch/$s_!ubeG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff80bad06-c934-4f9b-92ef-d414b4827df2_3780x1890.png 1272w, https://substackcdn.com/image/fetch/$s_!ubeG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff80bad06-c934-4f9b-92ef-d414b4827df2_3780x1890.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ubeG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff80bad06-c934-4f9b-92ef-d414b4827df2_3780x1890.png" width="1456" height="728" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f80bad06-c934-4f9b-92ef-d414b4827df2_3780x1890.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:728,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1426964,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/187142361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff80bad06-c934-4f9b-92ef-d414b4827df2_3780x1890.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!ubeG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff80bad06-c934-4f9b-92ef-d414b4827df2_3780x1890.png 424w, https://substackcdn.com/image/fetch/$s_!ubeG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff80bad06-c934-4f9b-92ef-d414b4827df2_3780x1890.png 848w, https://substackcdn.com/image/fetch/$s_!ubeG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff80bad06-c934-4f9b-92ef-d414b4827df2_3780x1890.png 1272w, https://substackcdn.com/image/fetch/$s_!ubeG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff80bad06-c934-4f9b-92ef-d414b4827df2_3780x1890.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.amazon.com/Successful-Sustainable-Data-Policies-Governance/dp/1634626095/ref=sr_1_1?crid=18GT02UQXUTF5&amp;dib=eyJ2IjoiMSJ9.E6l49KPSL3WUyOq0weLv_CKc2SjXN-bnewfWCXlfWFjBRta9NBqyQCYeAbPNmHTtFVeCUWrzU7QswB2Yy0G8Wa3Pew1T4laTHxuvQskEqvk.4Kfp7CR8JL34uLCp_En13H4SHmSZCSNe_pSqoUBgBvI&amp;dib_tag=se&amp;keywords=successful+and+sustainable+data+policies&amp;qid=1770417969&amp;sprefix=succesful+and+sustainable+data+policie%2Caps%2C210&amp;sr=8-1&quot;,&quot;text&quot;:&quot;Get The Book&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.amazon.com/Successful-Sustainable-Data-Policies-Governance/dp/1634626095/ref=sr_1_1?crid=18GT02UQXUTF5&amp;dib=eyJ2IjoiMSJ9.E6l49KPSL3WUyOq0weLv_CKc2SjXN-bnewfWCXlfWFjBRta9NBqyQCYeAbPNmHTtFVeCUWrzU7QswB2Yy0G8Wa3Pew1T4laTHxuvQskEqvk.4Kfp7CR8JL34uLCp_En13H4SHmSZCSNe_pSqoUBgBvI&amp;dib_tag=se&amp;keywords=successful+and+sustainable+data+policies&amp;qid=1770417969&amp;sprefix=succesful+and+sustainable+data+policie%2Caps%2C210&amp;sr=8-1"><span>Get The Book</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[A Business Understanding of Data Disposition]]></title><description><![CDATA[Understanding the Why, When and How of Data Disposition.]]></description><link>https://anchoringdata.com/p/a-business-understanding-of-data</link><guid isPermaLink="false">https://anchoringdata.com/p/a-business-understanding-of-data</guid><dc:creator><![CDATA[Malcolm Chisholm]]></dc:creator><pubDate>Fri, 23 Jan 2026 12:31:19 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!XcVc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63c4bae4-1275-4abd-8a00-248b52950a2b_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XcVc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63c4bae4-1275-4abd-8a00-248b52950a2b_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XcVc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63c4bae4-1275-4abd-8a00-248b52950a2b_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!XcVc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63c4bae4-1275-4abd-8a00-248b52950a2b_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!XcVc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63c4bae4-1275-4abd-8a00-248b52950a2b_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!XcVc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63c4bae4-1275-4abd-8a00-248b52950a2b_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XcVc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63c4bae4-1275-4abd-8a00-248b52950a2b_1536x1024.png" width="534" height="356.12225274725273" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/63c4bae4-1275-4abd-8a00-248b52950a2b_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:534,&quot;bytes&quot;:2755962,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/185465127?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63c4bae4-1275-4abd-8a00-248b52950a2b_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XcVc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63c4bae4-1275-4abd-8a00-248b52950a2b_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!XcVc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63c4bae4-1275-4abd-8a00-248b52950a2b_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!XcVc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63c4bae4-1275-4abd-8a00-248b52950a2b_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!XcVc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63c4bae4-1275-4abd-8a00-248b52950a2b_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A lot of data management, like a lot of household management, is about keeping things tidy. The problem is that data is abstract and not physical. You can see that the contents of your kitchen cabinet are a mess that needs to be tidied up, but it is very difficult to see, or even conceptualize, that there is a data mess. A particular area of data management where this becomes important is &#8220;data disposition&#8221; &#8211; essentially, getting rid of data, and we are going to take a look at it in this article.</p><h4>What Is Data Disposition?</h4><p>Even the term &#8220;data disposition&#8221; sounds a little clumsy. Many people prefer to say &#8220;data deletion&#8221;, but this does not really capture the concept either. 
We are talking about getting rid of data, and it can indeed be deleted, but it can also be anonymized in a way that renders it useless for normal processing. We will explore why you would want to do that shortly.</p><p>There are other somewhat bizarre terms used for data disposition, and &#8220;data lifecycle management&#8221; is quite a common one. But this gives the impression of managing the entire lifecycle of data, and the data lifecycle is complex, with disagreement about what it even is. Yet when people say &#8220;data lifecycle management&#8221; they nearly always mean data disposition.</p><h4>Why Dispose of Data?</h4><p>Years ago data disposition was not seen as a priority. Data could be left in place as long as it was useful. Even if it was not kept in particular processing environments it could be stored in backups that could be recovered on demand. Today, however, things are different. Here are some major reasons we need to get rid of data.</p><p><strong>Cost:</strong> It is often said that storage is cheap, but huge amounts of data will quickly add up to a meaningful expense. This is particularly true of Cloud environments where everything is constantly metered and charged back. Furthermore, there is always some processing that gets done on any data hosted in the Cloud, and the unnecessary processing is a further expense item. Getting rid of data reduces these costs, or at least prevents them from rising.</p><p>An important point about cost savings and cost avoidance is that they can be quantified. Staff who undertake data disposition can prove how much money they are saving, which is a really good metric to impress executives with.</p><p><strong>Risk: </strong>Does the enterprise really want to keep all its data? In many cases it does not because data represents risk. In the USA most people are aware that they need to keep financial information for 7 years as it is a requirement of the IRS. 
But when that statutory period expires it is wise to get rid of the information to prevent unnecessary exposure to tax investigations. This is not to suggest covering up malfeasance, but to reduce risk in terms of legal entanglements. eDiscovery in civil cases is another big risk, and not having data that needs to be defended is better than having to construct a defense for it. So if there is no obligation to keep data, then it is wise to dispose of it.</p><p><strong>Efficiency:</strong> Data needs to be governed and managed, and this takes effort. Having many copies of the same legal agreement with different file names, and in different states, like draft, finalized, and revised, all stored in different locations is a recipe for chaos. Structured data can also suffer from such issues, with a given database table being unnecessarily duplicated for one-off needs and never cleaned up afterwards. This mess rapidly accumulates, confusing staff who search for data and leading others to use the wrong data.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Anchoring Data! This is a publication dedicated to shining a light on knowledge about what data is and how it is managed. 
<em>Please subscribe and join me in this data journey.</em></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h4>When Do We Dispose of Data?</h4><p>There are other reasons why data needs to be disposed of, but these are the major ones. Given this, when is data to be disposed of? </p><p>Here we have a problem, because data is not &#8220;one size fits all&#8221;. There are different kinds of data that have different business uses, and these dictate when it needs to be got rid of. Some data is subject to laws and regulations that dictate how long it must be kept for &#8211; or the circumstances under which it must be disposed of. Yet other data is so voluminous, e.g. data from sensors that are continuously creating data, that it cannot be allowed to exceed certain storage limits.</p><p>What this means is that each type of data needs to be identified, where each type is a kind of data that has its own data disposition rules. This is not easy, or intuitive. A top-down approach is to list all the rules from laws and regulations, and for each area of the business to figure out which of the data it manages falls under any of these rules. Typically, this requires a &#8220;champion&#8221; model where someone in each business area leads the effort for their area. The champions have to be trained, and coordinated by some higher-level authority like Risk, Legal, or Data Governance.</p><p>A bottom-up approach is for each area to decide what data it is going to dispose of and under what circumstances. The top-down approach will not work for all data &#8211; only data that is subject to laws and regulations, so it is only partially derisking data. The bottom-up approach closes this gap. 
However, it requires a commitment to data management that many business units do not have.</p><p>A hybrid approach is to figure out centrally the different types of data in the enterprise and when to dispose of them. This goes beyond data subject to laws and regulations, and leads to policies for disposition. It still may not fully cover all types of data, but a good deal will be covered. This approach requires a strong Data Governance function to support the business units.</p><h4>How Is Data Disposed of?</h4><p>As noted above, data can simply be deleted. This is by far the most common method. But personal and confidential data can be anonymized. This is done when data is still needed for some other purpose beyond normal usage, e.g. analytics.</p><p>For structured data the process of disposition can be quite technical. For instance, a subset of records in a database has to be deleted, and the exact filtering conditions have to be applied to identify the right records. Then, the deletion process has to be proven to have worked. The number of records deleted must be logged along with when the process was carried out, and other information. This can be tricky as the information about the deleted records cannot be retained &#8211; you cannot search for records that were supposed to be deleted by using the data that is now supposedly deleted.</p><p>Disposition activities have to be scheduled, and this of course requires resources and effort. However, as we have seen, data disposition really cannot be avoided today.  
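</p><p>For structured data, the delete-then-prove steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any particular product: it uses an in-memory SQLite database, and the table, column, and function names (invoices, closed_on, disposition_log, dispose_expired) are hypothetical. It also assumes that dates are stored as ISO-8601 text, so that a plain text comparison identifies the expired records.</p>

```python
import sqlite3
from datetime import datetime, timezone


def dispose_expired(conn, table, date_column, cutoff):
    """Delete rows dated before the cutoff and log proof of the deletion."""
    # Identifiers cannot be bound as parameters; this sketch assumes table and
    # date_column come from a trusted disposition policy, never from user input.
    rule = f"{date_column} < ?"
    conn.execute(
        "CREATE TABLE IF NOT EXISTS disposition_log ("
        "table_name TEXT, rule TEXT, cutoff TEXT, "
        "rows_deleted INTEGER, executed_at TEXT)"
    )
    count_sql = f"SELECT COUNT(*) FROM {table} WHERE {rule}"
    # Count the matching rows first so the deletion can be checked against it.
    expected = conn.execute(count_sql, (cutoff,)).fetchone()[0]
    deleted = conn.execute(f"DELETE FROM {table} WHERE {rule}", (cutoff,)).rowcount
    remaining = conn.execute(count_sql, (cutoff,)).fetchone()[0]
    if deleted != expected or remaining != 0:
        conn.rollback()  # undo the delete if it cannot be proven
        raise RuntimeError("disposition could not be verified")
    # Log the rule and the counts only, never the deleted values themselves.
    conn.execute(
        "INSERT INTO disposition_log VALUES (?, ?, ?, ?, ?)",
        (table, rule, cutoff, deleted, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    return deleted


# Hypothetical example data: three invoices and a cutoff date set by policy.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, closed_on TEXT)")
conn.executemany(
    "INSERT INTO invoices (closed_on) VALUES (?)",
    [("2015-03-01",), ("2017-11-20",), ("2024-06-30",)],
)
conn.commit()
deleted = dispose_expired(conn, "invoices", "closed_on", "2019-01-01")
print(deleted, conn.execute("SELECT rows_deleted FROM disposition_log").fetchall())
```

<p>In this sketch the log keeps only the rule, the cutoff, the number of rows deleted, and the time of execution, so the proof of deletion does not depend on retaining any of the deleted values.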
</p><p>Businesspeople are going to be increasingly involved in data disposition as time goes by and computing resources become more expensive, so this is one area of data management where we can expect to see a lot of growth.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/p/a-business-understanding-of-data?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://anchoringdata.com/p/a-business-understanding-of-data?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>If you&#8217;d like to learn more about my background and what led me to write this publication, I invite you to click below to learn more about my journey.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/p/the-authors-data-journey?r=4733dz&amp;utm_campaign=post&amp;utm_medium=web&amp;triedRedirect=true&quot;,&quot;text&quot;:&quot;About The Author&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://anchoringdata.com/p/the-authors-data-journey?r=4733dz&amp;utm_campaign=post&amp;utm_medium=web&amp;triedRedirect=true"><span>About The Author</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Understanding Data Acquisition for Third Party Data]]></title><description><![CDATA[A clear explanation about the fundamentals of the Data Acquisition Cycle.]]></description><link>https://anchoringdata.com/p/understanding-data-acquisition-for</link><guid isPermaLink="false">https://anchoringdata.com/p/understanding-data-acquisition-for</guid><dc:creator><![CDATA[Malcolm Chisholm]]></dc:creator><pubDate>Fri, 16 Jan 2026 12:31:20 GMT</pubDate><enclosure 
url="https://substack-post-media.s3.amazonaws.com/public/images/5c57e524-90dc-41f1-95a2-27e85e5f37f5_1748x1240.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Data Acquisition is a relatively new area of Data Management that has come to prominence in recent years. It is the process by which Third Party Data is brought into the enterprise and is particularly important for businesspeople to understand. </p><p>We reviewed Third Party Data in an earlier article <a href="https://anchoringdata.com/p/understanding-data-acquisition-and?r=4733dz">Understanding Data Acquisition and Third-Party Data</a>, but basically it is data that is created outside of the enterprise by other organizations and which the enterprise needs to bring into its data environment to use.</p><h4>IT versus The Business</h4><p>At first it might seem that Data Acquisition is a purely technical matter that IT can take care of. Part of it certainly is, but a great deal concerns the business. The diagram below summarizes the entire Data Acquisition Cycle, and the only &#8220;pure&#8221; IT part is ingestion which comes at the very end and is the loading of data from outside the enterprise into what is usually a data lake.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!teTl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ede99a1-c249-4005-962d-2ced440ffb0f_1794x996.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!teTl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ede99a1-c249-4005-962d-2ced440ffb0f_1794x996.png 424w, 
https://substackcdn.com/image/fetch/$s_!teTl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ede99a1-c249-4005-962d-2ced440ffb0f_1794x996.png 848w, https://substackcdn.com/image/fetch/$s_!teTl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ede99a1-c249-4005-962d-2ced440ffb0f_1794x996.png 1272w, https://substackcdn.com/image/fetch/$s_!teTl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ede99a1-c249-4005-962d-2ced440ffb0f_1794x996.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!teTl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ede99a1-c249-4005-962d-2ced440ffb0f_1794x996.png" width="728" height="404" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6ede99a1-c249-4005-962d-2ced440ffb0f_1794x996.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:808,&quot;width&quot;:1456,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:148664,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/184710067?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ede99a1-c249-4005-962d-2ced440ffb0f_1794x996.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!teTl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ede99a1-c249-4005-962d-2ced440ffb0f_1794x996.png 
424w, https://substackcdn.com/image/fetch/$s_!teTl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ede99a1-c249-4005-962d-2ced440ffb0f_1794x996.png 848w, https://substackcdn.com/image/fetch/$s_!teTl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ede99a1-c249-4005-962d-2ced440ffb0f_1794x996.png 1272w, https://substackcdn.com/image/fetch/$s_!teTl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ede99a1-c249-4005-962d-2ced440ffb0f_1794x996.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h4>The Data Acquisition Cycle</h4><p>Of course, IT will be involved throughout the cycle, but the cycle is essentially business-driven. This really needs to be appreciated by businesspeople. Unfortunately, businesspeople sometimes simply identify &#8220;data&#8221; with &#8220;databases&#8221; that are maintained by IT, and think that it is all the responsibility of IT. But IT cannot be expected to be responsible for data content, which is something the business needs to address &#8211; although IT can provide some support.</p><h4>The Details of The Data Acquisition Cycle</h4><p>Let us now look at the Data Acquisition Cycle in more detail.</p><p><strong>Use Case Crystallization:</strong> The description of the use case to be solutioned. 
It will identify any data that has to be acquired from outside the enterprise.</p><p><strong>Information Asset Design:</strong> The end product that has to be produced as a result of the implementation of the use case, and any asset that has to be developed to produce it. Today, the great majority of use cases are analytical in nature, so models including AI and their outputs are the information assets. These assets lie outside of the scope of data acquisition, but what is important is that they provide a more detailed understanding of the data that is required in data acquisition.</p><p><strong>Data Sourcing:</strong> The identification of possible sources of external data, including the data vendors that can supply this data.</p><p><strong>Data Vendor Engagement: </strong>Data vendors can be contacted based on the list developed during data sourcing. The objective is to determine if they can actually supply the data that is required, and how to proceed with them.</p><p><strong>Data Profiling:</strong> A dataset obtained from a data vendor is examined to understand its semantics and high-level data quality.</p><p><strong>Data Evaluation: </strong>The dataset is subject to testing to determine the extent to which it is relevant for the use case.</p><p><strong>Proof of Concept:</strong> Sometimes, the evaluation is turned into a proof of concept (PoC) where considerably more development is undertaken to find out if the use case can be solutioned. This is the equivalent of a small development project.</p><p><strong>Business Case Development:</strong> The use case is augmented with results of the data evaluation and/or proof of concept. A more detailed cost-benefit analysis with risk analysis is created for executive review.</p><p><strong>Executive Approval: </strong>The business case is reviewed by executives who will have to fund it. 
They will approve or reject the business case.</p><p><strong>Vendor Negotiation:</strong> At this point it is clear what data is needed from the data vendor. Procurement will lead the contracting effort to finalize the supply of data from the data vendor. Occasionally, these details have all been worked out previously and this step is not required.</p><p><strong>Implementation Specification: </strong>The design of the ingestion process.</p><p><strong>Ingestion Setup: </strong>The ingestion process is implemented.</p><p><strong>First Successful Implementation:</strong> The dataset is loaded via the ingestion process for the first time. Testing is undertaken to prove the ingestion was successful.</p><p><strong>Production Turnover: </strong>The ingestion process is turned over to the staff responsible for production operations, who integrate it into the broader solution for the use case.</p><p>The actual data acquisition cycle can vary from project to project and enterprise to enterprise, but the above outline captures the most common steps.</p><h4>Why Does All This Have to be Formalized?</h4><p>It is not enough to understand that there is a Data Acquisition Cycle; it is also necessary to centralize and standardize it.</p><p>This is a departure from the old way of looking at data, and it may seem tempting for businesspeople to reach out directly to data vendors to find out what the vendors can offer. Many years ago nobody gave much thought to data, and staff operated under the impression that if they could access data within the enterprise they could use it for anything. This attitude became extended to external data, such that if businesspeople paid for data then there was no problem, and they had no need to consult anyone else.</p><p>However, Data Acquisition does need to be governed and managed for the following reasons:</p><h5>1. 
Licensed Internal Data Is Not Necessarily Available</h5><p>Businesspeople will likely look within the enterprise for data that they would otherwise need to source externally. But any licensed data that exists in the enterprise will be subject to the contractual terms under which it was licensed. These terms can easily limit how the data can be used such that it must not be used in solutioning the new use case. The businesspeople involved may know nothing of these terms and may just find a way to get hold of the data, thus breaching the contract.</p><h5>2. Businesspeople May Try to Contact Data Vendors Directly</h5><p>It may seem a simple matter for project personnel to contact data vendors directly to find out what they have. However, this may circumvent contracts the enterprise may already have with data vendors. It may also compromise the enterprise if no confidentiality agreement is in place with the data vendor. Further, the businesspeople involved may not be aware of a data vendor that the enterprise has decided it cannot do business with, perhaps because the practices of the data vendor are considered risky.</p><h5>3. PII May Be Involved</h5><p>Today most businesspeople are aware that personal information (PII, or personally identifiable information) needs special treatment, which may deter them from trying to license such data on their own. That may still happen, but what is more likely is that the businesspeople think the data they are buying is not PII when in reality it does contain PII. For instance, a list of small businesses may include sole proprietorships. Under some data privacy laws, data about sole proprietorships counts as PII because it pertains to a specific individual.</p><h5>4. Regulatory Impact</h5><p>Use case crystallization is the point at which the use case is understandable in detail. This means that the use case can be judged in terms of data and AI regulations. 
For instance, it should be fairly easy to determine if the use case does or does not involve High Risk AI as defined in the Artificial Intelligence Act of the European Union. Similarly, the use case can be judged as to whether it must comply with the California Privacy Rights Act.</p><h5>5. Data and AI Ethics</h5><p>Though these are not strictly legal concerns, the use case can also be judged in terms of data and AI ethics. Its alignment with the stated values of the enterprise and, frankly, with decent behavior should be confirmed.</p><p>These risks and problems can be mitigated with a centralized Data Acquisition function that incorporates strong governance. Unfortunately, the project mindset tends to make a project team want to cut all external dependencies so they can meet their deadlines, and so, left to themselves, the team that came up with the use case will not want to engage such a Data Acquisition function.</p><p>This points to the need for a Data Acquisition policy that lays down how Data Acquisition will be done in the enterprise. Of course, execution is important, and no Data Acquisition function should be bureaucratic, slow-moving, or inefficient. 
In recent years the practices of Data Acquisition have matured and are much better understood, so enterprises should be able to create these functions that are business-facing and successful.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/p/understanding-data-acquisition-for?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://anchoringdata.com/p/understanding-data-acquisition-for?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p><em>If you&#8217;d like to learn more about my background and what led me to write this publication, I invite you to click below to learn more about my journey.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/p/the-authors-data-journey?r=4733dz&amp;utm_campaign=post&amp;utm_medium=web&amp;triedRedirect=true&quot;,&quot;text&quot;:&quot;About The Author&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://anchoringdata.com/p/the-authors-data-journey?r=4733dz&amp;utm_campaign=post&amp;utm_medium=web&amp;triedRedirect=true"><span>About The Author</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[What Do I Need to Know about Data for My Job?]]></title><description><![CDATA[Here are some pointers on how a non-data specialist can become better at working with data.]]></description><link>https://anchoringdata.com/p/what-do-i-need-to-know-about-data</link><guid isPermaLink="false">https://anchoringdata.com/p/what-do-i-need-to-know-about-data</guid><dc:creator><![CDATA[Malcolm Chisholm]]></dc:creator><pubDate>Fri, 09 Jan 2026 12:03:00 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!xULj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38698ce-6cba-4bb8-8fa6-d6e3603db9f4_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xULj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38698ce-6cba-4bb8-8fa6-d6e3603db9f4_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xULj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38698ce-6cba-4bb8-8fa6-d6e3603db9f4_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!xULj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38698ce-6cba-4bb8-8fa6-d6e3603db9f4_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!xULj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38698ce-6cba-4bb8-8fa6-d6e3603db9f4_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!xULj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38698ce-6cba-4bb8-8fa6-d6e3603db9f4_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xULj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38698ce-6cba-4bb8-8fa6-d6e3603db9f4_1536x1024.png" width="416" height="277.42857142857144" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c38698ce-6cba-4bb8-8fa6-d6e3603db9f4_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:416,&quot;bytes&quot;:2450948,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/183927381?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38698ce-6cba-4bb8-8fa6-d6e3603db9f4_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xULj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38698ce-6cba-4bb8-8fa6-d6e3603db9f4_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!xULj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38698ce-6cba-4bb8-8fa6-d6e3603db9f4_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!xULj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38698ce-6cba-4bb8-8fa6-d6e3603db9f4_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!xULj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc38698ce-6cba-4bb8-8fa6-d6e3603db9f4_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Data has crept up on the workforce. Decades ago, data was practically irrelevant for everyone, and even technical people did not pay much attention to it. That has changed completely, and now data matters in all kinds of ways. Because of this, employers expect employees to be able to work effectively with data, but most organizations provide little training or education about data other than what is required for compliance reasons.</p><p>In this article I will try to provide some pointers on how a non-data specialist can become better at working with data.</p><h4>Competencies</h4><p>Before we start on the specifics it is important to understand that we are talking about &#8220;competencies&#8221;. A competency is an attribute that enables someone to do a job well. 
In general, there are three kinds of competency:</p><ul><li><p><strong>Skills:</strong> Technical abilities to do a particular kind of work. E.g. ability to manipulate data in Excel.</p></li><li><p><strong>Knowledge:</strong> Having factual information about the specific work at hand, and understanding this information. E.g. understanding how data about suppliers flows through the organization.</p></li><li><p><strong>Personal Traits: </strong>These break down into innate or learned abilities, and also behavioral characteristics. E.g. being detail oriented, and being interested in catching mistakes.</p></li></ul><p>We will examine all of these to appreciate how someone can work better with data. But before we do that, what do we mean by &#8220;working with data&#8221;? Basically, there are two kinds of ways non-data-technical staff may interact with data:</p><p>(a) <strong>Indirect Data Management:</strong> This is where staff have to update, read, and analyze data as part of a business function. E.g. a nurse who enters medical information about a patient. 
By far this is how most employees outside of IT and technical data units interact with data.</p><p>(b) <strong>Direct Data Management: </strong>This is where staff are looking after data for the enterprise. E.g. someone who sets up customer records as part of customer onboarding. These are non-technical jobs, but are wholly concerned with data, and not the use of the data.</p><p>Obviously, Direct Data Management is going to be more demanding in terms of data-related competencies than Indirect Data Management, but the latter is still very important because there are high risks of creating data errors.</p><p>We can think of a scale of competency needs. Perhaps at the lower end we have someone doing repetitive data entry, and at the other extreme someone involved in analyzing data to make business decisions for the enterprise. Even so, across this entire spectrum there are still competency needs. Let&#8217;s look at three basic ones in each of the competency categories we have described.</p><h4>Personal Traits</h4><p>The following personal traits are important for data management:</p><p><em><strong>Detail Oriented: </strong></em>Data values are designed to be atomic, and as such exist at a low level of detail. Anyone dealing with data must be able to deal with a lot of detail and focus on individual items within a mass of detail.</p><p><em><strong>Orderly:</strong></em> It is very easy to create a mess with data. Just the placement of files in folders seems to be a daunting task in most organizations. It takes a considerable effort to establish order, but it is very important with data.</p><p><em><strong>Abstraction:</strong></em> The ability to conceive of something and manipulate it mentally. Data is non-material, so understanding it is harder than understanding physical things. Data professionals often say there are &#8220;data people&#8221; and those who cannot &#8220;do data&#8221;, meaning those who can and cannot think abstractly. 
For everyone who works with data &#8211; not just professionals &#8211; the ability to think abstractly is important.</p><h4>Skills</h4><p>The skills around data are technical skills. There are perhaps not so many employees who need these skills, but some roles certainly do.</p><p><em><strong>SQL: </strong></em>Like it or not, SQL has become the universal language of data retrieval from databases. If business staff need to retrieve data and do not know SQL, then they will have to rely on technical staff to do it for them, which may never happen. AI may be helpful here, but the prompts have to be exact.</p><p><em><strong>Excel: </strong></em>If SQL retrieves data, then Excel is used to manipulate it. Most organizations run on Excel, even if IT departments do not like this. Many careers have been greatly enhanced for people who know Excel.</p><p><em><strong>Communication:</strong></em> While in some ways this is a personal trait, it is also a skill that can be learned. Communicating about data requires clarity and precision. It is too easy to create confusion about data with sloppy written or verbal content.</p><h4>Knowledge</h4><p><em><strong>Data Definitions: </strong></em>Staff need to understand what the data they are working with means. This is not just what we think of as a definition, like what we see in a dictionary. Rather it is more like a wiki entry. To understand data, it is necessary to know what populations of things are covered, what quality issues exist in it, the methodology by which it was gathered, and so on. Now, this is not needed for all data, but some of it is needed for some data. Assumptions about data can cause huge problems if they turn out not to be justified.</p><p><em><strong>Permitted Usage of Data</strong></em>: Long ago it was assumed that any data within an enterprise could be used for any purpose. Today we need to know what is allowed. 
Data Privacy has been the biggest driver of this, but even today many people still think that just having access to data means they have permission to do whatever they like. Besides Data Privacy there are also contractual obligations that apply to licensed data purchased from data vendors. Knowledge about what can and cannot be done with data is very important.</p><p><em><strong>Data Sources and Structures:</strong></em> Knowledge about what data exists where in the organization and how it is structured can be very useful. This is a kind of situational awareness for someone who works with data, and can prevent surprises as well as suggest opportunities.</p><h4>Gaining Competencies</h4><p>We have only looked at basic competencies and there are many more. And of course there are many other specific competencies that are necessary for specialized roles that deal with data.</p><p>What is much more important is how an average person can be expected to acquire these competencies. In theory it would be to the advantage of every enterprise to help grow these competencies in as many employees as possible. However, this does not seem to be something that is widespread, which is unfortunate.</p><p>This leaves it up to the individual. The skills competencies are fully transferable, as are the personal traits, but the knowledge competencies are very much tied to the enterprise and not transferable for the most part. So individuals can be self-motivated to acquire data skills, but less so when it comes to knowledge. However, some individuals do choose to acquire knowledge about the enterprise they work for and become very valuable to the enterprise as a result.</p><p>This situation is not perfect and for the individual there are tradeoffs when it comes to acquiring competencies. 
That said, data competencies are increasingly needed in the modern economy and should be something every employee thinks about for their own situation.</p>]]></content:encoded></item><item><title><![CDATA[Data Quality Issue Management for Businesspeople]]></title><description><![CDATA[Data quality is the discipline of making sure that data actually represents what it is supposed to, and can be used for its intended purposes.]]></description><link>https://anchoringdata.com/p/data-quality-issue-management-for</link><guid isPermaLink="false">https://anchoringdata.com/p/data-quality-issue-management-for</guid><dc:creator><![CDATA[Malcolm Chisholm]]></dc:creator><pubDate>Fri, 19 Dec 2025 13:31:28 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!vk8t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aebe9f8-b986-4e8a-ab5a-21922724bcff_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vk8t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aebe9f8-b986-4e8a-ab5a-21922724bcff_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!vk8t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aebe9f8-b986-4e8a-ab5a-21922724bcff_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!vk8t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aebe9f8-b986-4e8a-ab5a-21922724bcff_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!vk8t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aebe9f8-b986-4e8a-ab5a-21922724bcff_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!vk8t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aebe9f8-b986-4e8a-ab5a-21922724bcff_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vk8t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aebe9f8-b986-4e8a-ab5a-21922724bcff_1536x1024.png" width="464" height="309.43956043956047" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3aebe9f8-b986-4e8a-ab5a-21922724bcff_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:464,&quot;bytes&quot;:1677703,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/182000171?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aebe9f8-b986-4e8a-ab5a-21922724bcff_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vk8t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aebe9f8-b986-4e8a-ab5a-21922724bcff_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!vk8t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aebe9f8-b986-4e8a-ab5a-21922724bcff_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!vk8t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aebe9f8-b986-4e8a-ab5a-21922724bcff_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!vk8t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aebe9f8-b986-4e8a-ab5a-21922724bcff_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Everyone is aware of the divide between IT and the business. How it originated, why it persists, and how it can be fixed are topics that have been debated for years &#8211; with limited success. This divide pops up in different places with different impact, and one of them is the area of data quality.</p><p>Data quality is the discipline of making sure that data actually represents what it is supposed to, and can be used for its intended purposes. Think of it as the health of the data. But sometimes things go wrong with data, and when that happens it is important to detect the issue and to fix it.</p><h4>What Is Data Issue Management?</h4><p>This is where the divide between IT and the business comes in. IT is oriented to technology as the solution to problems, and technology can be used to detect data quality issues. Perhaps not all data quality issues can be detected with technology, but a lot can be with modern data quality tools (more recently rebranded as &#8220;data observability&#8221; tools).</p><p>But what happens when a data quality issue is spotted by one of these tools, or as still often happens, by a businessperson looking at data in an output of a system? There is no magic technology that can automate the resolution of the problem &#8211; it is pretty much dependent on human effort. So at this point, IT usually either drops out of the picture completely, or takes a subordinate role, waiting to be told what to do by the business.</p><p>What has to happen is to drive the data issue to a resolution &#8211; a process known as Data Issue Management. 
It is not easy to define Data Issue Management because it is a series of steps that come into play. Maybe not every step is needed for a particular data issue, but in aggregate they are:</p><ul><li><p>Issue Confirmation: Is the &#8220;data issue&#8221; really an issue or not? Perhaps it could be a misunderstanding.</p></li><li><p>Known Issue: Is the data issue one that has been seen before? If so, is there a known resolution that can be applied, such that the issue is fixed right away?</p></li><li><p>Impact Assessment: What impact does the data issue have? The immediate impact, if any, is important to know, so that propagation and knock-on effects can be stopped.</p></li><li><p>Notification: Who needs to know about the data issue, and who can help in its resolution? Most importantly, who will coordinate the overall data issue management process?</p></li><li><p>Analysis: Understanding the cause of the data issue, and its impact beyond anything immediate.</p></li><li><p>Resolution Design: The agreed-on fix for the data issue.</p></li><li><p>Handover: The handover of the resolution to the staff who will implement it. Resolutions can involve process re-engineering, training and education of people, and technical fixes. A resolution could be a mix of them all.</p></li><li><p>Closure: Informing the stakeholders involved in Data Issue Management that the data issue has been resolved.</p></li></ul><p>This list is high-level and each step can be quite complex in its own right.</p><p>It is also important to note that Data Issue Management leads to the design of a resolution. Who implements that actual resolution will vary. If it is technical, then IT staff will implement it. If it is a failure of training, then perhaps Human Resources will develop new training modules. 
It is the nature of Data Issue Management that the resolutions can be very diverse.</p><h4>Local vs. Systemic Data Issues</h4><p>Another wrinkle of Data Issue Management is that some data issues can be very localized, whereas others are systemic.</p><p>A localized data issue is one that occurs in a single environment like a business unit or system. Its cause and effects are completely confined to this context. As such it is easy for the team involved to run the entire Data Issue Management process and not involve anyone else.</p><p>Systemic data issues are data issues that cross organizational and/or system boundaries. They are a result of the fact that a great deal of data flows across enterprises, often in ways that are not fully understood.</p><p>Data Issue Management for systemic data issues can be tricky. This is because at the point of origin, the &#8220;data issue&#8221; may be unimportant and not considered to be an issue. It only manifests as an issue that has a significant negative impact much further downstream in the flow of data. 
Thus, the staff that are impacted are not empowered to fix the root cause of the problem as they have no authority over the team where the root cause lies.</p><p>Systemic data issues require an enterprise-level response that includes strong governance. Appropriate Data Issue Management processes can then be built. Unfortunately, this is rarely done, and in many enterprises systemic data issues persist for years. Ultimately, it is a failure of Data Governance leadership. Data Governance units cannot wait to be told what to do about systemic data issues; they should develop the Data Issue Management policies, standards, and processes to address them.</p><h4>Data Products</h4><p>The evolution of data engineering teams, and especially the development of data products by these teams, has broken with more traditional approaches to Data Issue Management. Decades ago, during mainframe days, in-house production control teams oversaw a lot of Data Issue Management processes. These teams ran production jobs, and when an error was discovered, they would coordinate the response. Today, data engineering teams that build data pipelines also run them. When these data pipelines produce data products, then the expectation is that any data issue in the data product will be handled by the data engineering team involved.</p><p>This approach has provided more clarity about who will coordinate Data Issue Management. Of course, a data issue in a data product may have been caused by an upstream data source outside of the control of the data engineering team. However, there is at least an understanding of who will drive the Data Issue Management process.</p><h4>Fragmentation and Lack of Definition</h4><p>Despite the promising developments in the area of data products, Data Issue Management remains fragmented in many organizations. 
Data issues are often not even recognized as such, and Data Issue Management processes &#8211; to the extent they exist &#8211; are developed ad hoc in different units. Not all of the steps in Data Issue Management that were outlined above are followed, so these processes do not always generate good resolutions. Data Issue Management processes consist of people and processes much more than technology, which is why IT is much less involved. It may also account for why we see such fragmentation and lack of definition.</p><p>Ultimately, Data Governance within each organization will determine how successful Data Issue Management will be.</p>]]></content:encoded></item><item><title><![CDATA[Data Quality: What Every Business Professional Should Know]]></title><description><![CDATA[Businesspeople have all kinds of use cases for data, and they need data that is suited to each particular use case.]]></description><link>https://anchoringdata.com/p/data-quality-what-every-business</link><guid isPermaLink="false">https://anchoringdata.com/p/data-quality-what-every-business</guid><dc:creator><![CDATA[Malcolm Chisholm]]></dc:creator><pubDate>Thu, 11 Dec 2025 13:31:08 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!NuQY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff561f772-1e8e-40d9-b3ef-b6b73389ec71_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NuQY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff561f772-1e8e-40d9-b3ef-b6b73389ec71_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!NuQY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff561f772-1e8e-40d9-b3ef-b6b73389ec71_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!NuQY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff561f772-1e8e-40d9-b3ef-b6b73389ec71_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!NuQY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff561f772-1e8e-40d9-b3ef-b6b73389ec71_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!NuQY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff561f772-1e8e-40d9-b3ef-b6b73389ec71_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NuQY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff561f772-1e8e-40d9-b3ef-b6b73389ec71_1536x1024.png" width="536" height="357.45604395604397" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f561f772-1e8e-40d9-b3ef-b6b73389ec71_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:536,&quot;bytes&quot;:2314532,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/181260836?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff561f772-1e8e-40d9-b3ef-b6b73389ec71_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NuQY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff561f772-1e8e-40d9-b3ef-b6b73389ec71_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!NuQY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff561f772-1e8e-40d9-b3ef-b6b73389ec71_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!NuQY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff561f772-1e8e-40d9-b3ef-b6b73389ec71_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!NuQY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff561f772-1e8e-40d9-b3ef-b6b73389ec71_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>The term &#8220;data quality&#8221; is used extensively by technical professionals working in the data, AI, and information technology industries, but it can be difficult to understand what these professionals mean by it. This is important because businesspeople are increasingly involved in using data, and they need to know they can trust it. To do so, businesspeople rely to some extent on the technical professionals telling them that it is &#8220;good quality&#8221; data. But frustratingly, &#8220;good quality&#8221; data may not always be usable in a particular business context.</p><p>Also, the term &#8220;data quality&#8221; can refer to many different specific things, which adds to the confusion that businesspeople can feel when they hear about it from the technical folks.</p><p>Let&#8217;s take a high-level look at what &#8220;data quality&#8221; can mean.</p><h4>The Scope of Data Quality</h4><p>The first thing that we need to realize is that when technical folk talk about data quality, the scope of what they are thinking about is limited to:</p><p>(a) Data inside computerized environments</p><p>(b) Data in environments controlled by the Information Technology (IT) department</p><p>(c) Structured data, meaning data represented in rows and columns, like database tables.</p><p>It may sound like a wide scope, but it is not. There are plenty of issues with collecting data before it even gets into a computerized system. For instance, we are quite familiar with the confusing Unemployment Rate and Jobs numbers reported by the US government every month. 
The methodologies used to gather this information are questionable, so the quality of the &#8220;raw data&#8221; produced is also questionable.</p><p>When it comes to activities going on outside of computerized environments, the technical folk consider it none of their business, even though the data is affected. This is not part of the data supply chain they are involved in.</p><p>Not all data is managed in systems controlled by IT. There is plenty of End User Computing (EUC) in every organization, as is seen in Excel spreadsheets. This is generally outside of what technical professionals see as their responsibilities, so they are not involved in any aspect of its quality. Such data may make its way into IT-controlled environments, and after that point (but not before it) technical staff will address quality &#8211; but only based on how the data is processed in the IT-controlled environments.</p><p>Then we have structured data. Quality in unstructured data, like text, images, audio, and video, has not traditionally been addressed from a technical perspective. So while structured data may have its data quality needs taken care of, unstructured data is left alone, which is a problem since AI is a heavy user of unstructured data.</p><h4>The Main Activities of Data Quality</h4><p>&#8220;Data Quality&#8221; is not just a state of data. The term also refers to a whole subdiscipline of data management. However, this subdiscipline itself consists &#8211; confusingly &#8211; of wildly different activities which receive wildly different attention from the technical community.</p><p>These activities are:</p><p>(a) Data Issue Prevention</p><p>(b) Data Issue Detection</p><p>(c) Data Issue Management</p><p>(d) Data Issue Remediation (also known as Data Change Management)</p><h4>Data Issue Prevention</h4><p>Data Issue Prevention is of no particular concern to technical staff. If asked about it, they will usually say that it is taken care of by edit validations built into data entry screens in whatever system data is being entered into. As such it is nothing special, apart from being an important task in systems development. There is no idea that people and processes should be examined to determine if they are causing bad quality data. This goes back to &#8220;data quality&#8221; from the technical perspective being about what happens inside computers.</p><h4>Data Issue Detection</h4><p>By contrast, Data Issue Detection receives a huge amount of attention from technical staff. It seems that this is because there are a lot of software tools that have been built to detect poor quality data. Technical people tend to focus on technology as being the answer to problems, and these tools are very impressive. 
The tools also require technical people to run them, so technical staff are highly incentivized to be able to use them.</p><p>&#8220;Data Quality&#8221; tools, or &#8220;Data Observability&#8221; tools as they are increasingly called today, use two approaches. One is for the tool by itself to try to identify bad data. This usually requires a human to confirm it is bad. The other approach is for specific rules provided by the business to be run by the software to detect data issues. Very often, when technical people speak of &#8220;data quality&#8221; they are really thinking only of these tools.</p><h4>Data Issue Management</h4><p>Data Issue Management is what happens after a data quality issue is detected. The issue has to be understood and a resolution proposed for it. Some data quality issues are completely technical and it is up to technical staff to fix them. But a lot &#8211; especially issues with data content &#8211; require the business to be involved. When this happens, the activities of Data Issue Management become more people and process oriented and technical folk are more passive, waiting to be asked for information or told to do something.</p><p>However, technical staff may get involved heavily if there is ticketing software involved. This will happen if data issues are reported via a Help Desk and tickets are opened. From the technicians&#8217; perspective it all becomes about the administration and management of the tickets at this point, rather than the substance of solving the data issue. Businesspeople can find this very frustrating.</p><p>Overall, technical staff do not usually think of Data Issue Management as part of &#8220;data quality&#8221;.</p><h4>Data Issue Remediation</h4><p>Data Issue Remediation is applying the resolution recommended via Data Issue Management. While some fixes can be technical in nature, others are more oriented to people and processes, and technical staff are less involved. 
Where a technical change is required, this flows through specific IT procedures that are for general changes in IT environments. These activities are not considered as &#8220;data quality&#8221; by technical staff.</p><p>One alarming tendency is for technical staff to implement workarounds in technical environments like systems and databases to get over a data issue. The root cause is not addressed. But for technical staff this seems natural if the root cause lies outside any computerized environment they support. It is the only thing they can do. The result is a gradual accumulation of technical fixes which nobody can remember the reasons for and which impede both technical and business change.</p><h4>Data Fitness for Use</h4><p>A final area where business and technical views about &#8220;data quality&#8221; diverge is fitness for use. Businesspeople have all kinds of use cases for data, and they need data that is suited to each particular use case. So it is natural to ask if the data is of the right quality for the use case. The same data may work well for one use case but not for another. Where the data does not work well, the business often sees it as a &#8220;data quality&#8221; issue. This can be puzzling to technical staff who may point out that no data quality issues have been detected in the data involved. 
Once again, the business and technical concepts of &#8220;data quality&#8221; are not the same.</p><p>There is a lot more that could be said on this topic, but hopefully this conveys some of the boundaries that technical staff have concerning data quality, and why communication between business and technical people about &#8220;data quality&#8221; can be difficult.</p>]]></content:encoded></item><item><title><![CDATA[Understanding Data Acquisition and Third-Party Data]]></title><description><![CDATA[When computers first became widespread, data was thought of as a byproduct of processing and not something in its own right.]]></description><link>https://anchoringdata.com/p/understanding-data-acquisition-and</link><guid isPermaLink="false">https://anchoringdata.com/p/understanding-data-acquisition-and</guid><dc:creator><![CDATA[Malcolm Chisholm]]></dc:creator><pubDate>Fri, 05 Dec 2025 12:20:35 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!UgVZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1ab0fbf-2427-4be4-a0b6-2c3a3b086f98_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!UgVZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1ab0fbf-2427-4be4-a0b6-2c3a3b086f98_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UgVZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1ab0fbf-2427-4be4-a0b6-2c3a3b086f98_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!UgVZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1ab0fbf-2427-4be4-a0b6-2c3a3b086f98_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!UgVZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1ab0fbf-2427-4be4-a0b6-2c3a3b086f98_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!UgVZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1ab0fbf-2427-4be4-a0b6-2c3a3b086f98_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UgVZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1ab0fbf-2427-4be4-a0b6-2c3a3b086f98_1536x1024.png" width="490" height="326.77884615384613" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c1ab0fbf-2427-4be4-a0b6-2c3a3b086f98_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:490,&quot;bytes&quot;:2383399,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/180761959?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1ab0fbf-2427-4be4-a0b6-2c3a3b086f98_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UgVZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1ab0fbf-2427-4be4-a0b6-2c3a3b086f98_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!UgVZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1ab0fbf-2427-4be4-a0b6-2c3a3b086f98_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!UgVZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1ab0fbf-2427-4be4-a0b6-2c3a3b086f98_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!UgVZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1ab0fbf-2427-4be4-a0b6-2c3a3b086f98_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Data has increasingly become a business over the past couple of decades, and it is important for data practitioners and businesspeople alike to understand how to deal with the economic and legal implications involved. </p><p>Nowhere has this become so important as in what is often called &#8220;Third Party Data&#8221; that is intimately linked with the process of Data Acquisition. This article looks at what is involved in both.</p><h4>Where Did Third Party Data Come From?</h4><p>When computers first became widespread, data was thought of as a byproduct of processing and not something in its own right. All data was linked to the internal systems that automated hitherto manual processes, meaning that the data was generated by the enterprise itself. 
Of course, this was not totally the case, as some data like tax rates and country codes always had to come from the outside.</p><p>This began to change when some industries realized they needed to get data to run certain specialized systems. For example, Bloomberg L.P. was founded in 1981 to supply data about financial markets to financial services institutions. This included lists of financial instruments like stocks and bonds, and their prices on stock exchanges. Financial services institutions had to buy this data from Bloomberg.</p><p>This situation was replicated in other enterprises where there was a need for external data to drive operational processes. Data vendors emerged to meet these needs and enterprises purchased the data from them. There was really no alternative, as the enterprises requiring the data could not produce it themselves in a cost-effective manner.</p><p>The result was that a set of &#8220;traditional&#8221; data vendors grew up that supplied enterprises with data they needed for specialized use cases, the majority of which were for operational systems. Some of the use cases were uncommon, like securities trading platforms. Others were more widespread, like credit information about businesses or individuals for loan origination.</p>
<h4>The Rise of Big Data</h4><p>This situation persisted until about 2010 when &#8220;Big Data&#8221; came onto the scene. In reality, the size of the data was not as relevant as the software that enabled advanced analytics and data science.</p><p>There had always been reporting from computer systems, but the reporting gradually shifted to predictive and prescriptive analytics with Big Data technology. However, enterprises found that the information available from their operational systems was not sufficient for these analytics. Other data from outside was needed, such as weather data, surveys of customer sentiment, econometric data, consumer prices, and so on. It was by understanding patterns in this data integrated with operational systems data that enterprises could unlock real value from their analytics.</p><h4>The Data Lake</h4><p>One very important concept from Big Data that enabled the use of Third-Party Data was the Data Lake. A Data Lake is a large storage environment where almost any kind of data can be stored. Previously, there had really only been Data Warehouses, which had a fixed data architecture that had to be specified in advance. By contrast, nothing needed to be prespecified in a Data Lake in order to put data into it. And the data that was put into Data Lakes was in the form of files.</p><p>So, Data Lakes combined with the new analytics technologies enabled enterprises to develop solutions for huge numbers of use cases that required Third-Party Data. The demand for Third-Party Data skyrocketed in the period 2010-2020. 
</p><h4>Data Chaos</h4><p>As often happens, technology enablement with strong demand was not matched by data management practices. Different units of enterprises went out and purchased data without any central coordination. The same data might be purchased independently by different units, sometimes for different prices. Legal and Procurement departments may not have been involved, so the terms and conditions were poorly understood. The Third-Party data might get ingested into all kinds of different systems, making further distribution difficult.</p><h4>Data Acquisition</h4><p>It was in order to fix these problems &#8211; or prevent them from arising &#8211; that the practice of Data Acquisition emerged. This seeks to manage the entire lifecycle of all Third-Party data, from its initial identification to its distribution within the enterprise, and finally to discontinuation. Data Acquisition is much more than mere ingestion, which is only the technical movement of data from outside the enterprise to within it.</p><p>Data Acquisition is centered around Data Lakes, as this is the central point where all Third-Party data is landed when it first comes into the enterprise. From there, it can be further distributed in a controlled fashion, as per the standard operating procedures established for Data Acquisition.</p><p>Usually there is a Data Acquisition Manager with a team overseeing all of this. The processes are standardized and all relevant partners are involved. This includes Legal to review terms and conditions of all data contracts to ensure they are acceptable, and Procurement to ensure pricing is acceptable and standard Procurement processes are followed. 
The Data Acquisition team will determine if any Third-Party data being requested is already available, and ensure that proper data management needs like data privacy and dataset cataloging are met.</p><h4>Beyond Third Party Data</h4><p>While our focus here has been on Third Party Data &#8211; data purchased from data vendors &#8211; there is actually a much wider range of external data that is used by modern enterprises. Such data may be sourced directly by scraping from the Internet. Or it may be the result of surveys that are sent out. There are other modalities too. What is important is that the data is still handled by Data Acquisition. It is striking how Data Acquisition emerged from nowhere over the course of a few years to become such an important part of data management.</p>]]></content:encoded></item><item><title><![CDATA[The 4 AI Narratives: No Wonder We Are Confused]]></title><description><![CDATA[Let's dive into these narratives.]]></description><link>https://anchoringdata.com/p/the-4-ai-narratives-no-wonder-we</link><guid isPermaLink="false">https://anchoringdata.com/p/the-4-ai-narratives-no-wonder-we</guid><dc:creator><![CDATA[Malcolm Chisholm]]></dc:creator><pubDate>Fri, 21 Nov 2025 14:31:19 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!SPpn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d61079-926a-4f0c-bd9b-38494433cf0f_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SPpn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d61079-926a-4f0c-bd9b-38494433cf0f_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!SPpn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d61079-926a-4f0c-bd9b-38494433cf0f_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!SPpn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d61079-926a-4f0c-bd9b-38494433cf0f_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!SPpn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d61079-926a-4f0c-bd9b-38494433cf0f_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!SPpn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d61079-926a-4f0c-bd9b-38494433cf0f_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SPpn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d61079-926a-4f0c-bd9b-38494433cf0f_1536x1024.png" width="478" height="318.7760989010989" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/27d61079-926a-4f0c-bd9b-38494433cf0f_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:478,&quot;bytes&quot;:2256326,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/179487822?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d61079-926a-4f0c-bd9b-38494433cf0f_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SPpn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d61079-926a-4f0c-bd9b-38494433cf0f_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!SPpn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d61079-926a-4f0c-bd9b-38494433cf0f_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!SPpn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d61079-926a-4f0c-bd9b-38494433cf0f_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!SPpn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d61079-926a-4f0c-bd9b-38494433cf0f_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>It is now mid-November 2025, and four narratives seem to be dominating news cycles when it comes to AI. None of these narratives fully aligns with any of the others, which seems to be creating a lot of confusion. Let&#8217;s try to sort through them.</p><h4>The Technical Narrative</h4><p>This is what dominates the &#8220;technical&#8221; community. AI is replicating what we first saw to some extent when the Internet started, and much more clearly when Big Data came to the fore around 2010. A set of new technologies has emerged that are clearly going to be in demand. The &#8220;tech bros&#8221; &#8211; the people who make a living from technologies &#8211; immediately understood that they needed to get AI on their resumes to stay relevant and command greater salaries during the inevitable growth phase of AI adoption.</p><p>These are not only programmer types, but also anyone needed to make AI successful. Management consultants providing strategic advice on AI, learning and development companies teaching AI skills, AI Governance professionals, and a myriad of others have all jumped on the AI bandwagon.</p><p>AI technologies are still evolving rapidly, which requires this broader technical community to keep up with advances. For instance, initially there was Generative AI, and then Agentic AI appeared. All sorts of changes are going on all the time. 
This rate of innovation requires constant attention, which creates its own kind of enthusiasm among practitioners.</p><p>The newsflows from the companies that are developing AI technologies and the hyperscalers bolster the technical narrative massively. Overall, it is very positive and hopeful.</p><h4>The Financial Narrative</h4><p>AI is not just about technology, but also about money, or rather, credit. In the hyperfinancialized US economy the opportunities are seen as not just the technologies themselves but the ability to construct debt-based structures around AI. For instance, the financing of data centers with debt is a rapidly rising concern. CoreWeave, an AI Cloud infrastructure company, lost 50% of its market value in the past month due to doubts about its ability to get the data centers built.</p><p>There is also a big controversy about how to depreciate GPUs that is affecting many players in the AI infrastructure space. The effective lifespan of a GPU is not just a technical question &#8211; it affects the financial models built into the debt-based financing. 
This in turn elevates financial risk, which will inevitably lead to a rise in financing costs &#8211; or worse.</p><p>There is a lot more to the financial narrative, but suffice it to say that it is now very gloomy. There is talk of an &#8220;AI Bubble&#8221; &#8211; meaning a financial bubble that has nothing to do with technology. But you can see how this pessimism gets conflated with the extreme optimism of the technical narrative to cause confusion.</p><h4>The Economic Narrative</h4><p>The economic narrative has changed since the end of 2022, when it first became apparent that new AI technologies were appearing. Initially, it seemed that AI would generate new opportunities for all, but especially the technical community as they would be needed to make it work.</p><p>However, it has become increasingly apparent that there is no single &#8220;killer use case&#8221; for AI that will generate massive economic gains across a broad range of economic sectors. Instead, we hear a lot about efficiency gains. In other words, people will be fired from their jobs and replaced by AI. Such efficiency gains will improve the bottom lines of companies. Something similar happened in the &#8220;downsizing&#8221; wave of the late 1980s, when PCs eliminated the need for secretarial support and a lot of middle managers.</p><p>But the big losers are developers. Based on anecdotes I have heard dating back to 2022, it was developers who were the first target. The legions of developers and the inefficient project management techniques used to develop software were seen as a huge problem by the leaders of technology companies. With AI able to generate code, what required a team of developers can now be done by one very competent person using AI coding assistants.</p><p>More suspiciously, we hear news of large-scale layoffs in many industries blamed on AI. It is suspicious because there is not the wide-scale adoption of AI that would trigger this. 
It seems that AI is an excuse rather than a reason, but it is certainly adding to the economic narrative.</p><p>What is not part of this narrative is my personal observation that AI is being used to address a myriad of very diverse use cases that have little in common. It seems to me that in these use cases AI is doing things that no human does today because it would be economically impossible. I have heard this from others too. However, it is not part of the economic narrative, which remains a mix of enthusiasm and gloom, trending gloomy.</p><h4>The Infrastructure / Commodities Narrative</h4><p>This narrative is definitely out there, but is not quite as prominent as the others.</p><p>AI does not live in isolation, but in data centers. Data centers are infrastructure that needs to be built and operated. The operation requires electricity &#8211; and lots of it. Electricity has to be generated from a source of energy, e.g. oil, natural gas. The electrical generation capacity has to exist, along with the transmission capability. This requires a lot of mechanical parts, like gas turbines. These in turn require commodities like copper, silver, and aluminum, as well as construction equipment and supplies. There is also the entire legal framework like permitting to be considered as data centers are built out. And don&#8217;t forget the engineers to build it all, of which the US produces rather few.</p><p>The part of this narrative that seems to be getting most attention is electrical demand. Ordinary people are seeing their electricity bills rise, and are blaming AI for it. The narrative is also leading a lot of people to ask if we can build all this infrastructure given the lack of a solid manufacturing base, not to mention the timeframes involved. This narrative is making people hostile to AI, or the tech companies purveying it.</p><h4>Too Many Narratives</h4><p>There may be other AI narratives out there in the world (e.g. 
national security, general brain rot, AI-enabled scams), but these four alone are driving confusion. Is AI a good thing or a bad thing? What are the priorities? What are the risks? How will it affect me and my family?</p><p>I am not sure we are going to get a coherent national AI strategy that will rationalize all these narratives. At the moment it seems we are letting evolution take its course, and will all just have to witness the outcome. After that, history can deal with the narratives.</p>]]></content:encoded></item><item><title><![CDATA[Data Literacy Is a Requirement for Generative AI Solutions]]></title><description><![CDATA[Educating organizations about AI Literacy is essential as more and more GenAI projects get the green light.]]></description><link>https://anchoringdata.com/p/data-literacy-is-a-requirement-for</link><guid isPermaLink="false">https://anchoringdata.com/p/data-literacy-is-a-requirement-for</guid><dc:creator><![CDATA[Malcolm Chisholm]]></dc:creator><pubDate>Fri, 14 Nov 2025 14:17:49 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!g5Ta!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F154fdee6-e076-4b6b-a92a-a8495e1a0756_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!g5Ta!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F154fdee6-e076-4b6b-a92a-a8495e1a0756_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!g5Ta!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F154fdee6-e076-4b6b-a92a-a8495e1a0756_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!g5Ta!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F154fdee6-e076-4b6b-a92a-a8495e1a0756_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!g5Ta!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F154fdee6-e076-4b6b-a92a-a8495e1a0756_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!g5Ta!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F154fdee6-e076-4b6b-a92a-a8495e1a0756_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!g5Ta!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F154fdee6-e076-4b6b-a92a-a8495e1a0756_1536x1024.png" width="400" height="266.6666666666667" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/154fdee6-e076-4b6b-a92a-a8495e1a0756_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1536,&quot;resizeWidth&quot;:400,&quot;bytes&quot;:1665452,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/178888887?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7adb29da-381b-4d6e-b43d-f77e893756c3_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!g5Ta!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F154fdee6-e076-4b6b-a92a-a8495e1a0756_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!g5Ta!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F154fdee6-e076-4b6b-a92a-a8495e1a0756_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!g5Ta!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F154fdee6-e076-4b6b-a92a-a8495e1a0756_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!g5Ta!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F154fdee6-e076-4b6b-a92a-a8495e1a0756_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>As Generative AI (GenAI) gets more and more popular, it is crucial for enterprises to begin to properly grasp true GenAI capabilities - especially as the hype and hopes around the technology grow.&nbsp; The Large Language Model (LLM) solutions are certainly breathtaking, but they have significant real-world limitations to be aware of. The biggest such limitation is data.&nbsp;</p><p>Never before have people been able to chat with data and get answers back the way they can now.&nbsp; It is certainly true that educating organizations about AI Literacy is essential as more and more GenAI projects get the green light.&nbsp; AI Literacy is the understanding of the concepts of AI as well as the major capabilities of current AI technologies. 
But the main driver of any Generative AI solution is data, as is clearly shown by the technically complex data pipelines that get built in every GenAI project. Ultimately, an AI solution will only be as good as the data used to train it (we will use &#8220;train&#8221; here in the broader sense, covering context windows, fine-tuning and RAG).&nbsp; Therefore, before we can even engage in AI Literacy, we need to make sure there is an adequate foundation of Data Literacy.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Anchoring Data! This is a <strong>new</strong> publication dedicated to shining a light on knowledge about what data is and how it is managed. <em>Please subscribe and join me in this data journey.</em></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h4>The Need for Data Literacy</h4><p>Data Literacy is similar to AI Literacy. It involves understanding the concepts of data governance and data management. 
But it also includes understanding the data assets of the enterprise, including the important contextual nuances of these assets.</p><p>Let&#8217;s go over one of the most basic scenarios - getting a complex data question answered.&nbsp; Without GenAI, this typically requires finding the right people, looking for the appropriate data assets, reviewing BI reports, holding meetings, etc.&nbsp; This laborious process has historically been the only way for the business to get answers.&nbsp;</p><p>But if an enterprise has a high level of Data Literacy, it can speed things up substantially because it will have a greater understanding of the &#8220;who, what, where, when and how&#8221; of the people, processes and data assets involved.&nbsp; More advanced enterprises will have robust Data Catalogs in place, which act as a road map to the data, and will further optimize the discovery and understanding of data.&nbsp;</p><h4>Human vs. AI</h4><p>The goal of GenAI solutions is to get questions answered automatically, quickly and accurately.&nbsp; The above manual method has one major advantage over an automated search with an AI chatbot.&nbsp; When people talk to other people, they can generally communicate or intuit context easily.&nbsp; In order for an AI Q&amp;A chatbot to function properly, context needs to be explicitly included in the training data.&nbsp; Properly accounting for nuance and context in data is crucial to any GenAI project, as the models should be trained with a specific purpose in mind.&nbsp; Without Data Literacy in place, everything gets very murky very quickly. 
There may be tempting sources of data in SharePoint Folders, Data Lakes, old Data Warehouses, and more, but how can we know if they are a match for the GenAI use case at hand?&nbsp; We need highly Data Literate business professionals who can steer us to the right, accurate, timely, and relevant data for a particular GenAI use case.&nbsp;</p><h4>Why It Matters</h4><p>Unfortunately, if this is not done, then you can expect your GenAI solution to hallucinate and give untrustworthy answers.&nbsp; It will not be the fault of the technology &#8211; it will be the fault of the data that was used. As stated before, context is king, and context will play a big role in the success of GenAI projects. The best way to make context explicit is for a GenAI project to work with Data Literate business professionals, who understand high quality, accurate, timely, and relevant data.&nbsp;</p><p>Once the foundation of Data Literacy has been laid down, and accompanied by a robust Data Catalog, enterprises will have a much higher probability of successful GenAI projects.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share Anchoring Data&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://anchoringdata.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share Anchoring Data</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Data Has Become Reality]]></title><description><![CDATA[It becomes very difficult to accept the notion that data actually is reality, and that this reality dominates modern life.]]></description><link>https://anchoringdata.com/p/data-has-become-reality</link><guid 
isPermaLink="false">https://anchoringdata.com/p/data-has-become-reality</guid><dc:creator><![CDATA[Malcolm Chisholm]]></dc:creator><pubDate>Thu, 06 Nov 2025 18:32:15 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!oaQi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df94a11-09f9-4cd1-a4f9-5eef47dd13cd_3840x2160.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oaQi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df94a11-09f9-4cd1-a4f9-5eef47dd13cd_3840x2160.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oaQi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df94a11-09f9-4cd1-a4f9-5eef47dd13cd_3840x2160.jpeg 424w, https://substackcdn.com/image/fetch/$s_!oaQi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df94a11-09f9-4cd1-a4f9-5eef47dd13cd_3840x2160.jpeg 848w, https://substackcdn.com/image/fetch/$s_!oaQi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df94a11-09f9-4cd1-a4f9-5eef47dd13cd_3840x2160.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!oaQi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df94a11-09f9-4cd1-a4f9-5eef47dd13cd_3840x2160.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!oaQi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df94a11-09f9-4cd1-a4f9-5eef47dd13cd_3840x2160.jpeg" width="532" height="299.25" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9df94a11-09f9-4cd1-a4f9-5eef47dd13cd_3840x2160.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:532,&quot;bytes&quot;:5010938,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/178199141?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df94a11-09f9-4cd1-a4f9-5eef47dd13cd_3840x2160.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oaQi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df94a11-09f9-4cd1-a4f9-5eef47dd13cd_3840x2160.jpeg 424w, https://substackcdn.com/image/fetch/$s_!oaQi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df94a11-09f9-4cd1-a4f9-5eef47dd13cd_3840x2160.jpeg 848w, https://substackcdn.com/image/fetch/$s_!oaQi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df94a11-09f9-4cd1-a4f9-5eef47dd13cd_3840x2160.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!oaQi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df94a11-09f9-4cd1-a4f9-5eef47dd13cd_3840x2160.jpeg 
1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A common view of data is that it is a record of an observed fact. This is the data of science, and what the economic data we are deluged with every day is supposed to be. Because almost everyone thinks that this is what data is, it becomes very difficult to accept the notion that data actually is reality, and that this reality dominates modern life.</p><p>But this is in fact the case. Consider what concerns us every day of our lives: our money, mortgage, insurance policy, bank account, driving license, credit cards, overdrafts, examination results, to name a few. 
These are not records of observed fact in the physical world. They are actual realities.</p><p>If I get an overdraft notice in the mail, and I tear it up, does that mean the overdraft no longer exists? If I lose the plastic driving license from my wallet, does that mean I have lost my driving privileges? Not at all. The bank will happily send me another overdraft notice, and I can reapply to the State of Florida (in my case) for another plastic driving license.</p><h4>The Case of The Bank Account</h4><p>The situation becomes clearer when we think of a bank account. The balance in my bank account is known with infinitely greater accuracy than our best estimate of the speed of light. In fact, it is exact, not accurate. It is data, but it is not data recorded about some physical reality, like the temperature inside my refrigerator. There is no corresponding reality of my bank account. It itself is reality.</p><p>The reality of my bank account is kept intact by the computerized systems that manage my bank account. No human is actually managing my bank account. This differs from the time before computers when clerical staff updated ledger books to manage bank accounts.</p><p>This is possible because the bank and I agree on the rules by which my bank account is managed. Or at least I think I agreed when I signed the contract to open the bank account - albeit without really reading the contract very thoroughly. As for the bank, I had my account opened by one of their employees, who I assume was authorized to represent the bank. I have no idea if this individual was familiar with the exact set of rules by which my account would be managed. Anyway, it was all in the contract &#8211; I think. Furthermore, these rules have all been instantiated in the software environment that manages my bank account in the computerized hardware infrastructure in which it ultimately resides. 
I just hope it is not too buggy.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Hi! I&#8217;m Malcolm Chisholm and this is a <strong>new</strong> publication dedicated to providing relevant information on data. I aim to shine a light on knowledge about what data is and how it is managed with each of my posts. <em>Please subscribe and join me in this data journey.</em></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h4>Philosophical Perspective</h4><p>There have been philosophers who have taken an interest in this kind of thing and tried to explain it. Perhaps the most notable is John Searle in his book &#8220;The Construction of Social Reality&#8221;. For Searle, realities like money are just an extension of biology. He states that symbolism is a biological capacity to make something mean or express something. As someone with a Ph.D. in biology, and who studied under some of the most famous luminaries in the field, I was astonished to learn this. Searle goes on to claim &#8220;there is a continuity in collective behavior between lions attacking a hyena and the Supreme Court making a collective decision&#8221;. Maybe I was asleep when this was being taught. No doubt there are biologists who would support the idea, but it seems difficult to prove scientifically.</p><p>At least Searle tried to provide an explanation of how social reality came to be. 
However, I do not think we necessarily have to wait for such an explanation to appreciate that a major class of data is reality.</p><h4>Data As Reality</h4><p>One surprising aspect of data being reality is that types of things that have no equivalent in physical reality are represented by this data. I worked with Asset Backed Securities, which nearly blew up the global economy in 2008. There is no such thing as an Asset Backed Security in the physical world. It is a bond backed by a pool of receivables such as mortgages or auto loans. There are legal documents for the bond which describe how it is supposed to function. These rules can then be used to develop software to sustain the bond. Features such as a payment priority waterfall are characteristics of the bond. There is no equivalent in the physical universe.</p><h4>What Can Go Wrong</h4><p>We take the physical universe for granted. It works according to metaphysical rules that apply even when things get chaotic, such as in natural disasters. The reality in which data exists differs, however, as the equivalent of metaphysics is whatever humans decide it is.</p><p>For instance, in the real world the first law of metaphysics states that a given entity cannot both have and not have a given attribute in a given relationship at a given time. This is the Law of Noncontradiction. But an Asset Backed Security may have a bond rating of AAA when a significant number of the underlying loans in it are in default. Or the same loans may be pledged not just to one Asset Backed Security, but to many &#8211; something known as rehypothecation.</p><p>So yes, much of the current reality we have to deal with may be constituted as data, but it can go wonky, like the world of Alice in Wonderland where logic falls apart. Sometimes this is due to internal inconsistencies, and at other times it can be due to fraud.</p><p>The pervasiveness of data as reality in the modern world is perhaps the ultimate argument for good data governance. 
However, the whole concept goes against the idea of modernity derived from The Enlightenment, which is that there is only time, space, matter, and energy &#8211; essentially what Searle was positing. So we may need to wait until the time of Modernity has passed to get the true level of data governance we need.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share Anchoring Data&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://anchoringdata.com/?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share Anchoring Data</span></a></p>]]></content:encoded></item><item><title><![CDATA[Understanding Data Quality: The Problem of Understanding Data]]></title><description><![CDATA[We all want good quality in the goods and services we consume, but what does &#8220;quality&#8221; mean when it comes to data?]]></description><link>https://anchoringdata.com/p/understanding-data-quality-the-problem</link><guid isPermaLink="false">https://anchoringdata.com/p/understanding-data-quality-the-problem</guid><dc:creator><![CDATA[Malcolm Chisholm]]></dc:creator><pubDate>Thu, 30 Oct 2025 11:30:52 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!gWsl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34dd636c-9ef7-4df1-927e-ecac83a9a2d1_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gWsl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34dd636c-9ef7-4df1-927e-ecac83a9a2d1_1536x1024.png" data-component-name="Image2ToDOM"><div 
class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gWsl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34dd636c-9ef7-4df1-927e-ecac83a9a2d1_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!gWsl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34dd636c-9ef7-4df1-927e-ecac83a9a2d1_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!gWsl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34dd636c-9ef7-4df1-927e-ecac83a9a2d1_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!gWsl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34dd636c-9ef7-4df1-927e-ecac83a9a2d1_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gWsl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34dd636c-9ef7-4df1-927e-ecac83a9a2d1_1536x1024.png" width="448" height="298.7692307692308" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/34dd636c-9ef7-4df1-927e-ecac83a9a2d1_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:448,&quot;bytes&quot;:3491609,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/177526108?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34dd636c-9ef7-4df1-927e-ecac83a9a2d1_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gWsl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34dd636c-9ef7-4df1-927e-ecac83a9a2d1_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!gWsl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34dd636c-9ef7-4df1-927e-ecac83a9a2d1_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!gWsl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34dd636c-9ef7-4df1-927e-ecac83a9a2d1_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!gWsl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34dd636c-9ef7-4df1-927e-ecac83a9a2d1_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is a very broad topic and Anchoring Data will be providing posts on different aspects of data quality going forward. It is also a very important topic. We all make decisions based on data in our personal and professional lives. In doing so we are consciously or unconsciously judging &#8211; or merely accepting - the quality of the data involved. Data quality has a direct impact on our lives and like it or not we have to be concerned about it.</p><p>Now, you might think that &#8220;data quality&#8221; simply means the data is right and not wrong, and while that is a valid point, data quality involves a lot of other factors too. In fact, data gets blamed for being of poor quality when it actually is not. 
One of the main reasons this happens is that people do not understand the data, or are not able to understand the data, they are trying to use. They will typically say the data is &#8220;bad&#8221;. Such statements eventually get back to the technical administrators of data who may then try to correct nonexistent problems in the data. It all adds to the &#8220;data mess&#8221; we see in many organizations.</p><h4>Quality vs. Understanding</h4><p>Let us suppose there is such a thing as &#8220;perfect&#8221; data. It is quite possible to have access to perfect data but not understand what the data actually represents. Trying to use the data without understanding it will almost certainly create problems due to misapplication of the data.</p><p>Of course, there probably is no such thing as &#8220;perfect&#8221; data, and the situation is even worse with the imperfect data in the real world.</p><p>A good example of this is the US Bureau of Labor Statistics, which publishes two estimates of employment every month. One, the Establishment Survey, counts a person once for each job they hold, so someone with two jobs is counted twice. The other, the Household Survey, just counts a person once if they are employed.</p><p>The Establishment Survey is the &#8220;Headline Jobs Number&#8221; that appears every month in the media and causes the Stock Market to go up or down. It is also used by a lot of financial analysts. Yet all these people seem blissfully unaware that the Establishment Survey is overstating the number of people who are employed.</p><p>It seems fairly obvious that if someone is making decisions based on data like the Establishment Survey they should be aware of what the data means. 
And yet, it seems very often the data is just taken at face value.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Hi! I&#8217;m Malcolm Chisholm and this is a <strong>new</strong> publication dedicated to shining a light on knowledge about what data is and how it is managed. <em>Please subscribe and join me in this data journey.</em></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h4>The Role of Definitions</h4><p>All this is quite easy to say, but in practice it can be very difficult. How do you go about understanding data? Well, data should be defined, but alas a lot of data is not defined, or has very poor definitions.</p><p>Technical people, like programmers, do not enjoy creating documentation, and quite frankly are often not good at it. There used to be technical writers who would write data definitions, but these positions have largely been eliminated in recent years. There are more technical roles, like data modelers who design databases, and who also capture data definitions. However, their definitions are often in specialized tools that are inaccessible to ordinary users.</p><p>Furthermore, we cannot expect too much from a definition. A definition is important as it is an explanation of the essence of something &#8211; for data this would be what it means to non-technical businesspeople. Other information is usually missing, e.g. 
if a data element is calculated, the actual calculation is often not in the definition.</p><p>Even more alarming is that for data, the definition can be what the data is supposed to be, not what it actually is. For instance, the data element &#8220;Address Line 3&#8221; might be defined as &#8220;To be used for the third line of a street address, but not the city, higher geopolitical unit, or post code of the address&#8221; and in reality be used to house fax numbers. Such things can happen with data.</p><h4>What&#8217;s In the Data?</h4><p>The philosophers tell us that definitions give us intension but not extension. What is &#8220;extension&#8221;? It is the range of instances that go into data. Suppose we have a database table that contains information about Employees. On closer inspection we find that this table contains US Employees but excludes employees in Puerto Rico, Guam, and the Marshall Islands. We also find that the table only gets updated every 6 months.</p><p>In this example we may have a perfect definition of what an employee is, but the &#8220;extension&#8221; of the table seems to have limitations that could be important. Unfortunately, people who use data while looking only at the definitions can get a false sense of security from them.</p><h4>Where Did the Data Come From?</h4><p>Yet another issue that is becoming increasingly important is the provenance of the data. A good deal of data is now purchased by organizations, or scraped from the Internet by them. Years ago, the only data most organizations had was what they produced from their operational systems.</p><p>Obviously, some sources of data are going to be more reliable than others. Many data vendors have established reputations that they want to preserve and so they are going to be careful about the quality of the data they sell. 
Other sources could be a lot dodgier, and we now have the prospect of datasets assembled by AI, which might hallucinate data into existence.</p><p>Knowing something of where the data comes from is very likely to affect how it will be used, but, again, this is largely ignored and the data is just taken at face value.</p><h4>Don&#8217;t Blame Data Quality if You Don&#8217;t Understand the Data</h4><p>Alas, people will often not bother to make the effort to get the level of understanding they need to use the data. When things go wrong &#8211; which they almost certainly will &#8211; these people are not going to blame themselves. Instead, they are going to blame the data and say it was of poor quality.</p><p>If anyone has a need to use data, they should think through the assumptions they are making about the data and its intended use, and understand the data to the level that is needed. This requires time and effort, and might be difficult. Maybe it is not even possible to fully understand the data to the level required. 
But at least an informed decision can be made about whether to accept the risk or not.</p><p>In the end, understanding the data we are using is a personal responsibility, and it is disingenuous to blame the data if we ignore this responsibility.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/p/understanding-data-quality-the-problem?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://anchoringdata.com/p/understanding-data-quality-the-problem?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Are You Data?]]></title><description><![CDATA[Is Data Part Of What We Are Made Of?]]></description><link>https://anchoringdata.com/p/are-you-data</link><guid isPermaLink="false">https://anchoringdata.com/p/are-you-data</guid><dc:creator><![CDATA[Malcolm Chisholm]]></dc:creator><pubDate>Thu, 23 Oct 2025 12:02:40 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!8am2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa697a612-5c4b-41e3-bc11-861fd469b02d_3840x2160.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Have you ever wondered if you are data? Probably not, but despite this sounding like a ridiculous question there are in fact people who indirectly think they really are data. 
Let&#8217;s explore why this is.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8am2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa697a612-5c4b-41e3-bc11-861fd469b02d_3840x2160.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8am2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa697a612-5c4b-41e3-bc11-861fd469b02d_3840x2160.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8am2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa697a612-5c4b-41e3-bc11-861fd469b02d_3840x2160.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8am2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa697a612-5c4b-41e3-bc11-861fd469b02d_3840x2160.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8am2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa697a612-5c4b-41e3-bc11-861fd469b02d_3840x2160.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8am2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa697a612-5c4b-41e3-bc11-861fd469b02d_3840x2160.jpeg" width="502" height="282.375" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a697a612-5c4b-41e3-bc11-861fd469b02d_3840x2160.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:502,&quot;bytes&quot;:2015286,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/176863257?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa697a612-5c4b-41e3-bc11-861fd469b02d_3840x2160.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8am2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa697a612-5c4b-41e3-bc11-861fd469b02d_3840x2160.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8am2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa697a612-5c4b-41e3-bc11-861fd469b02d_3840x2160.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8am2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa697a612-5c4b-41e3-bc11-861fd469b02d_3840x2160.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8am2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa697a612-5c4b-41e3-bc11-861fd469b02d_3840x2160.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>What is Data?</h4><p>If anyone is to be reduced to data, we should first have a good idea of what data is. Data has the following properties:</p><p>1. It is stored in a physical location.</p><p>2. It is composed of symbols, where a symbol is a sign that by convention represents something else.</p><p>3. Data exists to be processed. There has to be some kind of mechanism that uses the data as an input and produces an output. The nature of the output depends on the nature of the data that is input and the processing mechanism. 
There may be additional properties, but these seem to be enough to establish a distinction between data and everything else.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Hi! I&#8217;m Malcolm Chisholm and this is a <strong>new</strong> publication dedicated to providing relevant information on data for the widest possible audience (Yes, for everyone). I aim to shine a light on knowledge about what data is and how it is managed with each of my posts. <em>Please subscribe and join me in this data journey.</em></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h4>Is DNA Data?</h4><p>This brings us to the question of what DNA (deoxyribonucleic acid) is. In a biochemical sense it is an acid composed of an alternating sugar (deoxyribose) phosphate &#8220;backbone&#8221; with nitrogenous bases of four kinds attached to every repeating sugar-phosphate monomer. DNA exists as two paired antiparallel strands in normal cells, forming a double helix. I could go on, but the biochemical detail is irrelevant here.</p><p>Just as the meaningful content of a book cannot be reduced to paper and printer&#8217;s ink, neither can the information content of DNA be reduced to a heap of biochemical jargon. Let&#8217;s look at how it lines up with data&#8217;s properties.</p><p>1. DNA is physical and it stores information.</p><p>2. This one is a bit trickier.</p><p>a. DNA is composed of symbols. These are sequences of 3 nucleotides. 
Each sequence represents an amino acid, or controls processing. With four kinds of base, there are 64 (4 x 4 x 4) possible sequences, and 20 amino acids that are normally found in living organisms.</p><p>b. There is a convention for which amino acid (or which processing signal) each 3 nucleotide sequence represents, and it seems to hold, with a few variations, across all living organisms. This is called the &#8220;genetic code&#8221;.</p><p>3. DNA&#8217;s information content gets processed in protein synthesis. A sequence of DNA is copied as RNA, which preserves the information content. The RNA is then used to generate a protein &#8211; a string of amino acids &#8211; by the cellular machinery that exists for this purpose. So, data is input to the process, and a protein is the output.</p><p>From this, we can conclude that DNA holds data in the same sense that the solid state drive on my PC holds data.</p><h4>Genotype vs. Phenotype</h4><p>The &#8220;genotype&#8221; is the entire set of genetic information (data) that an organism possesses. The &#8220;phenotype&#8221; is the actual individual organism as it appears. Think of the genotype as the recipe for an omelette and the phenotype as the omelette on your plate.</p><p>Which brings us to popular conceptions of DNA. It is now quite common for organizations to say &#8220;It&#8217;s in our DNA&#8221;, which is somewhat weird as organizations do not have DNA. This seems to reflect a popular belief that people are wholly or largely the product of their DNA &#8211; the product of their data.</p><p>However justified or not this belief is, it seems to be widely held. Indeed, the popular conception of DNA elevates it to the level of magic pixie dust. There are countless science fiction stories that revolve around DNA, with DNA producing rather outlandish scenarios. In the realm of reality, DNA is used to establish identity in legal matters, and to establish a predisposition to particular diseases. 
Other popular beliefs exist, some of them quite toxic, such as DNA being responsible for determining intelligence levels.</p><h4>But Is It So?</h4><p>If we accept that DNA is data, then we can perhaps get a better appreciation of it by looking at it from a data perspective.</p><p>Data exists to be processed. The cellular machinery that processes DNA is incredibly complex. Once the outputs are produced, they enter into metabolic pathways that are also complex and eventually lead to a final end state. Even this is a gross simplification as there are yet other biological processes that are involved.</p><p>Data in a computer system is similar. Think of a payroll system that begins with data about an employee which gets processed by the system to finally end up with funds deposited in the employee&#8217;s bank account, as well as a number of deductions that get sent to other bank accounts. The data about the employee is not solely responsible for what happens.</p><p>Similarly, it is not as if we have a gene for &#8220;X&#8221; and &#8220;X&#8221; just happens to us. There must be specific metabolic pathways (and more) to actually produce &#8220;X&#8221;. I can add new data elements to my dataset for the employee payroll, but nothing will be done with them unless I add programming logic to process them.</p><h4>How Human Is DNA?</h4><p>Some data is generic in the sense that it is widely used in many different applications. A company will have many uses for its customer data and product data. This seems to have a parallel in DNA. By early genome-sequencing estimates, humans have about 30,000 genes that code for proteins, as do mice, but only about 300 of these are uniquely human &#8211; the rest (about 29,700) have close counterparts in mice.</p><p>However, there is a lot of controversy around this point. Only a small portion of the DNA in the human genome codes for proteins. 
The remaining &#8220;junk&#8221; DNA does seem to have some functionality, like regulating gene expression, but the picture is not as clear as it is for protein-coding genes.</p><p>But maybe DNA is old hat. Dr Michael Levin&#8217;s research team at Tufts University has found that the electrical fields of cells may be just as important. These electrical fields form something like collective cellular networks which exhibit a form of proto-cognition &#8211; a distributed intelligence that allows tissues to make decisions about shape, repair, and function. It is difficult to see how this could function without data, but any data involved almost certainly cannot be in DNA.</p><h4>So, Are You Data?</h4><p>Yes, we do have a data component &#8211; DNA &#8211; but it is only part of what makes each one of us what we are. And it may be a rather mundane component, only responsible for enzymes and structural proteins. In the end, nothing can be just data because data by itself is simply inert.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share Anchoring Data&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://anchoringdata.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share Anchoring Data</span></a></p><h2></h2>]]></content:encoded></item><item><title><![CDATA[Why Data Cannot be Understood Scientifically]]></title><description><![CDATA[If data is so important to us, then we need to be able to understand it.]]></description><link>https://anchoringdata.com/p/why-data-cannot-be-understood-scientifically</link><guid isPermaLink="false">https://anchoringdata.com/p/why-data-cannot-be-understood-scientifically</guid><dc:creator><![CDATA[Malcolm Chisholm]]></dc:creator><pubDate>Thu, 16 Oct 2025 
13:01:26 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!nblH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05b14c28-9774-4b88-910f-5dcd1ebe4fb6_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If data is so important to us, then we need to be able to understand it. The modern prejudice is that things can only be understood scientifically. But is this really true for data?</p><p>Before we get into the topic, I feel I should at least provide some scientific credentials. I studied Zoology at Oxford University &#8211; the whole range, from molecular genetics to field ecology. I had the privilege of being personally tutored by H. B. D. (&#8220;Bernard&#8221;) Kettlewell, who showed selective predation on Peppered Moth polymorphs. I have handled specimens of a Dodo and a Tasmanian Wolf, and was taught to program by Richard Dawkins. During my vacations I worked in research stations on projects devoted to terrestrial ecology and freshwater biology. 
My Ph.D., from Bristol University, is in experimental field ecology trying to assess the level of competition within populations of insect species.</p><p>Now back to data.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nblH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05b14c28-9774-4b88-910f-5dcd1ebe4fb6_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nblH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05b14c28-9774-4b88-910f-5dcd1ebe4fb6_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!nblH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05b14c28-9774-4b88-910f-5dcd1ebe4fb6_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!nblH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05b14c28-9774-4b88-910f-5dcd1ebe4fb6_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!nblH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05b14c28-9774-4b88-910f-5dcd1ebe4fb6_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nblH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05b14c28-9774-4b88-910f-5dcd1ebe4fb6_1536x1024.png" width="486" height="324.1112637362637" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/05b14c28-9774-4b88-910f-5dcd1ebe4fb6_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:486,&quot;bytes&quot;:3867350,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/176275895?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05b14c28-9774-4b88-910f-5dcd1ebe4fb6_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nblH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05b14c28-9774-4b88-910f-5dcd1ebe4fb6_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!nblH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05b14c28-9774-4b88-910f-5dcd1ebe4fb6_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!nblH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05b14c28-9774-4b88-910f-5dcd1ebe4fb6_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!nblH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05b14c28-9774-4b88-910f-5dcd1ebe4fb6_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>The Popular View of Data</h4><p>In the popular culture, to the extent that data is thought about at all, it seems to be as something that is &#8220;scientific&#8221;. Maybe it is indirectly so, because data is housed in technology, and technology is ultimately reducible to science. Thinking about it from another angle, in recent years many people have said that they are &#8220;data-driven&#8221;. Science is all supposed to be based on data, and so &#8220;data-driven&#8221; individuals can rise above reproach by appearing to align to science &#8211; supposedly the ultimate source of all truth. Or it was, as trust in &#8220;science&#8221; now seems to be declining.</p><p>What does it mean for something like data to be scientific? 
It would mean that the peculiarities of individual items of data should be explainable by what we know about data as a class. If I find a tick on my skin while walking in the countryside, I know that it is likely to suck my blood because I know all ticks suck blood. I can predict the behavior of the individual from my understanding of the behavior of the class.</p><p>The common conception appears to be that the same applies to data. The properties and behaviors of data as a class should apply to all the individual instances of data. And the general public seems to expect that there are experts, whether scientists or technologists, who have this knowledge, which in turn enables them to manage data successfully. Of course, no expert is ever going to dispel this illusion, and no ordinary person wants to be stuck with extra work to manage data. Leave that to the specialists.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Hi! I&#8217;m Malcolm Chisholm and this is a <strong>new</strong> publication dedicated to providing relevant information on data for the widest possible audience (Yes, for everyone). I aim to shine a light on knowledge about what data is and how it is managed with each of my posts. Please subscribe and join me in this data journey.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h4>The Problem of &#8220;Why&#8221;</h4><p>Unfortunately, it is not that simple. 
Let me give you an example. Many years ago I was asked to redesign a database table that housed information about financial instruments &#8211; stocks, bonds, and derivatives. Each record in the table had an Identifier &#8211; a database column that was unique for each record (so &#8220;identified&#8221; it), and was assigned automatically by the system managing the table. When I looked closely at this Identifier I saw it had a rather strange pattern. The Identifier was made up of 8 digits. The first three seemed to be random, and the following five digits seemed to be sequential numbers.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1YVJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2431421f-4891-4383-9287-3cc6e2299119_432x238.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1YVJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2431421f-4891-4383-9287-3cc6e2299119_432x238.png 424w, https://substackcdn.com/image/fetch/$s_!1YVJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2431421f-4891-4383-9287-3cc6e2299119_432x238.png 848w, https://substackcdn.com/image/fetch/$s_!1YVJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2431421f-4891-4383-9287-3cc6e2299119_432x238.png 1272w, https://substackcdn.com/image/fetch/$s_!1YVJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2431421f-4891-4383-9287-3cc6e2299119_432x238.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!1YVJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2431421f-4891-4383-9287-3cc6e2299119_432x238.png" width="230" height="126.71296296296296" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2431421f-4891-4383-9287-3cc6e2299119_432x238.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:238,&quot;width&quot;:432,&quot;resizeWidth&quot;:230,&quot;bytes&quot;:23209,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/176275895?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2431421f-4891-4383-9287-3cc6e2299119_432x238.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1YVJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2431421f-4891-4383-9287-3cc6e2299119_432x238.png 424w, https://substackcdn.com/image/fetch/$s_!1YVJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2431421f-4891-4383-9287-3cc6e2299119_432x238.png 848w, https://substackcdn.com/image/fetch/$s_!1YVJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2431421f-4891-4383-9287-3cc6e2299119_432x238.png 1272w, https://substackcdn.com/image/fetch/$s_!1YVJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2431421f-4891-4383-9287-3cc6e2299119_432x238.png 1456w" 
sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Because my task was to redesign this table, I needed to understand whether this strange pattern in the Identifier was important or not. I looked at all the documentation that had been created for the system and its database but could find nothing. At this point I knew I was going to be forced to replicate the pattern in the new database table I was designing. I could not prove that it was irrelevant, so I would have to keep it.</p><h4>A Shocking Discovery</h4><p>Then I had an idea. I started to ask all the &#8220;old timers&#8221; in the company if they knew anything about the pattern in the Financial Instrument Identifier. Eventually, I found someone who had been part of the team that had designed the table many years before.</p><p>He told me that originally each Financial Instrument record was simply assigned a sequential number as its Identifier &#8211; 1, 2, 3, and so on for each record. However, the database software used the Identifier to calculate the physical location of the data on the hard disk on which that data was stored. It did this in such a way that the value of the Identifier directly generated an address on disk, so records with Identifiers of 1, 2, 3, and so on were written onto disk physically adjacent to each other. As a result, all the data was crowded into just one place on the disk. The read/write head used to access the data would be positioned over this area of the disk all the time &#8211; and the heat it generated would burn out the disk.</p><p>To avoid this, the first 3 digits of the Identifier were changed to be random. Now the data was spread out all over the disk, so it never burned out.</p><p>I asked when this was done and was told it was 1992. 
I then asked why this problem was never fixed in the database software, and was told &#8220;Oh, but it was fixed within a year, but we had no reason to go back and undo the random assignment of the first 3 digits of Identifier&#8221;.</p><h4>You May Never Know &#8220;Why&#8221;</h4><p>This episode has haunted me until the present day. It means that how data is designed and managed may be the result of decisions made for reasons that are now completely unknown, and are actually unknowable if they were never documented, or the documentation is lost, or nobody can be contacted who remembers what happened. Also, the reasons for a decision may make no sense in the modern world. Hard disk drives were dominant in 1992, but today solid state drives are much more common. They cannot get burnt out by a read/write head because they do not have read/write heads.</p><h4>&#8220;Science&#8221; Does Not Apply</h4><p>Which brings us back to the point that data cannot always be fully understood through a scientific approach. As in the Financial Instrument example, data may have been carried through different hardware upgrades and now be in an environment quite different to its original environment. But the design of that original environment may have significant impacts on data that persist to this day. Data simply cannot be understood just by inspecting the data in its current technical setting.</p><p>I am not saying that data can never be understood using a scientific approach. There may be many cases where this is quite possible. But I think I have shown that it is not always possible.</p><h4>Does It Matter?</h4><p>It does, because existing data gets used for new purposes, and is often migrated through new generations of technology. The idiosyncrasies present in the original data may affect new ways in which it is used. 
Further, when data is migrated to a more modern technology, the idiosyncrasies are not removed, but are retained in the new environment because nobody knows why they exist, and everyone is too afraid to remove them.</p><p>The result is not sudden catastrophic failures &#8211; though that can happen - but suboptimal outcomes, a lot of additional effort required, and above all a slow sclerosis over time that impacts an organization&#8217;s ability to adapt and change. This runs counter to the notion that technology is always improving things, and we experience continual &#8220;progress&#8221;.</p><p>Can or will anything be done about this? It is very unlikely. Documentation of design decisions and technical environments seems to be at a low ebb today, and there seems to be a growing expectation that AI can do it for us, so why bother. The delusion that data can be fully understood scientifically is likely to persist.</p>]]></content:encoded></item><item><title><![CDATA[How Data Became Everything]]></title><description><![CDATA[A Journey from Obscurity to Global Dominance, and Why the Journey Matters]]></description><link>https://anchoringdata.com/p/how-data-became-everything</link><guid isPermaLink="false">https://anchoringdata.com/p/how-data-became-everything</guid><dc:creator><![CDATA[Malcolm Chisholm]]></dc:creator><pubDate>Thu, 09 Oct 2025 20:53:05 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!PsFp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2450553-2ae2-44f8-8fb7-da49a85afbfe_1280x853.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PsFp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2450553-2ae2-44f8-8fb7-da49a85afbfe_1280x853.jpeg" 
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PsFp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2450553-2ae2-44f8-8fb7-da49a85afbfe_1280x853.jpeg 424w, https://substackcdn.com/image/fetch/$s_!PsFp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2450553-2ae2-44f8-8fb7-da49a85afbfe_1280x853.jpeg 848w, https://substackcdn.com/image/fetch/$s_!PsFp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2450553-2ae2-44f8-8fb7-da49a85afbfe_1280x853.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!PsFp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2450553-2ae2-44f8-8fb7-da49a85afbfe_1280x853.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PsFp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2450553-2ae2-44f8-8fb7-da49a85afbfe_1280x853.jpeg" width="514" height="342.5328125" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f2450553-2ae2-44f8-8fb7-da49a85afbfe_1280x853.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:853,&quot;width&quot;:1280,&quot;resizeWidth&quot;:514,&quot;bytes&quot;:416898,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://malcolmchisholm2.substack.com/i/174863328?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2450553-2ae2-44f8-8fb7-da49a85afbfe_1280x853.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PsFp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2450553-2ae2-44f8-8fb7-da49a85afbfe_1280x853.jpeg 424w, https://substackcdn.com/image/fetch/$s_!PsFp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2450553-2ae2-44f8-8fb7-da49a85afbfe_1280x853.jpeg 848w, https://substackcdn.com/image/fetch/$s_!PsFp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2450553-2ae2-44f8-8fb7-da49a85afbfe_1280x853.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!PsFp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2450553-2ae2-44f8-8fb7-da49a85afbfe_1280x853.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Over a period of seventy years, data rose from being viewed as irrelevant, indeed despised, to the most valuable economic resource apart from energy. In fact, much of the transition occurred in just the 35 years between 1990 and 2025. </p><p>How and why did this happen? For those of us who lived it, who were always proponents of data, it is an outcome far beyond anything we could have dreamt of. Let us try to understand what happened.</p><h3>In The Beginning</h3><p>We will stipulate that &#8220;data&#8221; is the stored representation of facts held in computerized systems. This is not a broad enough definition, but it will suffice for the present discussion. </p><p>There was a time when data, using this definition, did not exist for normal economic affairs. 
Record-keeping &#8211; the tracking of economic things and events &#8211; was done using paper and ink in ledger books. Clerks, sometimes called &#8220;computers&#8221;, performed the record-keeping. This state of affairs persisted for centuries until 1965.</p><p>It was in 1965 that the IBM System/360 range of computers became widely available to organizations. There had been computers before that, but they had mostly been confined to defense, intelligence, and institutional research. </p><p>The new computers could be purchased or rented. Renting was known as &#8220;time sharing&#8221;. A company that could not afford to buy a computer could rent time on a computer used by another organization when the latter was not using it. Time sharing quickly and vastly expanded the range of businesses able to use computers.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Hi! I&#8217;m Malcolm and this is a <strong>new</strong> publication dedicated to providing relevant information on data for the widest possible audience (Yes, for everyone). I aim to shine a light on knowledge about what data is and how it is managed with each of my posts. Please subscribe and join me in this data journey.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h3>The Programming Revolution</h3><p>The new computers were used to automate hitherto manual processes of record-keeping. 
For instance, updating a bank&#8217;s ledger book when a customer came into a branch to withdraw or deposit money was replaced by a program. The program was a set of instructions for how to update the now electronic record &#8211; the data &#8211; about the customer&#8217;s banking activities. </p><p>Once a program was successfully written it could be executed any number of times and would perform the same set of steps each time. A human clerk could make mistakes, but a fully debugged program would not. If a bank based on manual processes expanded its customer base it would need to hire more human clerks to keep up, but if it used computers it could scale up much more cheaply with more computing power.</p><h3>The Age of Operational Systems</h3><p>The incontrovertible economic advantages of shifting from manual processing to computerized automation unleashed a massive shift in which nearly every clerical process was moved into computerized architecture. This was the paradigm while mainframes were the dominant architecture from 1965 to 1982. </p><p>Where was data in all of this? Data was certainly recognized as being vital. In fact, the &#8220;Technology&#8221; or &#8220;Information Technology&#8221; departments of those days were often called &#8220;Data Processing&#8221;. But the data was simple and viewed as what was necessary to run the previous manual operations. Each manual process was converted to a computer system in isolation from all the others. Its data was its data. There was no interaction with any other data or any other system. Today this is known as a &#8220;siloed&#8221; architecture, and was called &#8220;stovepipes&#8221; in the past. </p><p>The siloed systems made data very uninteresting. There was not much to think about other than what the system needed in terms of data.</p><h3>The Rise of Databases</h3><p>And yet there were people who thought about data. 
They realized that although data was held as individual files, some of this data was the same in different files, and some data had relationships to different data in other files. This gave rise to the notion of a &#8220;database&#8221;, a set of related data all stored in one place. A database would be more efficient, and allow more complex processing more easily than the standalone files that were in use up to that point.</p><p>It was all very nice in theory, but it was just theory. Eventually, however, relational databases emerged. As with computer hardware, these tended to be developed for defense and intelligence, then emerged into the mainstream economy, with financial services being early adopters. There were database architectures other than relational databases, but relational databases became dominant, so we will focus on them.</p><h3>Data Steps Out on Its Own</h3><p>Relational databases were complex enough to need to be designed well. Designing a data store was something new. E. F. (&#8220;Ted&#8221;) Codd, who contributed a great deal to the theory of relational databases, came up with rules of good relational database design for update processing. Later in the 1970&#8217;s Peter Chen developed a visual notation for what Codd had formulated, so a standard graphical representation of a database design became possible. This visible representation of the components of a database and their relationships was revolutionary. Purely logical, abstract, immaterial data structures could be much more easily managed by humans.</p><p>For the first time data was seen as something in its own right, rather than a byproduct of automation. Costly mistakes in database design quickly persuaded businesses to invest in data modeling &#8211; the design of databases. Data as a discipline had been established.</p><h3>Technology Advances</h3><p>In 1982 personal computers (&#8220;PC&#8217;s&#8221;) emerged. They spread like wildfire, throughout organizations of all kinds. 
The turbulence of the period from 1982 to 1987 saw many changes, including the adoption of relational databases in PC environments, especially the housing of databases on dedicated servers &#8211; specialized hardware dedicated to databases.</p><p>But by now there were very few manual processes that remained to be automated. That had been done, so attention shifted to improving what was in place and upgrading it to keep in sync with changes in the business. A computer program or a database structure very much reflects a need at a point in time. As a business evolves, so programs and databases need to evolve with it. This is actually far more difficult than creating another point-in-time solution from scratch, which is why we tend to see systems and databases being created anew rather than changed.</p><p>While this was going on, some people began to think of data as an asset that could be used outside of automated operations. During the late 1980&#8217;s and especially in the 1990&#8217;s this viewpoint slowly gained ground. The concept of data warehouses emerged. These were database environments that took data from automated systems, transformed it, integrated it, and organized the historical aspects of it. Data warehouses were strictly for analyzing and reporting data, and performed no process automation. Bill Inmon and Ralph Kimball contributed ideas to specialized design patterns for data warehouses that were widely adopted.</p><p>But it was not just data per se. New technologies emerged that supported data warehouses. These technologies made it easier to move data, to detect data quality issues, and to process historical changes in data.</p><h3>The Tide Had Turned</h3><p>The focus was now on using data from operational systems to manage organizations. One example is Sam Walton of Walmart who pioneered this approach beginning in the 1970&#8217;s, using data to understand customer behavior and inventory movement. 
The results were self-evident with Walmart growing from a small Arkansas retailer to one of the world&#8217;s largest companies.</p><p>The realization that data was not just a byproduct of automation, but could contain golden nuggets of information that might improve revenue or reduce expenses took hold in the 1990&#8217;s. Data-centricity was on the rise, and process-centricity was declining. But much more was yet to come.</p><h3>The Internet</h3><p>The major technological innovation of the 1990&#8217;s was the Internet. It coincided with the understanding of data as a strategic asset, and had a profound impact &#8211; slowly at first, but quickly accelerating.</p><p>Organizations found that they could interact with data outside of their proprietary technological environments. They could even reach individual consumers via the Internet. This was something new and strange that had never happened before. The volumes of data available via the Internet also grew rapidly, which was also unexpected. Much of this data was &#8220;unstructured data&#8221;, such as documents, images, video, and audio. It had not been used very much in traditional data processing, but it suddenly became important.</p><p>And then of course the excesses happened. Vast amounts of capital were fed into poorly thought-out ventures that had a vague plan for revolutionary disruption of some market or other. This capital allocation alone frightened the executives of big traditional well-established businesses, and they started pouring money into internal Internet projects, hoping to survive what they thought would be a massive onslaught from dot.com startups.</p><p>Few people really understood what was happening, and many believed the ridiculous value propositions of the ill-conceived startups, but in March 2000 it all came to an end with the NASDAQ market crash. 
The excesses were cleared out, and the hitherto oppressed &#8220;bricks and mortar&#8221; crowd exacted a terrible revenge, punishing the IT departments of their enterprises for all the money they had wasted.</p><h3>The Dark Age</h3><p>A real Dark Age followed from 2000 to 2005, with the events of 9/11 only adding to it. On the surface it seemed that everything had gone back to a much leaner version of the early 1990&#8217;s, with an acceptance of the importance of data, but little appetite to make any bold claims about it.</p><p>And yet, under the surface things were developing. There was a realization that the Internet was more than the &#8220;dot.com bust&#8221; and that there were now very successful companies, such as Google and Amazon, that had data at the heart of their business models. This was something totally new. Another realization was that data was its own thing that had its own special problems, needs, questions, solutions, and body of expertise. Data was not technology. Multiple data warehouse projects had flawless technology but failed because of problems in the data that were either not understood or ignored.</p><p>This came to a head in 2005 when the new discipline of Data Governance burst onto the scene. It was now understood that if data was a valuable resource it had to be managed like one. Just as Human Resources departments oversaw the management of people in an enterprise, so Data Governance was to do the same for data.</p><h3>Big Data and The Cloud</h3><p>A big boost for Data Governance came with the 2008 Global Financial Crisis. Governments realized that the data they had been using for regulatory and economic affairs was flawed. Despite vast amounts of money being spent by governments, none of them had been able to predict the GFC. Data now came under intense regulatory scrutiny. Exactly what that achieved is open to debate, but it did raise the profile of data in governmental circles. 
The perception of the importance of data gained even more ground.</p><p>Then another technological revolution happened. Echoing the rise of relational databases, new architectures for managing ultra-large-scale datasets emerged. The needs of Defense and Intelligence to quickly process vast amounts of signals intelligence could not be met by traditional relational databases. So instead, completely new architectures were created using &#8220;junk hardware&#8221; that was cheap and could fail, but could easily be replaced. Parallel processing with built-in redundancy meant tasks could keep running and data would not be lost when junk hardware failed. The new architectures could scale simply by adding more hardware in a way not possible before. New kinds of databases held data in designs that were highly optimized for querying. Google and their kin also adopted and popularized this new paradigm that became known as &#8220;Big Data&#8221;.</p><p>The original idea of &#8220;Big Data&#8221; was that it could be used for ultra-large-scale datasets, at the petabyte level. But the new technologies involved were very appealing, even for much smaller amounts of data. The experience with the Internet, combined with more and more sophisticated analytics, meant that traditional data warehouses were no longer adequate. A data warehouse is built to satisfy known queries against data, and could take years to develop. But what happened when analytics were unpredictable, and needed to provide results in short timeframes?</p><p>The result was data lakes. These were environments that used the Big Data approach and technology. They could quickly ingest any dataset and integrate it with other data to be used in analytics based on tools that permitted rapid insights. </p><p>Connectivity to data lake environments was increasingly via the Internet. The Internet is often represented as a cloud, and this paradigm came to be called &#8220;Cloud Computing&#8221;. 
Companies found that it was cheaper to rent data storage and processing capacity from providers than to continue with their own highly expensive data centers.</p><p>At first it was difficult to move to the Cloud, but it got easier and by 2015 the vast migration was well under way. Organizations could now manage much more data and many more kinds of data, and perform new kinds of analytics, than they had been able to before. Data-centricity was firmly established. The number of organizations with data at the center of their business models rapidly expanded. In 2017 The Economist magazine declared that data was the &#8220;New Oil&#8221; &#8211; the world&#8217;s most valuable resource.</p><h3>Data Privacy</h3><p>Throughout all this time, data was not something that mattered to ordinary people. It sounded academic, or corporate, and not part of their normal lives. That began to change in 2013 when Edward Snowden leaked many classified NSA documents. Max Schrems, a law student in Austria, then sued Facebook in Ireland, complaining that Snowden&#8217;s leak showed that Schrems&#8217; personal data was being shared by Facebook with the NSA. Transfer of personal data from the European Union to the US had been covered by a Safe Harbor arrangement up to this point. In 2015 the European Court of Justice found in favor of Schrems, and the Safe Harbor provisions were invalidated.</p><p>The EU then came up with the General Data Protection Regulation (GDPR). This was a landmark. For a long time there had been a very tiny minority of professionals who were concerned about the management of personal information, but they had generally been ignored. Now data privacy was front and center, and individual people quickly became aware of its importance. 
Data &#8211; or at least personal data &#8211; became a major concern for ordinary people.</p><p>More legislation followed, like the California Consumer Privacy Act, and commercial enterprises were forced to greatly improve the ways in which they handled personal data.</p><h3>The Rise of Artificial Intelligence</h3><p>As Big Data gained ground, advanced analytics became possible. At first this was data scientists building custom models to make predictions, rather like models used in academic research. The data scientists were highly qualified, very intelligent, and quite expensive. Slowly, however, it was realized that the commercial world is not like academic research. No hypotheses needed to be formulated and modeled. Instead, software tools could look for patterns in data that might have business value. And slowly the data scientists were replaced by software tools and developers.</p><p>This was the rise of Machine Learning (ML). It enabled a huge expansion in the advanced analytics that organizations could undertake. The focus shifted to gathering and preparing the data to be fed into the ML tools and making the outputs available to business users. This raised the profile of data even more. Data had to be quickly gathered, understood, and qualified for use.</p><p>As this was happening, AI burst onto the scene with GPT-4 in early 2023. A major goal of AI was actually to replace the legions of developers needed to develop the data pipelines involved in analytics, and thereby expand the use of data. However, AI also utilized the vast amount of unstructured data that had been made available via the Internet. Everybody could now derive much more value from AI analyzing this data than had been possible when only search was available. Yet there were other consequences. Data storage and compute costs, though greatly reduced on a unit basis, rapidly expanded on an aggregate basis. In fact, data and its processing are now becoming synonymous with energy. 
Data has turned the &#8220;New Oil&#8221; metaphor upside down because it is the primary consumer of energy. This cannot be reversed as AI is needed to drive the economy, and data fuels AI.</p><h3>You Are Here</h3><p>Which brings us to 2025 and where we are now. Nearly everything in our economy depends on data. </p><p>No scientific approach could tell us why this is the current reality, because the current reality is the product of a set of historical processes. It was not generated by uniformitarian processes that have always been happening in the same way for all time.</p><p>Is any of this important? It is. The reason is that much of the past remains embedded in the present. Not only hardware and software, but also ideas about how we should manage and use data. The environment we have today is not something that recently came into existence where nothing existed before. Rather, it is a product of the historical development we have outlined. It impacts how we deal with the current state, which is becoming ever more complex. But that is a topic for another day.</p>]]></content:encoded></item><item><title><![CDATA[The Author's Data Journey]]></title><description><![CDATA[Hello! 
Here's my data story...]]></description><link>https://anchoringdata.com/p/the-authors-data-journey</link><guid isPermaLink="false">https://anchoringdata.com/p/the-authors-data-journey</guid><dc:creator><![CDATA[Malcolm Chisholm]]></dc:creator><pubDate>Tue, 07 Oct 2025 14:02:28 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ZbuS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe401afa5-1a38-475f-8aac-87fe21971170_1080x1080.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZbuS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe401afa5-1a38-475f-8aac-87fe21971170_1080x1080.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZbuS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe401afa5-1a38-475f-8aac-87fe21971170_1080x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ZbuS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe401afa5-1a38-475f-8aac-87fe21971170_1080x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ZbuS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe401afa5-1a38-475f-8aac-87fe21971170_1080x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ZbuS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe401afa5-1a38-475f-8aac-87fe21971170_1080x1080.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!ZbuS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe401afa5-1a38-475f-8aac-87fe21971170_1080x1080.jpeg" width="420" height="420" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e401afa5-1a38-475f-8aac-87fe21971170_1080x1080.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1080,&quot;width&quot;:1080,&quot;resizeWidth&quot;:420,&quot;bytes&quot;:169181,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://anchoringdata.com/i/175447807?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe401afa5-1a38-475f-8aac-87fe21971170_1080x1080.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZbuS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe401afa5-1a38-475f-8aac-87fe21971170_1080x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ZbuS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe401afa5-1a38-475f-8aac-87fe21971170_1080x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ZbuS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe401afa5-1a38-475f-8aac-87fe21971170_1080x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ZbuS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe401afa5-1a38-475f-8aac-87fe21971170_1080x1080.jpeg 
1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>My name is Malcolm Chisholm, and since the 1980&#8217;s I have been involved in data. At first I was a developer, but then I moved to database design, and ultimately to strategic data management. I have seen data transform from being almost irrelevant to one of the most important resources in the economy.</p><p>The part of my career when I was a developer was a great experience. Developers get the freedom to design and create. 
Obviously, there are economic constraints, and specific goals that developers need to meet, but building something that then works, and which other people rely on is very satisfying. What I also found is that there are methodologies in programming, so it truly is its own specialization, and not just an extension of logic or mathematics, which some people think it is.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Hi! I&#8217;m Malcolm and this is a <strong>new</strong> publication dedicated to providing relevant information on data. I aim to shine a light on knowledge about what data is and how it is managed with each of my posts. Please subscribe and join me in this data journey.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>My programming was always acting on data. Initially, I only thought much about the data to the extent needed to make sure my programming worked. There were a few basic data management practices like backups and restores that I had to do, but not much more. Then I discovered databases and began to design them. Again, specific methodologies came into play, and I realized that data was its own thing.</p><p>There was an opportunity to be creative with data, by designing databases, but gradually I began to realize that data is fundamentally different to other things that matter to us. 
We live in the Information Age, but we still think and act as if we are in the Industrial Age. As a result, we have many difficulties in interacting with and managing data - because it is not like any industrial good.</p><p>This realization spurred me to get much more involved with data and its problems. I began to write articles about it, speak at conferences, and have now published five books on the subject. It is not as if things are getting any easier. New waves of technology, and innovative uses of data bring questions, needs, and problems for data that were not there before. There is a constant need to deal with this ever-evolving field of human endeavor that is data.</p><p>Until a new Adam Smith comes along and explains it all to us, we are going to have to do the best we can and take incremental steps. That is what this Substack is intended to do. I aim to provide a grounding in data for everyone, not just those with a technical interest.</p><p>I would like to extend an invitation to you to join <strong>Anchoring Data</strong> as a way of getting a practical understanding of data and different topics you will hopefully appreciate. </p><h4>                   Let&#8217;s stay grounded in the ever-changing world of data.</h4><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://anchoringdata.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Anchoring Data! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item></channel></rss>