In Search of the One True Source of Truth: The Highlander Approach to Data Product Management in Enterprise Marketplaces
February 9, 2024
Art Morales, Ph.D.
In a world inundated with data, enterprises are constantly grappling with the challenge of filtering the signal from the noise, of distinguishing trustworthy data from the unreliable. Much like the cult classic film Highlander, where “there can only be one” supreme immortal, the data marketplaces must heed a similar dictum — there must exist a singular, authoritative source for every data entity to ensure the utmost data quality and foster trust. Just as in the Highlander’s world, harboring more than one supreme force creates chaos and dilutes the potency of truth and reliability.
Defining Authoritative Sources
An authoritative source is a pre-eminent source of truth for an entity, recognized for its reliability and trustworthiness. Its identification is a strategic necessity to foster data consistency and avoid duplicative efforts in data processing. In an enterprise data marketplace, the authoritative source acts as the source of truth for that entity. This implies that all cleanup and mastering efforts are done to create the entity and thus the data product provides the master list of records for it. As part of this, the authoritative source should provide an enterprise-wide universal ID that uniquely identifies and references the record across the enterprise.
Enterprise-wide authoritative entities are often needed even if there is an external data source that acts as a reference source. The enterprise knowledge complements and supplements the external data and this may result in changes to the master data, merging and/or splitting records due to the additional knowledge.
Regardless of how they are defined, enterprise entities’ sources of truth ensure the streamlined operation of data systems and trust amongst stakeholders.
Data Entities and Marketplaces
Data entities are distinct types of data, structured to define knowledge about a concept vital for an organization’s operation. Examples of data entities include: customer, company, person, physician, investigator, drug and investigational compound. Entities can also be related to each other such as a person working at a company or a physician prescribing a drug (stay tuned for future blogs on why and how graph databases exponentially make data more useful and accessible through these relationships).
On the other hand, a data marketplace is a structured environment where data providers and consumers come together to exchange data and knowledge. These marketplaces can house multiple similar data products, often from various vendors and suppliers, offering a rich inventory for data consumers.
In such a rich and diverse ecosystem, the doctrine of “there can only be one” authoritative source becomes both a guiding principle and a stringent requirement, ensuring the sanctity of data entities and safeguarding the marketplace from the perils of misinformation and low-quality data. A data marketplace thus must clearly identify and highlight which products are authoritative for which entities.
The Necessity of the Highlander Approach
The significance of embracing a Highlander approach — a one true source methodology — goes far beyond ensuring the accuracy of data. Let’s delve deeper to understand the myriad advantages it unfurls:
1. Data Consistency: A single authoritative source prevents conflicting information from permeating through the data ecosystem, guaranteeing a coherent and consistent data narrative.
2. Improved Trust: Knowing that there exists a definitive source of truth fosters trust among data consumers, enhancing the reliance on data for critical decision-making processes.
3. Cost-Efficiency: Eliminating the need to validate data from multiple sources, enterprises can streamline operations, saving time and reducing costs.
Identifying the “One”
But how does one identify this authoritative source? This task requires meticulousness, with a deep understanding of the data landscape, domain and a well-articulated data strategy.
· Credibility: Establishing the historical reliability of a data source is paramount. The source must have a record of providing accurate, dependable data.
· Comprehensiveness: The source should offer a comprehensive dataset, leaving no gaps in the information provided.
· Timeliness: An authoritative source updates regularly, reflecting the most current state of information.
· Compliance: Ensuring adherence to regulatory norms and industry standards is non-negotiable.
Implementing the Highlander Principle
Implementing the Highlander approach involves engraving the “there can only be one” philosophy in the foundational strategies of data management and Data Marketplaces:
· Source Authentication: Rigorous verification processes must be in place to authenticate the sources and validate their authority.
· Data Stewardship: Appointing data stewards who oversee the quality and reliability of data becomes essential in maintaining a golden standard.
· Effective Mastering: Mastering data from different sources is not just about merging the obvious matches. One must implement a process that identifies matching records effectively and can scale as data volume grows. Next generation data mastering tools that use machine learning for the process such as Tamr can provide significant value over traditional methods.
· Feedback Loop: Establishing a feedback mechanism that allows users to report inconsistencies, thus enabling continuous improvement. This is particularly useful when mastering uses machine learning since the model can learn from the feedback and decrease reliance of hard and fast rules to match records.
· Collaborative Effort: Encouraging collaboration between data providers and consumers to foster a community that works towards maintaining the integrity of the data marketplace.
· Findability: Data Marketplaces must understand the concepts and properly identify and direct data consumers to authoritative sources for each entity.
Looking Ahead
As we look ahead, the data marketplaces stand at a juncture teeming with opportunities and challenges. The exponential growth in data generation mandates a methodical approach to data management, where the authoritative source reigns supreme, ensuring quality and fostering trust.
An enterprise embarking on the journey of establishing a data marketplace must be prepared to wield the sword with skill and precision, in true Highlander spirit, to claim and maintain the authoritative source, championing data quality and establishing an unshakeable trust in its marketplace.
The quest for data reliability in an enterprise data marketplace is a call to embrace the Highlander approach to data management; an acknowledgment that in the grand arena of data, “there can only be one” authoritative source. Let this be the guiding principle, the mantra that steers enterprises towards a future of unrivaled data quality and uncompromised trust, fostering an ecosystem where data is not just an asset but a beacon of truth, guiding us in a world steered by data-driven insights. It is time to raise the sword of authenticity and carve out a realm of reliable and trustworthy data landscapes. Let us march forward, with the conviction of the Highlander, forging paths of credibility in the intricate web of data marketplaces.
At XponentL Data, we help enterprises define not only their data strategies but also help implement data products and data marketplaces, let us help you in your journey to the one source of data truth.