The Trouble with Tribbles (Data Products without Domain Owners)

October 16, 2023

Art Morales, Ph.D.

In the classic Star Trek episode “The Trouble with Tribbles,” we witness the fallout from an unmanaged explosion of cute yet exponentially multiplying creatures known as tribbles. This episode serves as a vibrant metaphor for the potential chaos that can ensue when data products lack clear domain ownership and control.

Understanding the Distinction

To maintain a thriving ecosystem of data management, it is essential to differentiate between the guardians of data (the data stewards) and the masters of its direction (the data product owners). Drawing a clear line between these roles ensures the symbiosis between technical expertise and business acumen.

The Captain’s Chair: The Role of a Data Owner

Data owners stand at the helm, keeping data products aligned with business objectives. They maintain a deep understanding of the business domain and bridge the technical and business worlds to point the organization’s data products in the right direction.

Centralized Stewardship: A Hub of Innovation and Governance

Centralizing the realms of data stewardship, engineering, and analytics provides many benefits, shaping a unified force that streamlines the technical resources across the enterprise. This central hub serves as a beacon of innovation, drawing talent into a collaborative pool where ideas flourish, and expertise is honed to a finer point.

One of the substantial benefits lies in averting the emergence of shadow IT or shadow data science organizations within different domains. These shadow entities, albeit created with good intentions to facilitate quicker solutions, often drift from the broader technology goals of the organization, operating in silos that hinder cohesive growth. Centralized stewardship staunchly counters this, ensuring alignment with the organizational tech objectives, fostering an environment that operates in unison rather than disparate fragments working in isolation.

Furthermore, centralization facilitates robust data governance, instilling a controlled environment that ensures the quality and consistency of data. It creates a framework where policies, procedures, and standards govern the data’s life cycle. This results in trustworthy data products, free from duplications and inconsistencies, engendering a reliable foundation for insightful analytics and informed decision-making.

Beyond governance, centralization promotes control, creating a single channel that guides data management efficiently. It streamlines the process, eliminates redundancies, and ensures resources are used judiciously. By housing data science and analytics teams under one roof, this approach can foster an ecosystem that encourages knowledge sharing and collective growth, enhancing the team’s proficiency and nurturing a culture grounded in collaborative innovation.

Moreover, a centralized approach bolsters security, creating an environment where data is safeguarded by stringent processes, the risk of breaches is minimized, and regulatory requirements are met. It makes the protection of data a fundamental principle driving operations.

By championing a centralized stewardship model, organizations pave the way for a harmonious symbiosis between technical prowess and business acumen, cultivating a ground ripe for innovation while steering clear of the tribulations that shadow operations can unleash. It forms the bedrock of a data-driven organization, where reliability, control, and innovation are not just buzzwords but the guiding philosophies shaping a successful data management strategy.

Now that we’ve gone over what we’ve been doing for 20 years… let’s address the concepts of data mesh and federation and why they are important.

The Emergence of Data Mesh and Domain Leadership

As organizations evolve, expanding in complexity and reach, the centralized model sometimes strains at the seams, unable to accommodate the dynamic needs and nuanced structures that sprout within various domains. This strain is a driving reason for the data mesh concept, which advocates a decentralized approach grounded in the leadership of domain experts who hold the reins as product owners of record.

In the ecosystem of a data mesh, domain leaders have pivotal roles, steering the course of data products. They stand at the unique intersection where technical prowess meets business insight, offering a vantage point rooted deeply in the realities and nuances of their specific domain. This approach not only facilitates decision-making that is sharply attuned to business objectives but fosters an environment where data product development is intrinsically woven with in-depth business knowledge.

As organizations grow, centralization of all technical resources evolves from a sought-after goal to an impractical ideal. This reality nudges domains to forge purpose-built groups endowed with a dual proficiency: a deep understanding of the technical landscape paired with substantial domain literacy. These groups often assume the dual responsibility of owning and steering data products, merging technical expertise with a nuanced understanding of the domain’s specific needs and dynamics.

The role of a data owner, in this context, is more complicated. While staying aligned with the overarching business goals remains a central duty, data product owners find themselves navigating an expanded territory where their technical choices bear significant weight. Their decisions must echo the broader enterprise objectives and fit the established infrastructure, so that growth follows the enterprise’s technical roadmaps.

These domain-specialized groups function as hubs of expertise, taking the helm of data products and guiding them on a trajectory finely tuned to both business objectives and enterprise goals. Their decisions, driven by a dual understanding of domain intricacies and technical frameworks, foster products that are not just technically sound but resonate deeply with the domain’s core objectives.

In carving a path where technical choices align seamlessly with enterprise goals and infrastructures, a data owner ensures a holistic development that respects the domain’s specificities while aligning with broader organizational vision. Thus, they forge a pathway where domain expertise meets enterprise foresight, guiding the development of data products that are both robust and acutely aligned with the overarching goals, fostering a landscape of synchronized growth and integrated success.

Don’t Break the Prime Directive, Unless You Have To

Just as in Star Trek, where the supposedly unbreakable Prime Directive was broken whenever necessary, the best-laid plans often have to change when the hammer of reality comes down. In every organization, especially large ones, some assets may be vital to more than one domain. Examples include Customers, Partners, Vendors, and Drugs. In those cases, these entities may be considered Enterprise Entities (or Products) and require additional governance. These Enterprise Assets may arise from multiple core products at the domain level or may be assembled de novo by an enterprise data team. As they are often consumed and relied upon by multiple groups, their governance needs to include representation from parties across the enterprise.

It is tempting to label every data product that crosses domains, or that is important to the organization, an Enterprise Product, but this can lead to unnecessary complexity. Every product needs to be evaluated with a bias towards ownership by a domain, and Enterprise Products should be the exception, not the norm.
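For readers who think in code, the "bias towards domain ownership" can be sketched as a simple default-to-domain rule. This is only an illustration: the `DataProduct` fields, the `classify_ownership` function, and the `min_consumers` threshold are hypothetical names and values chosen for the example, not criteria taken from any standard or from a specific organization.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataProduct:
    name: str
    # Domains whose core products feed this one (assumed field).
    producing_domains: set[str]
    # Domains that consume and rely on this product (assumed field).
    consuming_domains: set[str] = field(default_factory=set)


def classify_ownership(product: DataProduct, min_consumers: int = 3) -> str:
    """Return "domain" unless the product clearly needs enterprise governance.

    Hypothetical rule: a product sourced from a single domain stays with
    that domain, however widely it is consumed. Only a product assembled
    from several domains AND consumed by at least `min_consumers` domains
    is treated as an Enterprise Product.
    """
    if len(product.producing_domains) <= 1:
        return "domain"
    if len(product.consuming_domains) >= min_consumers:
        return "enterprise"
    return "domain"


customer = DataProduct(
    name="Customer",
    producing_domains={"sales", "support"},
    consuming_domains={"sales", "support", "finance", "marketing"},
)
print(classify_ownership(customer))
```

The point of the sketch is the ordering of the checks: every branch falls back to domain ownership, and the enterprise label has to be earned.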

From Academic Freedom to Corporate Responsibility

In my journey, alternating between the roles of a scientist and a technophile, I had the leeway to play both steward and owner, fostering innovation often without the pressing demands for a clear Return on Investment (ROI).

However, the corporate world often needs a more structured approach that aligns technical endeavors with business goals, shifting from innovation for its own sake to a structured process that targets specific objectives, with financial accountability and a clear ROI.

Quite often, due to limited involvement or interest from the business owners, I was left “holding the bag,” having to interpret very loose or conflicting guidelines from the business in order to guide the technical efforts. In my case, I was lucky enough to be working in domains where I understood and had “lived” the science. Thus, I often (but not always) guessed correctly. Nevertheless, we sometimes found ourselves with “the most amazing data product that no one cared about.” We jokingly referred to these cases as the “best kept secrets in the industry,” where we always said: “If we could only get the business’s buy-in, we would transform the company!” In retrospect, some of these were the proverbial “tribbles”: cute little entities that didn’t help the organization but consumed resources.

To be fair, we had wins where we were able to convince the business of the value of the product, but we also had false starts and reprioritizations on occasion. Thankfully we won more than we lost, but life would have been much easier if the business had been more involved on a regular basis. It is also important to highlight that we were always willing to give up control once we found the right party to take over, but that was easier said than done. The business often relied on us for its direction, creating a conundrum that took longer to address than the development of the data products themselves.

Navigating Challenges: Data Literacy, Time, and Communication

The data mesh concept requires data product owners, and thus business leaders, to possess sound data literacy in order to steer decisions effectively. A lack of time or bandwidth to focus on the problem, a perpetual constraint, further complicates this dynamic and calls for a flexible delegation mechanism.

Business leaders, in instances of limited data/technical literacy or time/bandwidth, have the option to delegate ownership roles. This delegation, however, is not an abdication. It comes with the underlying imperative of retaining ultimate accountability. It fosters a system where open lines of communication are non-negotiable, ensuring a continuous feedback loop and nurturing a relationship built on trust and mutual respect.
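As a minimal sketch of "delegation is not abdication," the record below keeps the accountable business leader fixed while the day-to-day decision maker can change. The `Ownership` class and its field names are hypothetical, invented for this illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ownership:
    product: str
    accountable: str                # business leader; unchanged by delegation
    delegate: Optional[str] = None  # person handling day-to-day decisions

    def delegate_to(self, person: str) -> None:
        # Delegation only changes who decides day to day.
        # The accountable party is deliberately left untouched.
        self.delegate = person

    @property
    def decision_maker(self) -> str:
        # Falls back to the accountable leader when nobody is delegated.
        return self.delegate or self.accountable


ownership = Ownership(product="Customer", accountable="VP of Sales")
ownership.delegate_to("Lead Data Analyst")
print(ownership.decision_maker, "decides;", ownership.accountable, "stays accountable")
```

Keeping both names on the same record is the design choice that matters here: anyone looking up the product sees who to ask and who ultimately answers for it.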

As we traverse the complicated landscape of data management, we find that a fine balance between centralized stewardship and decentralized ownership emerges as the linchpin for success. The lessons echoed from the Star Trek tribbles episode resonate profoundly, underscoring the imperative of well-defined roles and control to prevent chaos and ensure a harmonious trajectory.

By endorsing a model where data stewards and data owners work in tandem, each playing to their strengths while maintaining open communication channels, organizations can steer clear of ‘tribble troubles’. This collaborative approach assures a voyage where innovation meets responsibility, guiding the starship of our organizations to a successful journey in the vast universe of data management.