Anti-Patterns in Data Mesh
Data Management, Data Governance Paul Karsten

This article explores common anti-patterns in implementing Data Mesh, a decentralized data architecture emphasizing domain-oriented data ownership. While Data Mesh aims to enhance data accessibility and usability across organizations, its success relies on understanding core principles: domain-driven data ownership, data products, and federated governance.

Data Mess to Data Mesh
Data Ops, Data Management Paul Karsten

The standard strategy of centralizing data into a single repository often leads to chaotic "data swamps." Poor data quality and governance issues in these swamps hinder efficient analysis and decision-making. An alternative approach, known as Data Mesh, proposes a decentralized architecture focused on treating data as a product.

Model Development
Data Science, Data Management Paul Karsten

This blog post outlines the second phase of our Data Science Process: Model Development, which involves building, training, and evaluating models based on data gathered during Question Formation. The process is iterative: we experiment with different algorithms, features, and parameters in a sandbox environment before scaling to larger datasets. Model performance is evaluated using metrics, validation checks for overfitting and underfitting, and checks for robustness and interpretability. Finally, models must be versioned, monitored for data drift, and continuously updated to ensure they remain effective and relevant over time.

Question Formation and Data Analysis in Data Science
Data Science, Data Management Paul Karsten

This blog post focuses on the first phase of our Data Science Process: Question Formation and Data Analysis. In this phase, we iterate multiple times through question formation, data collection, and exploration. Initial questions are likely to be of low fidelity. Through the process of data exploration, the questions gain fidelity and drive toward business value.

Your Starter Guide to Data Governance
Data Governance Paul Karsten

Data governance establishes standards for data collection, storage, and analysis, ensuring accuracy and mitigating risks associated with regulatory non-compliance. Moreover, governance promotes ethical data practices, safeguarding individual privacy rights and societal norms.
