I recently had lunch with the founder of a business analytics company whom I mentor. Over some very tasty interior Mexican food, he vented about how his clients do not fully understand how hard it is to get data right every day.
If I had a dollar for every time I heard this, I'd be eating surf & turf for lunch every day of the week!
From the outside, a data team might look like a group of people running queries and pushing dashboards. Inside, the reality is a daily juggling act of processes, priorities, and firefighting. For data engineers, scientists, analysts, and chief data officers, the real challenges are often invisible to the rest of the organization.
One of the most overlooked struggles is process fragility. Many data workflows depend on a patchwork of scripts, scheduled jobs, and integrations that grew organically over time. When one component changes (say, a new API version or a schema update), teams scramble to adjust upstream and downstream dependencies. This is why robust process documentation and automated dependency tracking are critical. Without them, a single unnoticed change can cascade into days of missed deliverables.
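To make that concrete, here's a rough sketch of the kind of automated check I have in mind: compare an incoming table against the schema you expect and complain loudly before anything downstream runs. The column names and types here are made up for illustration.

```python
import pandas as pd

# Hypothetical expected schema for an incoming feed (illustrative names only).
EXPECTED_SCHEMA = {
    "order_id": "int64",
    "customer_id": "int64",
    "order_total": "float64",
    "created_at": "datetime64[ns]",
}

def check_schema(df: pd.DataFrame, expected=EXPECTED_SCHEMA):
    """Return a list of human-readable schema differences."""
    actual = {col: str(dtype) for col, dtype in df.dtypes.items()}
    problems = []
    for col, dtype in expected.items():
        if col not in actual:
            problems.append(f"missing column: {col}")
        elif actual[col] != dtype:
            problems.append(f"type changed: {col} is {actual[col]}, expected {dtype}")
    for col in actual:
        if col not in expected:
            problems.append(f"unexpected new column: {col}")
    return problems

# Example: a batch where order_total arrived as text and created_at vanished.
new_batch = pd.DataFrame({"order_id": [1], "customer_id": [2], "order_total": ["12.50"]})
print(check_schema(new_batch))
```

Run this as the first step of every load and route any non-empty result to an alert channel, and the "unnoticed change" problem becomes a five-minute fix instead of a multi-day hunt.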
Data intake is another hidden pain point. Stakeholders often assume new data sources can be onboarded quickly, but ingestion requires far more than connecting a feed. Teams must validate data formats, run profiling to uncover inconsistencies, and set up monitoring for volume or pattern changes. Skipping these steps leads to faulty analytics that erode trust. Building a repeatable intake checklist is not just best practice; it’s insurance against costly errors.
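Part of that checklist can be automated. Below is a minimal profiling sketch in Python; the tolerance band and the expected row count are assumptions you'd tune per feed.

```python
import pandas as pd

def profile_feed(df: pd.DataFrame, expected_rows: int, tolerance: float = 0.3):
    """Profile a new feed: row count, null rates, duplicates, and volume sanity."""
    report = {
        "row_count": len(df),
        "null_rate_by_column": df.isna().mean().round(3).to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
    }
    # Flag volume swings outside a tolerance band around the expected size.
    low, high = expected_rows * (1 - tolerance), expected_rows * (1 + tolerance)
    report["volume_ok"] = low <= len(df) <= high
    return report

# Example: a feed that normally delivers roughly 10,000 rows per day.
sample = pd.DataFrame({"order_id": [1, 2, 2], "order_total": [10.0, None, 5.5]})
print(profile_feed(sample, expected_rows=10_000))
```

It isn't glamorous, but running something like this on every new source before it reaches production tables is exactly the insurance policy the checklist is meant to be.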
For data scientists, model deployment is frequently bottlenecked by the lack of standardized handoff procedures. Even a well-performing model can sit idle if it cannot be integrated into production pipelines. Establishing shared deployment templates between engineering and science teams closes this gap. These templates should specify environment requirements, feature preprocessing steps, and monitoring metrics for drift.
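One lightweight way to do this is to treat the handoff template as code rather than a wiki page. The sketch below uses a plain Python dataclass; the field names and example values are illustrative, not any particular team's standard.

```python
from dataclasses import dataclass, field

@dataclass
class DeploymentSpec:
    """Minimal handoff template both teams fill in before a model ships."""
    model_name: str
    python_version: str
    dependencies: list = field(default_factory=list)        # pinned packages
    preprocessing_steps: list = field(default_factory=list) # ordered feature steps
    drift_metrics: list = field(default_factory=list)       # what to monitor post-launch
    alert_channel: str = ""                                  # where drift alerts land

# Hypothetical example for a churn model.
spec = DeploymentSpec(
    model_name="churn_model_v3",
    python_version="3.11",
    dependencies=["scikit-learn==1.4.2", "pandas==2.2.1"],
    preprocessing_steps=["impute_median", "one_hot_encode", "scale_numeric"],
    drift_metrics=["population_stability_index", "feature_null_rate"],
    alert_channel="#ml-monitoring",
)
print(spec)
```

The point isn't the specific fields; it's that an unfilled field blocks the handoff, which forces the conversation to happen before launch instead of after the first incident.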
Chief data officers face a different kind of struggle: aligning strategic priorities with operational realities. When project intake is not filtered through clear scoring criteria – based on business impact, technical feasibility, and data availability – teams end up overcommitted. Implementing a project scoring matrix ensures that high-value work gets done while lower-impact initiatives are deprioritized.
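A scoring matrix doesn't need to be fancy. Here's a minimal weighted-score sketch; the weights and the 1-to-5 scale are assumptions to adjust for your own priorities.

```python
# Illustrative weights: tune these to reflect your organization's priorities.
WEIGHTS = {"business_impact": 0.5, "technical_feasibility": 0.3, "data_availability": 0.2}

def score_project(business_impact: int, technical_feasibility: int, data_availability: int) -> float:
    """Each input is a 1-5 rating; returns a weighted score out of 5."""
    scores = {
        "business_impact": business_impact,
        "technical_feasibility": technical_feasibility,
        "data_availability": data_availability,
    }
    return round(sum(WEIGHTS[k] * scores[k] for k in WEIGHTS), 2)

# Example: high impact, moderate feasibility, data already in the warehouse.
print(score_project(business_impact=5, technical_feasibility=3, data_availability=4))  # 4.2
```

Rank the intake queue by this number, draw a cut line at your team's real capacity, and the deprioritization conversation becomes about the criteria rather than about personalities.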
Cost efficiency is another area where invisible struggles arise. Query optimization, storage tiering, and scheduled data archiving are rarely exciting topics, but they are essential for keeping budgets under control. Setting up automated query cost reports and regularly pruning unused datasets can cut expenses without reducing capability.
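Even a simple "stale dataset" report goes a long way. The sketch below flags tables that haven't been touched in a while; in practice, the last-accessed dates would come from your warehouse's access logs or information schema rather than being hard-coded as they are here.

```python
from datetime import datetime, timedelta

# Hypothetical last-accessed timestamps; in reality, pull these from access logs.
last_accessed = {
    "analytics.daily_orders": datetime(2025, 6, 1),
    "staging.legacy_export": datetime(2023, 11, 12),
    "sandbox.one_off_analysis": datetime(2024, 2, 3),
}

def stale_datasets(access_log: dict, older_than_days: int = 180, today: datetime | None = None):
    """Return datasets not touched within the window, oldest first."""
    today = today or datetime.now()
    cutoff = today - timedelta(days=older_than_days)
    stale = {name: ts for name, ts in access_log.items() if ts < cutoff}
    return sorted(stale.items(), key=lambda item: item[1])

for name, ts in stale_datasets(last_accessed):
    print(f"{name}: last accessed {ts:%Y-%m-%d} - candidate for archiving")
```

Schedule a report like this monthly and review it with the team; the savings compound quietly, which is exactly how cost work should feel.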
The Takeaway.
The most effective data teams treat these challenges not as one-off problems but as signals to refine processes. By embedding automated checks, repeatable onboarding procedures, and decision-making frameworks, they create a resilient operational core. This foundation frees them to focus less on fixing breakages and more on delivering insights that improve business efficiency.
Not to mention, it gives them more time to enjoy lunch!
What about you? What struggles do you or your data teams face? Please comment – I’d love to hear your thoughts.
Also, please connect with DIH on LinkedIn.
Thanks,
Tom Myers