I recently spoke with the head of data and analytics at an investment management firm about how 2025 was going so far for him and his peers in the industry. Maybe not surprisingly, he said it’s been challenging.

Setting aside geopolitics and market conditions, he said that data engineers and data scientists across the industry are facing increasingly complex demands. The convergence of AI, cloud infrastructure, regulation, and real-time analytics has elevated the strategic importance of these roles.

It took a while to whittle down the lengthy list, but here are the top five challenges facing data engineers and data scientists:

1. Data Quality and Governance at Scale

The volume and variety of data have grown exponentially, but data quality hasn’t always kept pace. Engineers and scientists spend a disproportionate amount of time cleaning, validating, and reconciling inconsistent or incomplete data. Organizations are grappling with governance at scale — ensuring accuracy, lineage, and compliance across distributed systems, multiple clouds, and hybrid architectures. Automation helps, but accountability and oversight are more critical than ever.
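To make the cleaning and validation work concrete, here is a minimal sketch of an automated quality check. The record layout, the field names (trade_id, ticker, price), and the rules are hypothetical and chosen only for illustration; a real pipeline would take them from its own schema and governance policy.

```python
from collections import Counter

# Hypothetical required fields for an illustrative trade record.
REQUIRED_FIELDS = ("trade_id", "ticker", "price")


def quality_report(records):
    """Count rows with missing fields, non-positive prices, and duplicate IDs."""
    # A row is incomplete if any required field is absent or None.
    missing = sum(
        1 for r in records
        if any(r.get(field) is None for field in REQUIRED_FIELDS)
    )
    # A price that is present but not positive is treated as invalid.
    invalid_price = sum(
        1 for r in records
        if r.get("price") is not None and r["price"] <= 0
    )
    # Every repeat of a trade_id beyond its first occurrence is a duplicate.
    id_counts = Counter(
        r["trade_id"] for r in records if r.get("trade_id") is not None
    )
    duplicates = sum(count - 1 for count in id_counts.values())
    return {
        "rows": len(records),
        "missing": missing,
        "invalid_price": invalid_price,
        "duplicates": duplicates,
    }


sample = [
    {"trade_id": 1, "ticker": "ABC", "price": 10.5},
    {"trade_id": 2, "ticker": None, "price": 11.0},   # missing ticker
    {"trade_id": 2, "ticker": "XYZ", "price": -3.0},  # duplicate ID, bad price
]
report = quality_report(sample)
print(report)
```

A report like this can run on every load, but someone still has to own the thresholds and decide what happens when a check fails.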

2. Integration of Real-Time and Batch Systems

The demand for real-time insights has risen sharply. Whether it’s for algorithmic trading, risk management, or user personalization, organizations now need systems that can process both streaming and historical data cohesively. Data engineers are tasked with building hybrid pipelines that maintain low latency without sacrificing data integrity, while scientists need models that can adapt to real-time inputs without retraining from scratch.
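One common way to combine the two worlds is to serve a precomputed batch view and layer recent streaming events on top of it. The sketch below assumes a hypothetical event shape (id, ts, account, amount) and a known batch cutoff timestamp; it is an illustration of the idea, not a description of any particular platform.

```python
def serve_view(batch_totals, stream_events, batch_cutoff):
    """Merge a batch view with streaming events newer than the batch cutoff."""
    view = dict(batch_totals)  # copy so the batch view itself is not mutated
    seen_ids = set()
    for event in stream_events:
        if event["ts"] <= batch_cutoff:
            continue  # already reflected in the batch view
        if event["id"] in seen_ids:
            continue  # duplicate delivery; skip to protect integrity
        seen_ids.add(event["id"])
        account = event["account"]
        view[account] = view.get(account, 0) + event["amount"]
    return view


batch_totals = {"A": 100, "B": 50}
events = [
    {"id": "e1", "ts": 5, "account": "A", "amount": 10},   # before cutoff
    {"id": "e2", "ts": 11, "account": "A", "amount": 20},
    {"id": "e2", "ts": 11, "account": "A", "amount": 20},  # delivered twice
    {"id": "e3", "ts": 12, "account": "C", "amount": 7},
]
view = serve_view(batch_totals, events, batch_cutoff=10)
print(view)
```

The cutoff and the duplicate check are where data integrity is won or lost: without them, an event could be counted once by the batch job and again by the stream.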

3. Keeping Up with AI Model Complexity

Large language models (LLMs), multimodal AI, and ensemble learning techniques have introduced powerful new capabilities — but also new operational burdens. Deploying, monitoring, and fine-tuning these models requires deep expertise and scalable infrastructure. The line between data science and ML engineering continues to blur, demanding new skill sets that bridge experimentation with reliable deployment.

4. Ethical Use and Regulatory Compliance

The use of alternative data has become mainstream, and at the same time data privacy laws (like GDPR and CCPA) and emerging AI regulations have become stricter and more complex. Data scientists and engineers must consider data minimization, fairness, bias detection, and explainability from day one. Integrating compliance into data pipelines and ML workflows is no longer optional; it’s a baseline expectation.
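As one small example of what bias detection can look like in practice, the sketch below computes the gap in approval rates between groups, often called the demographic parity difference. The group labels and decisions are made up for illustration, and this is only one of several fairness measures a team might choose.

```python
def selection_rates(outcomes):
    """Return the share of positive decisions for each group."""
    totals, positives = {}, {}
    for group, approved in outcomes:
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if approved else 0)
    return {group: positives[group] / totals[group] for group in totals}


def parity_gap(outcomes):
    """Difference between the highest and lowest group selection rates."""
    rates = selection_rates(outcomes)
    return max(rates.values()) - min(rates.values())


# Hypothetical (group, approved) pairs from a model's decisions.
decisions = [
    ("g1", True), ("g1", True), ("g1", False), ("g1", True),
    ("g2", True), ("g2", False), ("g2", False), ("g2", False),
]
rates = selection_rates(decisions)
gap = parity_gap(decisions)
print(rates, gap)
```

A large gap does not prove unfair treatment on its own, but it is the kind of signal that should trigger a closer review before a model goes into production.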

5. Talent Gaps and Cross-Functional Collaboration

The rapid evolution of tools, frameworks, and cloud-native platforms makes it difficult for teams to stay current. Moreover, data professionals must collaborate across product, engineering, legal, and executive teams — requiring strong communication skills and domain fluency. In an environment where business success depends on data-driven decisions, bridging technical and strategic goals is a major challenge.

The Takeaway.

Success in data roles isn’t just about technical skill; it’s about adaptability, accountability, and alignment with business value. By systematically evaluating your organization against these five challenges, you can identify flaws, make necessary corrections, and ensure that your data-driven decisions rest on a solid foundation.

What about you? What challenges are your data engineers and/or data scientists currently facing? Please comment – I’d love to hear your thoughts.

Also, please connect with DIH on LinkedIn.

Thanks,
Tom Myers