Data Lakehouse

Related Articles from SNS

"Skill issues": data-centric optimization of lakehouse agents

arXiv:2606.01185v1. Coding agents are becoming users of data infrastructure, but their success depends not only on model quality: it also depends on the skills and environment files that teach agents how to use a system. We study how to optimize these artifacts for agents operating on a branching lakehouse, Bauplan. In our setting, headless APIs and Git-like data primitives expose data workflows through code, branches, commits, and merges.

arXiv CS 8d ago

Predicting Lakehouse Performance in Clouds: An Empirical Exploration of Query Runtime Variance

Data analytics increasingly runs on distributed lakehouse systems, where platform operators must optimise monetary, resource, and environmental costs. Query Performance Prediction (QPP) helps to balance these costs and supports workload management techniques, such as adaptive resource scaling and low-carbon scheduling.

arXiv CS 7d ago

Data Architectures and their Technical Requirements (DATER)

arXiv:2606.08811v1. Modern organizations generate and consume massive volumes of heterogeneous data at high speed. This requires continuous development of new techniques for more efficient and reliable data management. Designing appropriate data architectures has therefore become a strategic necessity, as they shape how data is integrated, governed, and made available for analytics and decision-making.

arXiv CS 1d ago

What Went Wrong with Data Lakes? A 15-Year Reality Check from the Field

James Dixon introduced the Data Lake in 2010. The pitch was simple: store data raw, postpone schema, cut up-front transformation. It promised flexibility and easier analytics.

arXiv CS 1d ago

AWS whips out Graviton-powered Redshift instances, claims 7x speed for data warehouse

AWS has introduced Graviton-powered Redshift RG instances, claiming they can accelerate new query workloads by up to seven times. These instances offer significant performance improvements and cost efficiencies compared to previous generations, enabling Redshift to better handle increasing demands from AI agent workloads. The updated engine also allows users to run SQL analytics across both data warehouses and data lakes from a single platform.

The Register 14d ago

Architectural Evolution and Selection Framework for Database Systems in AI-Ready Data Platforms

arXiv:2606.08317v1. The rise of polyglot data management and AI-ready database architectures has created a complex design space across diverse database paradigms. However, architecture selection in modern enterprise environments continues to rely heavily on ad-hoc engineering intuition, with limited systematic frameworks to guide decision-making across heterogeneous database systems.

arXiv CS 1d ago