Don’t Stop Believin’ in Your Pipelines: Data Quality with Great Expectations and Airflow
Learn to integrate Great Expectations with Airflow to automate data quality checks, streamline your pipelines, and ensure trustworthy data.
Jan 13

AWS S3 Tables: Game-Changer or Contender in the Lakehouse Wars?
AWS S3 Tables are here: game-changer or just another player in the Lakehouse wars? Dive into the tech, hype, and what it means for…
Dec 28, 2024

Shrink, Sort, and Speed Up: Turning 55GB into 200MB in Cloud Data Lakes
Working with massive datasets in a cloud data lake can be challenging and rewarding when done efficiently. In this blog, I’ll share some of…
Dec 13, 2024

Migrate From Hive to Iceberg: Transform Your Data Lake into a Future-Ready Lakehouse
Dec 3, 2024

One Table, Two Engines: Building a Unified Lakehouse with Spark, Trino, and Apache Iceberg
In today’s hybrid data world, interoperability is no longer a luxury; it’s a necessity. This blog dives into how you can seamlessly…
Nov 19, 2024

Introduction to Privacy-Preserving Process Mining (Part 2)
In this blog post, we continue describing the fundamentals of privacy-preserving process mining. Part 1 of this tutorial discussed the…
Nov 14, 2024

Introduction to Privacy-Preserving Process Mining (Part 1)
Data privacy has become a trending topic, and people are talking about it more and more. With process mining, privacy becomes more…
Nov 14, 2024

Why Apache Iceberg Is the Key to Future-Proofing Your Data Strategy
Unlock scalable, reliable data strategies with Apache Iceberg, your key to modern, efficient data lakes.
Nov 14, 2024