Advancing Spark - Identity Columns in Delta

Published: 22 October 2024
Channel: Advancing Analytics

A classic challenge in data warehousing is getting your surrogate key patterns right - but without the tooling of a traditional warehouse, how do we achieve it in a lakehouse environment? We've had several patterns in the past, each with its drawbacks, but now we've got a brand new IDENTITY column type... so how does it measure up?

In this video Simon does a quick recap of the existing surrogate key methods within Spark-based ETL processes, before looking through the new Delta Identity functionality!
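
The description doesn't list the existing methods, so as a rough illustration only, here is a minimal sketch of one commonly used surrogate key pattern: take the current maximum key in the target table and add a row number to each incoming row. It is plain Python rather than Spark, and the names `assign_surrogate_keys`, `customer_sk`, and the sample rows are hypothetical, not taken from the video.

```python
from typing import Dict, List


def assign_surrogate_keys(existing_max_key: int, new_rows: List[Dict]) -> List[Dict]:
    """Give each new row a key equal to existing_max_key plus its 1-based position."""
    return [
        {"customer_sk": existing_max_key + offset, **row}
        for offset, row in enumerate(new_rows, start=1)
    ]


# Hypothetical dimension table that already holds two keyed rows.
dimension = [
    {"customer_sk": 1, "name": "Ada"},
    {"customer_sk": 2, "name": "Grace"},
]

# Hypothetical incoming batch with no keys yet.
incoming = [{"name": "Linus"}, {"name": "Margaret"}]

# Look up the current maximum key (0 if the table is empty), then key the batch.
current_max = max((row["customer_sk"] for row in dimension), default=0)
keyed = assign_surrogate_keys(current_max, incoming)
print(keyed)
```

The drawback of a pattern like this is that the maximum has to be read before every load and concurrent writers can collide on the same key range, which is the sort of bookkeeping an identity column is meant to take off your hands.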

As always, if you're beginning your lakehouse journey, or need an expert eye to guide you on your way, get in touch with Advancing Analytics.

00:00 - Hello
01:37 - Existing Key Methods
10:36 - New Identity Functionality
15:18 - Testing a larger insert