While Parquet column encryption is broadly adopted across the industry, some use cases require finer-grained control than column encryption provides, such as privacy, access control, and retention policies that vary at the cell level. Cell encryption for Apache Parquet is designed to give organizations this more granular access control.
In this talk, we will share the challenges, the design, the involvement of the open source community, and the progress towards adding a powerful tool for applying arbitrary policies to cells. We will dive deep into this new feature and how it works under the hood. We will also present its performance and space overhead, and how we implemented masking semantics to enable crypto-shredding of cell-encrypted data.