Spark 3.0 Features | Adaptive Query Execution (AQE) | Part 1 - Optimizing SKEW Joins

Published: 30 December 2024
Channel: Tech Island

In Spark 2.x, data skew in joins is usually handled by hand with the key salting technique. Spark 3.0 adds a feature that handles it automatically: Adaptive Query Execution (AQE).

One of the biggest problems in parallel computing systems is data skew. In Spark, skew appears when a join key is not evenly distributed across the cluster: a few partitions become much larger than the rest, the tasks that process them run long after the others have finished, and the job loses most of its parallelism.
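To make the effect concrete, here is a minimal pure-Python sketch (no Spark required) of how one hot join key overloads a single partition. The data, the `partition_sizes` helper and the modulo-based assignment are illustrative assumptions; Spark uses its own hash function, but the imbalance follows the same pattern.

```python
from collections import Counter

def partition_sizes(keys, num_partitions):
    """Count how many rows land in each partition when rows are assigned by key."""
    # Simplification: integer keys assigned by modulo. Spark hashes the key
    # with its own hash function before taking the modulo.
    counts = Counter(key % num_partitions for key in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]

# Hypothetical data: key 0 is a "hot" key with 9,000 rows,
# the remaining 1,000 rows are spread over keys 1..1000.
skewed_keys = [0] * 9000 + list(range(1, 1001))
uniform_keys = list(range(10000))

skewed = partition_sizes(skewed_keys, 4)
uniform = partition_sizes(uniform_keys, 4)

print("skewed :", skewed)   # [9250, 250, 250, 250]
print("uniform:", uniform)  # [2500, 2500, 2500, 2500]
```

With the skewed keys, the task that owns partition 0 has 37 times more rows to process than any other task, so the whole stage waits for it.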

AQE addresses this issue automatically once the following configuration is enabled:
spark.conf.set("spark.sql.adaptive.enabled", "true")
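Once AQE is on, skew-join handling is governed by spark.sql.adaptive.skewJoin.enabled, which defaults to true. According to the Spark configuration docs, a partition counts as skewed when it is larger than spark.sql.adaptive.skewJoin.skewedPartitionFactor (default 5) times the median partition size and also larger than spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes (default 256MB). AQE then splits such partitions into smaller ones. The sketch below restates that rule in plain Python; the function name and the sample sizes are hypothetical, and it approximates the documented rule rather than copying Spark's actual code.

```python
from statistics import median

MB = 1024 * 1024

def find_skewed_partitions(sizes_in_bytes, factor=5, threshold_bytes=256 * MB):
    """Return indexes of partitions that exceed both factor * median and the threshold."""
    med = median(sizes_in_bytes)
    return [
        i
        for i, size in enumerate(sizes_in_bytes)
        if size > factor * med and size > threshold_bytes
    ]

# Hypothetical shuffle partition sizes: one partition is far above the rest.
sizes = [100 * MB, 120 * MB, 90 * MB, 2048 * MB, 110 * MB]
skewed_idx = find_skewed_partitions(sizes)
print(skewed_idx)  # [3]

# A partition that is large relative to the median but below 256MB is not flagged.
small_sizes = [10 * MB, 10 * MB, 10 * MB, 200 * MB]
small_idx = find_skewed_partitions(small_sizes)
print(small_idx)  # []
```

Both conditions must hold, which is why small jobs with uneven but tiny partitions are left alone.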



Medium Blog   / spark-3-0-features-demo-data-skewness-aqe  

Handling data skew with the key salting technique in Spark 2.x:
   • How to handle Data skewness in Apache...  
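
For readers who have not seen key salting, here is a rough pure-Python sketch of the idea, not the code from the linked video. The helper names `salt_keys` and `explode_keys` are hypothetical, and the salt is assigned round-robin to keep the output deterministic; real jobs usually pick a random salt.

```python
from collections import Counter

def salt_keys(keys, num_salts):
    """Pair each key of the large side with a salt so a hot key becomes several keys."""
    # Assumption: round-robin salt for reproducibility; a random salt is more common.
    return [(key, i % num_salts) for i, key in enumerate(keys)]

def explode_keys(keys, num_salts):
    """Replicate each key of the small side once per salt so every salted key still matches."""
    return [(key, salt) for key in keys for salt in range(num_salts)]

big_side = ["hot"] * 8 + ["cold"] * 2   # hypothetical skewed join keys
small_side = ["hot", "cold"]

salted = salt_keys(big_side, 4)
exploded = explode_keys(small_side, 4)

print(Counter(big_side).most_common(1))  # [('hot', 8)]
print(max(Counter(salted).values()))     # 2
```

The largest key group drops from 8 rows to 2, at the cost of replicating the small side once per salt value. This is the manual work that AQE's skew-join optimization removes.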

Content By - Jeevan Madhur [LinkedIn -   / jeevan-madhur-225a3a86  ]
Editing By - Sivaraman Ravi [LinkedIn -   / sivaraman-ravi-791838114  ]


Facebook Page - https://www.facebook.com/Tech-Island-...

Please SUBSCRIBE to our channel :)

Share your feedback with us.
[email protected]