Tangram is a state-of-the-art resource allocator and distributed scheduling framework for Spark at Facebook, with hierarchical queues and a resource-based container abstraction. We support scheduling and resource management for a significant portion of Facebook's data warehouse and machine learning workloads, which amounts to running millions of jobs across several clusters with tens of thousands of machines. In this talk, we will describe Tangram's architecture, discuss Facebook's need for a custom scheduler, and explain how Tangram schedules Spark workloads at scale. We will focus on three features that improve Spark's efficiency, usability, and reliability: 1. IO-rebalancer (Tetris) support 2. User-fairness queueing 3. Heuristic-based backfill scheduling optimizations.