Building Real Time BI Systems with Kafka, Spark & Kudu: Spark Summit East talk by Ruhollah Farchtchi

Опубликовано: 28 Сентябрь 2024
на канале: Spark Summit

13,876

98

One of the key challenges in working with real-time and streaming data is that the data format for capturing data is not necessarily the optimal format for ad hoc analytic queries. For example, Avro is a convenient and popular serialization service that is great for initially bringing data into HDFS. Avro has native integration with Flume and other tools that make it a good choice for landing data in Hadoop. But columnar file formats, such as Parquet and ORC, are much better optimized for ad hoc queries that aggregate over large number of similar rows.

Юлиана Савилова песня Игра в оркестр. Исполняет Лиза Васина. 05.11.2017.

Юлиана Савилова песня Игра в оркестр. Исполняет Лиза Васина. 05.11.2017.

Gesture Recognition using Python (OpenCV Library)

Gesture Recognition using Python (OpenCV Library)

Мебель для куклы: Спальня принцессы Gloria / Princess bedroom Gloria doll furniture

Мебель для куклы: Спальня принцессы Gloria / Princess bedroom Gloria doll furniture

Istri Sah Bongkar di Balik Pernikahan Sedarah, Suami Telah Hamili Adik Kandung - BIS 04/07

Istri Sah Bongkar di Balik Pernikahan Sedarah, Suami Telah Hamili Adik Kandung - BIS 04/07

HALF LIFE Opposing Force - Full Game Walkthrough No Commentary | ХАЛФ ЛАЙФ - Полное Прохождение

HALF LIFE Opposing Force - Full Game Walkthrough No Commentary | ХАЛФ ЛАЙФ - Полное Прохождение

The Jacksonville Jaguars O-Line Could Derail It All

The Jacksonville Jaguars O-Line Could Derail It All

FUCK HARD (SLUT BANGED HARD IN THE POOL)

FUCK HARD (SLUT BANGED HARD IN THE POOL)

#shorts