Talk 1: Deep Dive into Apache Beam (Tyler Akidau and Reuven Lax, Google)
In this talk, Tyler and Reuven will describe the architecture of Apache Beam and of Google Cloud Dataflow - a well-designed, high-performance streaming engine.
Apache Beam is a set of open source SDKs for writing pipelines, which you can then run on any platform with a supported Runner (currently: Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow).
Cloud Dataflow is Google's closed-source execution engine provided as a managed service on Google Cloud for running Beam pipelines.
The goal of this talk is to understand the unique patterns, design choices, and trade-offs that make Apache Beam and Cloud Dataflow compelling options.
Speaker Bios
Tyler Akidau and Reuven Lax are Software Engineers at Google. Tyler is focused on the Apache Beam streaming programming model. Reuven is focused on the Google Cloud Dataflow streaming execution engine.
Talk 2: Deep Dive into Flink Streaming (Jamie Grier, Data Artisans)
In this talk, Jamie will describe the architecture of Flink Streaming - a well-designed, high-performance streaming engine.
While some comparisons will be made to Spark Streaming, this talk is not intended to convince people to switch to Flink Streaming.
The goal of this talk is to understand the unique patterns, design choices, and trade-offs that make Flink Streaming a compelling option.
Speaker Bio
Jamie Grier (based in San Francisco) is Director of Application Development at Data Artisans (based in Berlin).
Jamie also serves as Developer Advocate, Solution Architect, Sales Engineer, and many other roles required by a startup!
Jamie's wife recently had a baby, so please congratulate him!
Talk 3: Incremental, Online, Continuous, and Parallel Training and Serving of Spark ML and TensorFlow Models with Kafka, Docker, and Kubernetes (Chris Fregly, PipelineIO)
The goal of this talk is to build and demo a continuous-delivery pipeline that trains and serves Spark ML and TensorFlow models in parallel, using Kafka, Docker, Kubernetes, and Netflix Open Source.
Speaker Bio
Chris Fregly is a Research Scientist at PipelineIO - a Streaming Analytics and Machine Learning Startup in San Francisco.
Chris is also an Apache Spark Contributor, Netflix Open Source Committer, Founder of the Global Advanced Spark and TensorFlow Meetup, and Creator of the upcoming O'Reilly Video Series on Deploying and Scaling TensorFlow Distributed and TensorFlow Serving in Production.
Previously, Chris was a Distributed Systems Engineer at Databricks and Netflix - as well as a founding member of the IBM Spark Technology Center in San Francisco.