Apache Kafka with Spark streaming and HBase Integration in scala | Streaming data pipeline

Published: 05 October 2024
Channel: GK Codelabs

Hello guys,
In this video I have created a big data pipeline that takes live input from a Kafka producer into a Spark Streaming application, processes it, and stores the output in HBase.
This is a simple, small project that could help you understand how the streaming APIs integrate.
Below are the other relevant videos from my channel:

Spark Streaming with Kafka
---------------------------------------------------
   • Spark streaming with KAFKA | Complete...  

Installation of Kafka on Cloudera QuickStart VM
----------------------------------------------------------------------------
   • Installing Apache Kafka in Cloudera Q...  

CLI Commands used in the video:
------------------------------------------------------------
Starting the producer
/usr/lib/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic kafkatutorial

Creating the HBase table (run inside the HBase shell)
create 'Streaming_wordcount', 'Word_count', 'Occurances'
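
Sketch of the Spark Streaming job (not the exact code from the video)
------------------------------------------------------------
For anyone who wants to see how the pieces above could fit together, here is a minimal sketch of a word count job that reads from the kafkatutorial topic on localhost:9092 and writes to the Streaming_wordcount table. Several things here are assumptions, not taken from the video: the spark-streaming-kafka-0-10 connector, the HBase 1.x+ client API, the 10-second batch interval, the consumer group id "wordcount-group", the column qualifiers "word" and "count", and the choice to use the word itself as the row key. It also writes the count for each batch only, so a later batch overwrites the earlier value for the same word.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

object KafkaToHBaseWordCount {

  // Split a line into non-empty, whitespace-separated words.
  def tokenize(line: String): Seq[String] =
    line.trim.split("\\s+").filter(_.nonEmpty).toSeq

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("KafkaToHBaseWordCount").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(10)) // batch interval is an assumption

    // Broker address and topic come from the producer command above.
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "wordcount-group", // hypothetical consumer group name
      "auto.offset.reset" -> "latest"
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("kafkatutorial"), kafkaParams)
    )

    // Word count within each micro-batch.
    val counts = stream
      .map(record => record.value)
      .flatMap(line => tokenize(line))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    counts.foreachRDD { rdd =>
      rdd.foreachPartition { partition =>
        if (partition.nonEmpty) {
          // One HBase connection per partition, opened on the executor.
          val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
          val table = connection.getTable(TableName.valueOf("Streaming_wordcount"))
          try {
            partition.foreach { case (word, count) =>
              val put = new Put(Bytes.toBytes(word)) // row key = the word (assumption)
              // Qualifiers "word" and "count" are hypothetical names.
              put.addColumn(Bytes.toBytes("Word_count"), Bytes.toBytes("word"), Bytes.toBytes(word))
              put.addColumn(Bytes.toBytes("Occurances"), Bytes.toBytes("count"), Bytes.toBytes(count.toString))
              table.put(put)
            }
          } finally {
            table.close()
            connection.close()
          }
        }
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

To try it, type a few lines into the console producer and then run scan 'Streaming_wordcount' in the HBase shell to check the rows. The connection is opened inside foreachPartition because HBase connections are not serializable and cannot be shipped from the driver to the executors.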