Connecting Spark Application with GCS | Google Cloud Storage | Detailed Explanation

Published: 13 March 2025
Channel: GK Codelabs

Hello everyone,
In this video I explain how to connect your Apache Spark (Scala) application to Google Cloud Storage (GCS) in a few simple steps.
We also cover GCS bucket creation, authentication with Google-managed keys, and GCP service accounts, along with the Spark + Scala code for the GCS connection.

Please use the time codes below to jump directly to the topics that interest you.

Time Codes
=======
00:00 - Introduction
01:00 - Topics to cover
02:31 - Spark Code Structure for GCP Connection
05:22 - Dataset and a test run in the local file system
06:58 - Spark application settings for Google Cloud Storage connection
09:41 - Hadoop configurations to be set in Spark application
12:48 - Bucket creation in GCS
17:29 - Service accounts in GCP and their significance for the Spark connection
21:02 - Key management for GCS Bucket
21:27 - Generating keys in the GCP service account
23:13 - Using GCS Authentication Keys in Spark configs
24:43 - Uploading Data to GCS
26:07 - Running the Spark application read operation with GCS config
26:50 - Spark with GCS run validation
27:07 - Spark application write operation validation
29:14 - Google Cloud resources termination
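
For readers who want a rough picture of the steps listed above before watching, here is a minimal Scala sketch of setting the Hadoop configuration for GCS in a Spark application and then reading from and writing to a bucket. It is not the code from the video. It assumes Spark and the GCS Hadoop connector (`gcs-connector`) are on the classpath, and that the configuration key names match the 2.x connector releases (newer releases may use different authentication keys). The key file path, bucket name, and object paths are hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession

object GcsSparkSketch {

  // Hadoop settings for the GCS connector.
  // Assumption: key names as used by gcs-connector 2.x; check the
  // documentation of the connector version you actually use.
  def gcsHadoopSettings(keyFilePath: String): Map[String, String] = Map(
    "fs.gs.impl" -> "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem",
    "fs.AbstractFileSystem.gs.impl" -> "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS",
    "google.cloud.auth.service.account.enable" -> "true",
    "google.cloud.auth.service.account.json.keyfile" -> keyFilePath
  )

  def main(args: Array[String]): Unit = {
    // Hypothetical values: replace with your own service account key and bucket.
    val keyFilePath = "/path/to/service-account-key.json"
    val bucket = "my-example-bucket"

    val spark = SparkSession.builder()
      .appName("gcs-spark-sketch")
      .master("local[*]")
      .getOrCreate()

    // Apply the GCS settings to the Hadoop configuration used by Spark.
    val hadoopConf = spark.sparkContext.hadoopConfiguration
    gcsHadoopSettings(keyFilePath).foreach { case (k, v) => hadoopConf.set(k, v) }

    // Read a CSV file that was uploaded to the bucket.
    val df = spark.read.option("header", "true").csv(s"gs://$bucket/input/data.csv")
    df.show(5)

    // Write the data back to the bucket to validate write access.
    df.write.mode("overwrite").parquet(s"gs://$bucket/output/data_parquet")

    spark.stop()
  }
}
```

The service account behind the key file needs read and write permissions on the bucket. Keep the JSON key out of version control.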


To become a GKCodelabs Extended plan member, use the links below to purchase the Big Data end-to-end pipeline course in your preferred language, Python or Scala.

PySpark course available at
https://courses.gkcodelabs.com/produc...

Spark + SCALA course available at
https://courses.gkcodelabs.com/produc...


Starter Pack available at just: ₹549 (For Indian Payments) or $9 (For non-Indian payments)
Extended Pack available at just: ₹1299 (For Indian Payments) or $19 (For non-Indian payments)
Queries? Write to us at [email protected]
Website: https://www.gkcodelabs.com