In this video, I introduce Apache Spark using Scala. We'll walk through a quick demo on Azure Synapse Analytics, an integrated analytics platform within the Microsoft Azure cloud. This short demo is meant for those who are curious about Spark with Scala or just want a peek at Spark in Azure Synapse. If you are new to Apache Spark, just know that it is a popular framework among data engineers and can run in a variety of environments. It is popular because it enables distributed data processing with a relatively simple API. If you want to see examples in Python or C#, you can check out one of my other videos where I walk through a similar demo.
You can follow along to build a Spark data load that reads linked sample data, transforms it, joins it to a lookup table, and saves the result in Delta Lake format to your Azure Data Lake Storage Gen2 account. Please be aware that you will incur costs by following this example. To keep costs minimal, make the Spark pool small and keep the default 15-minute auto-terminate setting.
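For a rough idea of what that data load looks like in Scala, here is a minimal sketch. It is not the code from the video: the column names (`pickup_time`, `fare`, `zone_id`), the folder layout, and the `<container>` and `<account>` placeholders are all assumptions for illustration, and it assumes the Delta Lake library is available on the Spark pool.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, to_date}

object DataLoadSketch {
  // Transform and join step. Column names here are hypothetical.
  def transformAndJoin(trips: DataFrame, zones: DataFrame): DataFrame =
    trips
      .withColumn("pickup_date", to_date(col("pickup_time"))) // derive a date column
      .filter(col("fare") > 0)                                 // drop rows with a non-positive fare
      .join(zones, Seq("zone_id"), "left")                     // enrich from the lookup table

  def main(args: Array[String]): Unit = {
    // In a notebook a session usually already exists; getOrCreate() reuses it.
    val spark = SparkSession.builder().appName("data-load-sketch").getOrCreate()

    // Placeholder ADLS Gen2 location: replace <container> and <account> with your own.
    val base = "abfss://<container>@<account>.dfs.core.windows.net"

    // Hypothetical input folders, both assumed to hold Parquet files.
    val trips = spark.read.parquet(s"$base/raw/trips")
    val zones = spark.read.parquet(s"$base/lookup/zones")

    // Save the joined result in Delta Lake format.
    transformAndJoin(trips, zones)
      .write
      .format("delta")
      .mode("overwrite")
      .save(s"$base/curated/trips_delta")
  }
}
```

Keeping the transform and join in its own function means you can try it on a small in-memory DataFrame before pointing it at real storage.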
Related Article: https://dustinvannoy.com/2021/02/03/a...
Code: https://github.com/datakickstart/syna...
C# demo: Azure Synapse Spark .NET (C#)
Python demo: Coming soon
More from Dustin:
Website: dustinvannoy.com
Twitter: @dustinvannoy
GitHub: https://github.com/datakickstart