Spark Dataframe
#bigdata #spark #apachespark #sparkdataframe #dataanalytics #dataengineering #bigdataonline
In this video we will cover the following topics:
1. Creating Spark session.
2. Creating Spark dataframe from CSV using Spark session.
3. Setting log level for Spark session.
4. Fetching all the column names in the dataframe.
5. Fetching the total number of rows in the dataframe.
6. Caching the dataframe.
7. Validating the cached dataframe.
8. Fetching top 10 rows.
9. Creating a new dataframe with only specific columns.
10. Fetching only distinct values.
11. Creating temp view.
12. Retrieving data from temp view.
The required documents and scripts are available at the GitHub URL below:
https://github.com/skiganesh/dataanal...
References:
https://spark.apache.org/docs/latest/...
https://www.jetbrains.com/pycharm/dow...
https://towardsdatascience.com/best-p...
https://www.jetbrains.com/help/pychar...
https://kaizen.itversity.com/setup-sp...