A Detailed Walkthrough: How to Create Catalog Information, Including a Database and Table, in AWS Athena

Published: 22 March 2025
Channel: Analytica Learning

AWS Athena:
AWS Athena is a serverless query service that lets you analyze data stored in Amazon S3 using SQL. It is particularly useful for ad hoc querying and interactive analysis of data in a data lake or data warehouse on S3.
Athena does not require you to load or transform data before querying. Instead, it works directly on the data stored in S3. Queries run on demand, and pricing is based on the amount of data each query scans.
You can use standard SQL to run queries on semi-structured or structured data formats like JSON, Parquet, ORC, and more.
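
Because pricing follows the amount of data scanned, a rough estimate is simple arithmetic. The sketch below is illustrative only: the price per terabyte, the decimal terabyte, and the per-query minimum are assumptions chosen for the example, not figures from the video, so check the current Athena pricing page for your region before relying on them.

```python
# Rough cost estimate for an Athena query, based on bytes scanned.
# All three constants are ASSUMPTIONS for illustration; verify them
# against the current Athena pricing page for your region.
PRICE_PER_TB_USD = 5.0         # assumed price per terabyte scanned
BYTES_PER_TB = 10**12          # assumed decimal terabyte
MIN_BILLED_BYTES = 10 * 10**6  # assumed per-query billing minimum


def estimate_query_cost(bytes_scanned: int,
                        price_per_tb: float = PRICE_PER_TB_USD) -> float:
    """Return an approximate query cost in USD for the given scan size."""
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    # Small queries are billed at the assumed minimum scan size.
    billed = max(bytes_scanned, MIN_BILLED_BYTES)
    return billed / BYTES_PER_TB * price_per_tb


if __name__ == "__main__":
    # Hypothetical sizes: the same dataset as raw JSON and as Parquet.
    for label, size in [("JSON", 200 * 10**9), ("Parquet", 20 * 10**9)]:
        print(f"{label}: ~${estimate_query_cost(size):.2f} per full scan")
```

The comparison at the end hints at why columnar formats such as Parquet and ORC are popular with Athena: a query that reads fewer bytes costs less.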

AWS Glue:
AWS Glue is an ETL (Extract, Transform, Load) service provided by AWS. It is designed for data preparation and transformation tasks, making it easier to prepare data for analysis.
Glue provides a managed and serverless environment for performing ETL operations. It automates many of the tasks involved in data transformation, such as data extraction, cleaning, and enrichment.
You define ETL jobs in AWS Glue using a visual interface or code, and it can handle both batch and streaming data processing.
Glue also provides the AWS Glue Data Catalog, which stores metadata about your data sources (databases, tables, columns, and S3 locations), making it easier to discover and understand your data assets.
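
To make the idea of catalog metadata concrete, here is a minimal sketch of what a table definition for a Parquet dataset might look like in the shape the Glue API expects. The function name, table name, columns, and bucket are hypothetical; the key names follow the Glue `TableInput` structure, and the format classes are the standard Hive Parquet ones.

```python
from typing import Dict, List, Tuple


def build_table_input(name: str, location: str,
                      columns: List[Tuple[str, str]]) -> Dict:
    """Build a Glue-style table definition for Parquet files on S3.

    Note that no data is copied: the catalog only records where the
    files live and how to read them.
    """
    return {
        "Name": name,
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": "parquet"},
        "StorageDescriptor": {
            "Columns": [{"Name": n, "Type": t} for n, t in columns],
            "Location": location,
            "InputFormat":
                "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
            "OutputFormat":
                "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary":
                    "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe",
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical table: names and bucket are placeholders.
    table = build_table_input(
        "orders",
        "s3://example-bucket/orders/",
        [("order_id", "string"), ("amount", "double")],
    )
    print(table["StorageDescriptor"]["Columns"])
```

With boto3 installed and credentials configured, a dictionary like this could be passed as the `TableInput` argument of the Glue client's `create_table` call, alongside a `DatabaseName`.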

In this video, we look at what AWS Athena does and how it relates to AWS Glue. The session covers creating catalog information, including databases and tables, so that SQL commands can be run against the data from Athena. Finally, it walks through the 'Create Table' command in detail, showing how it registers the table's metadata in AWS Glue.
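
As a rough companion to the walkthrough, the sketch below builds the two DDL statements involved: one to create a database and one to create an external table over Parquet files in S3. The database, table, column, and bucket names are hypothetical placeholders, not the ones used in the video, and the name check is a deliberately conservative assumption.

```python
import re
from typing import List, Tuple

# Conservative identifier check (an assumption for this sketch):
# lowercase letters, digits, and underscores only.
_NAME_RE = re.compile(r"^[a-z0-9_]+$")


def _check(name: str) -> str:
    if not _NAME_RE.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def build_create_database(database: str) -> str:
    """Return DDL that creates the database if it does not exist."""
    return f"CREATE DATABASE IF NOT EXISTS `{_check(database)}`"


def build_create_table(database: str, table: str,
                       columns: List[Tuple[str, str]],
                       location: str) -> str:
    """Return DDL for an external Parquet table at an S3 location."""
    cols = ",\n".join(f"  `{_check(n)}` {t}" for n, t in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS "
        f"`{_check(database)}`.`{_check(table)}` (\n"
        f"{cols}\n"
        ")\n"
        "STORED AS PARQUET\n"
        f"LOCATION '{location}'"
    )


def run_in_athena(sql: str, output_location: str) -> str:
    """Submit a statement to Athena and return its query execution ID.

    Requires boto3 and configured AWS credentials; imported lazily so
    the builders above work without it.
    """
    import boto3

    client = boto3.client("athena")
    response = client.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    print(build_create_database("sales_db"))
    print(build_create_table(
        "sales_db", "orders",
        [("order_id", "string"), ("amount", "double")],
        "s3://example-bucket/orders/",
    ))
```

The same statements can be pasted directly into the Athena query editor. Either way, running them writes no data: they only record the table definition in the Glue Data Catalog, which is what later lets Athena run SELECT queries against the files in S3.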