v0.1.0-release
* readme changed

* event-bridge rule script updated

---------

Co-authored-by: Michael Greenshtein <greensht@amazon.com>
moryachok and Michael Greenshtein authored May 11, 2024
1 parent bf35689 commit b8d0fd9
Showing 1 changed file with 10 additions and 4 deletions.
14 changes: 10 additions & 4 deletions README.md
@@ -1,8 +1,9 @@
## Monitoring Apache Iceberg Table metadata layer using AWS Lambda, AWS Glue and AWS CloudWatch

This repository provides you with sample code that collects metrics of existing Apache Iceberg tables managed in Amazon S3 and catalogued in AWS Glue Data Catalog. The code consists of an AWS Lambda deployment package that collects and submits metrics to AWS CloudWatch. The repository also includes a helper script for deploying a CloudWatch monitoring dashboard to visualize the collected metrics.
This repository provides you with a sample solution that collects metrics of existing Apache Iceberg tables managed in Amazon S3 and catalogued in AWS Glue Data Catalog. The solution consists of an AWS Lambda deployment package that collects and submits metrics to AWS CloudWatch. The repository also includes a helper script for deploying a CloudWatch monitoring dashboard to visualize the collected metrics.

### Table of Contents
- [Solution Tenets](#solution-tenets)
- [Technical implementation](#technical-implementation)
- [Metrics collected](#metrics-collected)
- [Setup](#setup)
@@ -14,7 +15,10 @@ This repository provides you with sample code that collects metrics of existing
- [Security](#security)
- [License](#license)


### Solution Tenets
* The solution provides time-series metrics for Apache Iceberg tables so that you can monitor them over time and recognize trends and anomalies.
* The solution is lightweight: it collects metrics exclusively from the Apache Iceberg metadata layer without scanning the data layer, and hence needs no heavy compute capacity.
* In the future, we strive to reduce the dependency on AWS Glue in favor of AWS Lambda compute once the required features are available in the [PyIceberg](https://py.iceberg.apache.org) library.
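
To illustrate the metadata-only tenet, here is a minimal sketch of turning an Iceberg snapshot summary into CloudWatch metric entries. It is not the code shipped in this repository: the function `summary_to_metric_data`, the metric names, the `TableName` dimension and the example values are all hypothetical. The summary keys (`total-records`, `total-data-files`, `total-delete-files`, `total-files-size`) are optional fields defined by the Iceberg table spec, so the sketch skips any that are missing.

```python
# Hypothetical mapping from Iceberg snapshot-summary keys to metric names.
# The metric names are illustrative, not the ones this repository emits.
SUMMARY_TO_METRIC = {
    "total-records": "TotalRecords",
    "total-data-files": "TotalDataFiles",
    "total-delete-files": "TotalDeleteFiles",
    "total-files-size": "TotalFilesSizeBytes",
}


def summary_to_metric_data(table_name, summary):
    """Convert a snapshot summary (string-valued dict) into CloudWatch MetricData entries."""
    metric_data = []
    for key, metric_name in SUMMARY_TO_METRIC.items():
        if key not in summary:
            continue  # summary fields are optional, so skip the absent ones
        metric_data.append({
            "MetricName": metric_name,
            "Dimensions": [{"Name": "TableName", "Value": table_name}],
            "Value": float(summary[key]),
            "Unit": "Bytes" if key == "total-files-size" else "Count",
        })
    return metric_data


# Made-up summary values, shaped like the metadata of a single snapshot.
example_summary = {
    "operation": "append",
    "total-records": "1200",
    "total-data-files": "8",
    "total-files-size": "52428800",
}
metric_data = summary_to_metric_data("db.events", example_summary)
```

The resulting list has the shape expected by the `MetricData` parameter of the boto3 CloudWatch client's `put_metric_data` call. Only table metadata is read, so no data files are scanned.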

### Technical implementation

@@ -126,8 +130,9 @@ import boto3
import json

# Initialize a boto3 client
lambda_client = boto3.client('lambda')
events_client = boto3.client('events')
session = boto3.Session(region_name='<<SET CORRECT AWS REGION>>')
lambda_client = session.client('lambda')
events_client = session.client('events')

# Parameters
lambda_function_arn = '<<REPLACE WITH LAMBDA FUNCTION ARN>>'
@@ -246,6 +251,7 @@ https://docs.docker.com/get-docker/

1. Delete the AWS Lambda function with `sam delete`.
2. Delete the CloudWatch dashboard.
3. Delete the EventBridge rule.
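
If you prefer to script the last two cleanup steps, the following is a minimal sketch using boto3, not part of the repository's own scripts. The helper names `delete_rule_with_targets` and `delete_dashboard` are hypothetical, and the rule and dashboard names are placeholders you must fill in. EventBridge requires a rule's targets to be removed before the rule can be deleted, so the sketch does that first. It assumes the rule has few enough targets to fit in a single `list_targets_by_rule` response.

```python
def delete_rule_with_targets(events_client, rule_name):
    """Remove every target from an EventBridge rule, then delete the rule."""
    response = events_client.list_targets_by_rule(Rule=rule_name)
    target_ids = [target["Id"] for target in response.get("Targets", [])]
    if target_ids:
        # A rule cannot be deleted while it still has targets attached.
        events_client.remove_targets(Rule=rule_name, Ids=target_ids)
    events_client.delete_rule(Name=rule_name)
    return target_ids


def delete_dashboard(cloudwatch_client, dashboard_name):
    """Delete a single CloudWatch dashboard by name."""
    cloudwatch_client.delete_dashboards(DashboardNames=[dashboard_name])


# Example usage (requires boto3 and AWS credentials):
# import boto3
# session = boto3.Session(region_name='<<SET CORRECT AWS REGION>>')
# delete_rule_with_targets(session.client('events'), '<<REPLACE WITH RULE NAME>>')
# delete_dashboard(session.client('cloudwatch'), '<<REPLACE WITH DASHBOARD NAME>>')
```

The clients are passed in as arguments so that the same session and region used when creating the rule can be reused here.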

## Security
