Automating Publisher Reporting on AWS: A Serverless Architecture with Slack Alerts

Gurudev Prasad Teketi · 2025-09-03T05:12:29+0100

Overview

In this article, I walk through building an automated reporting pipeline on AWS. The goal was to generate daily summary reports on publisher readership, detect missing metadata (such as books with no publisher), store the results as CSV files in S3, and deliver structured Slack notifications to internal stakeholders, all without manual intervention.

Architecture Diagram

Workflow:
1. Amazon EventBridge (Scheduler) triggers the workflow daily.

2. Lambda Function #1: Report Generator

   - Runs a named query on Amazon Athena that calculates reading activity data.
   - Stores results in an S3 bucket using a structured naming convention.
   - Sends a Slack message once the report is ready.

3. Lambda Function #2: Publisher Summary Aggregator

   - Fetches the latest CSV report from S3.
   - Aggregates per-publisher read counts by quarter.
   - Posts a clean, readable table to Slack showing publisher performance.

4. Lambda Function #3: Missing Publisher Detector (optional)

   - Runs a separate Athena query to find books with missing publisher info.
   - Sends a notification to Slack with a direct link to the generated file in S3.
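
The "structured naming convention" in step 2 is not spelled out in this article. As a minimal sketch, assuming a date-partitioned layout (the `publisher-reports` prefix, the file name pattern, and the `build_report_key` helper are all hypothetical, not the project's actual names), the final S3 key could be built like this:

```python
from datetime import date


def build_report_key(run_date: date, prefix: str = "publisher-reports") -> str:
    """Return a date-partitioned S3 key for the daily report CSV.

    The prefix and file name pattern are illustrative assumptions.
    """
    return (
        f"{prefix}/{run_date:%Y}/{run_date:%m}/"
        f"publisher_report_{run_date.isoformat()}.csv"
    )


print(build_report_key(date(2025, 9, 3)))
# publisher-reports/2025/09/publisher_report_2025-09-03.csv
```

One reason to prefer ISO dates in the key is that keys then sort lexicographically in chronological order, which makes "the latest report" easy to find.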

Project Structure

Code:

publisher-reporting/
│
├── deploy.sh                         # Infra provisioning and Lambda packaging
├── config.yaml                       # Runtime config (bucket names, cron schedules)
│
├── lambda/
│   ├── report_generator/
│   │   └── handler.py                # Runs Athena query and stores report in S3
│   ├── summary_report_notifier/
│   │   └── handler.py                # Aggregates data and posts to Slack
│   └── missing_publisher_report/
│       └── handler.py                # Detects and reports missing metadata
│
└── terraform/                        # Infra as Code (Lambda, EventBridge, IAM, etc.)

Code Snippets

Lambda: Report Generator

Code:

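import boto3

athena = boto3.client("athena")
s3 = boto3.client("s3")

# Start the Athena query; raw results are written under a temporary prefix.
exec_response = athena.start_query_execution(
    QueryString=query_string,
    QueryExecutionContext={"Database": ATHENA_DATABASE},
    ResultConfiguration={
        "OutputLocation": f"s3://{ATHENA_OUTPUT_BUCKET}/temporary-athena-query-results/"
    }
)
...
# Once the query has finished, copy the result file to the report bucket
# under its final, structured key.
s3.copy_object(
    Bucket=TARGET_REPORT_BUCKET,
    CopySource={"Bucket": source_bucket, "Key": source_key},
    Key=final_key
)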
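
The elided step between the two calls has to wait for Athena to finish, because `start_query_execution` returns immediately. The following is a sketch of one way to do that, not the project's actual code: the `wait_for_query` helper, its polling interval, and its timeout are assumptions. It polls `get_query_execution` and returns the bucket and key that could serve as `source_bucket` and `source_key`.

```python
import time
from urllib.parse import urlparse


def wait_for_query(athena, query_execution_id, poll_seconds=2.0, timeout_seconds=300.0):
    """Poll Athena until the query finishes; return (bucket, key) of the result file.

    `athena` is a boto3 Athena client, e.g. boto3.client("athena").
    The helper name and default timings are illustrative assumptions.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        execution = athena.get_query_execution(
            QueryExecutionId=query_execution_id
        )["QueryExecution"]
        state = execution["Status"]["State"]
        if state == "SUCCEEDED":
            # Athena reports the full s3:// path of the result file.
            location = urlparse(execution["ResultConfiguration"]["OutputLocation"])
            return location.netloc, location.path.lstrip("/")
        if state in ("FAILED", "CANCELLED"):
            reason = execution["Status"].get("StateChangeReason", "no reason given")
            raise RuntimeError(f"Athena query {query_execution_id} {state}: {reason}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Athena query {query_execution_id} still {state}")
        time.sleep(poll_seconds)
```

With the snippet above, this would be called as `source_bucket, source_key = wait_for_query(athena, exec_response["QueryExecutionId"])`. The timeout should stay below the Lambda function's own configured timeout.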

Lambda: Slack Summary Formatter

Code:

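from collections import defaultdict

# counts[year_quarter][publisher] -> number of rows flagged as read
counts = defaultdict(lambda: defaultdict(int))
# reader is a csv.DictReader over the report CSV
for row in reader:
    if row["book_read_counts"].strip().upper() == "TRUE":
        yq = row["year_quarter"].strip()
        pub = row["publisher"].strip() or "Unknown"
        counts[yq][pub] += 1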
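
To make the loop above easier to try locally, here is a self-contained sketch that wraps it in a function and feeds it a few made-up rows. The `count_reads` name and the sample data are assumptions for illustration; only the column names come from the snippet.

```python
import csv
import io
from collections import defaultdict


def count_reads(csv_text):
    """Count rows flagged as read, grouped by year_quarter and then publisher."""
    counts = defaultdict(lambda: defaultdict(int))
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        if row["book_read_counts"].strip().upper() == "TRUE":
            yq = row["year_quarter"].strip()
            # Rows with an empty publisher are grouped under "Unknown".
            pub = row["publisher"].strip() or "Unknown"
            counts[yq][pub] += 1
    return counts


# Made-up sample rows using the column names from the snippet above.
sample = """year_quarter,publisher,book_read_counts
2025-Q3,Acme Press,TRUE
2025-Q3,Acme Press,true
2025-Q3,,TRUE
2025-Q3,Beta Books,FALSE
"""

counts = count_reads(sample)
print(dict(counts["2025-Q3"]))  # {'Acme Press': 2, 'Unknown': 1}
```

In the Lambda, the CSV text would come from the S3 object body rather than an inline string.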

Scheduling

Code:

+------------------------------+------------------------------------------+-------------------------+
| Lambda Function              | Purpose                                  | Schedule (Cron Format)  |
+------------------------------+------------------------------------------+-------------------------+
| report_generator             | Run Athena query and save CSV to S3      | cron(0 0 * * ? *)       |
| summary_report_notifier      | Read latest CSV and post Slack summary   | cron(10 0 * * ? *)      |
| missing_publisher_report     | Detect books without publisher info      | cron(15 0 * * ? *)      |
+------------------------------+------------------------------------------+-------------------------+

# Notes:
# - Times are in UTC
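# - The 10–15 min stagger gives each job time to finish before the next one starts,
#   which reduces (but does not eliminate) the risk of overlap and race conditions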
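
The project provisions its schedules with Terraform, so the following Python is only an illustration of what the table amounts to in API terms. It is a sketch under assumptions: the `apply_schedules` helper and the `<function>-daily` rule names are hypothetical, and the EventBridge client is passed in by the caller.

```python
# Schedule expressions taken from the table above (all times UTC).
SCHEDULES = {
    "report_generator": "cron(0 0 * * ? *)",
    "summary_report_notifier": "cron(10 0 * * ? *)",
    "missing_publisher_report": "cron(15 0 * * ? *)",
}


def apply_schedules(events, schedules=SCHEDULES):
    """Create or update one EventBridge rule per Lambda function.

    `events` is a boto3 EventBridge client, e.g. boto3.client("events").
    Rule names follow a hypothetical "<function>-daily" convention.
    """
    rule_arns = {}
    for function_name, expression in schedules.items():
        response = events.put_rule(
            Name=f"{function_name}-daily",
            ScheduleExpression=expression,
            State="ENABLED",
        )
        rule_arns[function_name] = response["RuleArn"]
    return rule_arns
```

A rule on its own triggers nothing: each rule also needs the Lambda function attached as a target, and the function needs a permission that allows EventBridge to invoke it.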

Challenges & Fixes

1. Slack showed the same data daily

Issue:
The Slack message posted the same publisher data every day.

Fix:
The EventBridge schedule was set to run only on the 1st of each month, so no new report was generated in between.
Updated it to run daily using: cron(0 0 * * ? *)
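
A misconfigured schedule like this is easy to miss because the expression still looks valid. As a rough guard, a small check along the following lines could run in a deploy script or test. This is a sketch: the `is_daily` helper is hypothetical, and `cron(0 0 1 * ? *)` is only an assumed example of a first-of-the-month expression, since the original misconfigured value is not shown in this article.

```python
def is_daily(schedule_expression):
    """Return True if an EventBridge cron expression does not restrict the date.

    EventBridge cron fields are: minutes hours day-of-month month day-of-week year.
    """
    if not (schedule_expression.startswith("cron(") and schedule_expression.endswith(")")):
        raise ValueError(f"not a cron expression: {schedule_expression!r}")
    fields = schedule_expression[len("cron("):-1].split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    _minutes, _hours, day_of_month, month, day_of_week, year = fields
    return (
        day_of_month in ("*", "?")
        and month == "*"
        and day_of_week in ("*", "?")
        and year == "*"
    )


print(is_daily("cron(0 0 1 * ? *)"))  # False: only the 1st of each month
print(is_daily("cron(0 0 * * ? *)"))  # True: every day at 00:00 UTC
```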

2. No new publisher summary files after Aug 1

Issue:
The S3 bucket had no new files after August 1st.

Fix:
Found that the publisher report generator Lambda was not running.
Corrected its EventBridge schedule to trigger daily.

3. Lambda race condition

Issue:
The Slack-posting Lambda sometimes read an older CSV file instead of the one just generated.

Fix:
Introduced a 10-minute delay between the generator Lambda and the Slack reporter Lambda by giving them separate schedules.
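
A fixed delay still depends on the generator finishing in time. As a complementary safeguard (an addition for illustration, not something described as part of the project), the notifier could select the newest CSV by its `LastModified` timestamp and refuse to post if that file is too old. The `latest_csv` and `is_fresh` helpers and the one-hour threshold are assumptions.

```python
from datetime import datetime, timedelta, timezone


def latest_csv(s3, bucket, prefix=""):
    """Return the newest .csv object under a prefix as a dict, or None if there is none.

    `s3` is a boto3 S3 client, e.g. boto3.client("s3").
    """
    newest = None
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = s3.list_objects_v2(**kwargs)
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".csv") and (
                newest is None or obj["LastModified"] > newest["LastModified"]
            ):
                newest = obj
        if not page.get("IsTruncated"):
            return newest
        # Results are paginated; keep going until the last page.
        kwargs["ContinuationToken"] = page["NextContinuationToken"]


def is_fresh(obj, max_age=timedelta(hours=1), now=None):
    """Return True if the object was written within max_age of now."""
    now = now or datetime.now(timezone.utc)
    return obj is not None and now - obj["LastModified"] <= max_age
```

If `is_fresh` returns False, the notifier can raise or post a warning instead of silently repeating the previous day's numbers.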

4. Slack output was hard to read

Issue:
The publisher read counts in Slack were misaligned and difficult to follow.

Fix:
Wrapped the table in Slack-compatible triple backticks (``` ... ```) so that Slack renders it as a preformatted, monospaced block and the columns line up.
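
As a sketch of what that formatting step might look like, the following pads each column to a fixed width and wraps the result in a preformatted block. The function names are hypothetical, and posting through a Slack incoming webhook is an assumption, since this article does not say which Slack integration the project uses.

```python
import json
import urllib.request

FENCE = "`" * 3  # Slack renders text between triple backticks as monospaced


def format_table(publisher_counts):
    """Render {publisher: reads} as an aligned table inside a Slack code block."""
    # Highest read count first; ties are broken alphabetically.
    rows = sorted(publisher_counts.items(), key=lambda item: (-item[1], item[0]))
    name_width = max([len("Publisher")] + [len(name) for name, _ in rows])
    lines = [f"{'Publisher':<{name_width}}  {'Reads':>5}"]
    lines.append("-" * (name_width + 7))
    lines += [f"{name:<{name_width}}  {reads:>5}" for name, reads in rows]
    return FENCE + "\n" + "\n".join(lines) + "\n" + FENCE


def post_to_slack(webhook_url, text):
    """Send text to a Slack incoming webhook; the URL is supplied by the caller."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


print(format_table({"Unknown": 3, "Acme Press": 12}))
```

Alignment only holds in a monospaced block, which is why the padding and the triple backticks go together.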

5. S3 bucket getting cluttered

Issue:
Temporary Athena result files were cluttering the output bucket.

Fix:
Moved the results under a 'temporary-athena-query-results/' prefix and added a lifecycle policy to delete them automatically after 3 days.
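
In this project the lifecycle policy would most likely live in the Terraform configuration. Purely as an illustration of the rule's shape, here is a sketch using the S3 API from Python; the `expire_temp_results` helper and the rule ID are hypothetical.

```python
def expire_temp_results(s3, bucket, prefix="temporary-athena-query-results/", days=3):
    """Apply a lifecycle rule that deletes objects under `prefix` after `days` days.

    `s3` is a boto3 S3 client, e.g. boto3.client("s3"). Note that this call
    replaces any lifecycle configuration already set on the bucket.
    """
    config = {
        "Rules": [
            {
                "ID": "expire-temporary-athena-results",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }
    s3.put_bucket_lifecycle_configuration(Bucket=bucket, LifecycleConfiguration=config)
    return config
```

Scoping the rule to the temporary prefix matters: the final reports in the same bucket, if any, are left untouched.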

Impact

Eliminated all manual report generation
Improved team visibility into reader engagement
Ensured scalable, serverless infrastructure using AWS best practices
Automated alerts improved issue tracking and data consistency

This project was a great example of combining Athena, Lambda, S3, and EventBridge into a cost-efficient, automated reporting pipeline. If you’re working with serverless data workflows, this pattern is easily adaptable to product analytics, user activity tracking, sales dashboards, and more.
