Automating Publisher Reporting on AWS: A Serverless Architecture with Slack Alerts

Gurudev Prasad Teketi

Overview​


In this article, I walk through building an automated reporting pipeline on AWS. The goal was to generate daily summary reports on publisher readership, detect missing metadata (such as books with no publisher), store the results as CSV files in S3, and deliver structured Slack notifications to internal stakeholders, all without manual intervention.

Architecture Diagram​




Workflow:
1. Amazon EventBridge (Scheduler) triggers the workflow daily.

2. Lambda Function #1: Report Generator

  • Runs a named query on Amazon Athena that calculates reading activity data.
  • Stores results in an S3 bucket using a structured naming convention.
  • Sends a Slack message once the report is ready.

3. Lambda Function #2: Publisher Summary Aggregator

  • Fetches the latest CSV report from S3.
  • Aggregates per-publisher read counts by quarter.
  • Posts a clean, readable table to Slack showing publisher performance.

4. Lambda Function #3: Missing Publisher Detector (optional)

  • Runs a separate Athena query to find books with missing publisher info.
  • Sends a notification to Slack with a direct link to the generated file in S3.
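
The workflow mentions a structured naming convention for report files without spelling it out. As a minimal sketch, assuming a hypothetical date-partitioned layout (the `reports` prefix, the function name, and the layout itself are illustrative, not the scheme actually used in this project), a key builder might look like this:

```python
from datetime import date


def build_report_key(report_name: str, run_date: date, prefix: str = "reports") -> str:
    """Return a date-partitioned S3 key for a daily report (hypothetical layout)."""
    return (
        f"{prefix}/{report_name}/{run_date:%Y}/{run_date:%m}/"
        f"{report_name}_{run_date.isoformat()}.csv"
    )
```

With ISO dates in the key, a plain lexicographic sort of the keys is also a chronological sort, which makes "fetch the latest CSV" straightforward for the downstream Lambda.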

Project Structure​


Code:
publisher-reporting/
│
├── deploy.sh                         # Infra provisioning and Lambda packaging
├── config.yaml                       # Runtime config (bucket names, cron schedules)
│
├── lambda/
│   ├── report_generator/
│   │   └── handler.py                # Runs Athena query and stores report in S3
│   ├── summary_report_notifier/
│   │   └── handler.py                # Aggregates data and posts to Slack
│   └── missing_publisher_report/
│       └── handler.py                # Detects and reports missing metadata
│
└── terraform/                        # Infra as Code (Lambda, EventBridge, IAM, etc.)

Code Snippets​


Lambda: Report Generator


Code:
# Start the Athena query; raw results are written under a temporary prefix
exec_response = athena.start_query_execution(
    QueryString=query_string,
    QueryExecutionContext={"Database": ATHENA_DATABASE},
    ResultConfiguration={
        "OutputLocation": f"s3://{ATHENA_OUTPUT_BUCKET}/temporary-athena-query-results/"
    }
)
...
# Copy the finished result file into the report bucket under its final key
s3.copy_object(
    Bucket=TARGET_REPORT_BUCKET,
    CopySource={"Bucket": source_bucket, "Key": source_key},
    Key=final_key
)
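
Athena runs queries asynchronously, so the result file only exists once the query has finished. The snippet above elides that step. A minimal sketch of one way to wait for completion and derive `source_bucket` and `source_key` is shown here; the helper names, polling interval, and timeout are assumptions for illustration, not the project's actual code:

```python
import time


def wait_for_query(athena, query_execution_id, poll_seconds=2.0, timeout_seconds=300.0):
    """Poll Athena until the query finishes and return its S3 OutputLocation.

    `athena` is a boto3 Athena client (or any object exposing the same
    get_query_execution method). Raises on failure, cancellation, or timeout.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = athena.get_query_execution(QueryExecutionId=query_execution_id)
        execution = response["QueryExecution"]
        state = execution["Status"]["State"]
        if state == "SUCCEEDED":
            return execution["ResultConfiguration"]["OutputLocation"]
        if state in ("FAILED", "CANCELLED"):
            reason = execution["Status"].get("StateChangeReason", "no reason given")
            raise RuntimeError(f"Athena query {query_execution_id} {state}: {reason}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Athena query {query_execution_id} still {state}")
        time.sleep(poll_seconds)


def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    if not uri.startswith("s3://"):
        raise ValueError(f"not an S3 URI: {uri}")
    bucket, _, key = uri[len("s3://"):].partition("/")
    return bucket, key
```

In the handler this could be used as `source_bucket, source_key = split_s3_uri(wait_for_query(athena, exec_response["QueryExecutionId"]))`. Keep the timeout below the Lambda's own configured timeout so the function fails with a clear error instead of being killed mid-poll.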

Lambda: Slack Summary Formatter


Code:
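from collections import defaultdict

# counts[year_quarter][publisher] -> number of rows flagged as read
counts = defaultdict(lambda: defaultdict(int))

# `reader` is assumed to be a csv.DictReader over the latest report CSV
for row in reader:
    if row["book_read_counts"].strip().upper() == "TRUE":
        yq = row["year_quarter"].strip()
        pub = row["publisher"].strip() or "Unknown"
        counts[yq][pub] += 1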
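
To make the aggregation concrete, here is a self-contained sketch that runs the same counting logic over a small in-memory CSV. The column names come from the snippet above; the sample rows and the function name are invented for illustration, and the `or ""` guards are an added assumption to tolerate short rows:

```python
import csv
import io
from collections import defaultdict


def aggregate_reads(csv_text):
    """Count rows flagged as read, grouped by year_quarter and then publisher."""
    counts = defaultdict(lambda: defaultdict(int))
    for row in csv.DictReader(io.StringIO(csv_text)):
        if (row.get("book_read_counts") or "").strip().upper() == "TRUE":
            yq = (row.get("year_quarter") or "").strip()
            # Blank publishers are grouped under "Unknown", as in the original snippet
            pub = (row.get("publisher") or "").strip() or "Unknown"
            counts[yq][pub] += 1
    # Convert to plain dicts so the result is easy to log or serialize
    return {yq: dict(pubs) for yq, pubs in counts.items()}


# Invented sample data, for illustration only
SAMPLE = """year_quarter,publisher,book_read_counts
2024-Q3,Acme Press,true
2024-Q3,Acme Press,TRUE
2024-Q3,,true
2024-Q3,Other House,false
2024-Q2,Acme Press,true
"""

print(aggregate_reads(SAMPLE))
```

Rows whose flag is not `TRUE` (case-insensitive) are skipped entirely, so a publisher with no reads in a quarter simply does not appear in that quarter's summary.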

Scheduling​


Code:
+------------------------------+------------------------------------------+-------------------------+
| Lambda Function              | Purpose                                  | Schedule (Cron Format)  |
+------------------------------+------------------------------------------+-------------------------+
| report_generator             | Run Athena query and save CSV to S3      | cron(0 0 * * ? *)       |
| summary_report_notifier      | Read latest CSV and post Slack summary   | cron(10 0 * * ? *)      |
| missing_publisher_report     | Detect books without publisher info      | cron(15 0 * * ? *)      |
+------------------------------+------------------------------------------+-------------------------+

# Notes:
# - Times are in UTC
# - The 10 and 15 minute offsets reduce the risk of overlap and race conditions, provided the generator finishes within 10 minutes
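
The project provisions its schedules through Terraform. Purely as an illustration in Python, the same three cron expressions could be registered as EventBridge rules with boto3. This is a sketch under assumptions: the rule-name suffix and function name are hypothetical, and wiring each rule to its Lambda (targets and invoke permissions) is left out.

```python
SCHEDULES = {
    "report_generator": "cron(0 0 * * ? *)",
    "summary_report_notifier": "cron(10 0 * * ? *)",
    "missing_publisher_report": "cron(15 0 * * ? *)",
}


def create_daily_rules(events, schedules=SCHEDULES, suffix="-daily"):
    """Create or update one EventBridge rule per Lambda and return the rule names.

    `events` is a boto3 EventBridge client, e.g. boto3.client("events").
    Attaching each rule to its Lambda is not covered by this sketch.
    """
    names = []
    for function_name, expression in schedules.items():
        rule_name = f"{function_name}{suffix}"  # hypothetical naming
        events.put_rule(
            Name=rule_name,
            ScheduleExpression=expression,
            State="ENABLED",
        )
        names.append(rule_name)
    return names
```

EventBridge cron expressions have six fields (minutes, hours, day-of-month, month, day-of-week, year) and are evaluated in UTC. A day-of-month of `1`, as in `cron(0 0 1 * ? *)`, fires only on the first of each month, while `*` in that field fires every day.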

Challenges & Fixes​


1. Slack showed the same data daily

Issue:
The Slack message posted the same publisher data every day.

Fix:
The EventBridge schedule was set to run only on the 1st of each month. Updated it to run daily using: cron(0 0 * * ? *)

2. No new publisher summary files after Aug 1

Issue:
The S3 bucket had no updated files after August 1st.

Fix:
Found that the publisher report generator Lambda was not running. Corrected the EventBridge schedule to trigger daily.

3. Lambda race condition

Issue:
The Slack posting Lambda sometimes read an older CSV file instead of the one just generated.

Fix:
Introduced a 10-minute delay between the generator Lambda and the Slack reporter Lambda using separate schedules.
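
A fixed delay works as long as the generator finishes within the gap. As a complementary guard that is not part of the pipeline described here, the notifier could also check that the newest CSV is recent before posting. A minimal sketch, assuming the object list comes from the `Contents` field of `s3.list_objects_v2` and that a one-hour freshness window is acceptable (both the function name and the window are assumptions):

```python
from datetime import datetime, timedelta, timezone


def pick_fresh_report(objects, now, max_age=timedelta(hours=1)):
    """Return the key of the newest .csv object, or None if it is older than max_age.

    `objects` is a list of dicts with "Key" and a timezone-aware
    "LastModified" datetime, as returned under "Contents" by list_objects_v2.
    """
    csvs = [obj for obj in objects if obj["Key"].endswith(".csv")]
    if not csvs:
        return None
    newest = max(csvs, key=lambda obj: obj["LastModified"])
    if now - newest["LastModified"] > max_age:
        return None  # stale: today's report has not landed yet
    return newest["Key"]
```

When this returns `None`, the notifier can skip posting or raise an error rather than silently re-sending yesterday's numbers. Note that `list_objects_v2` returns at most 1,000 keys per call, so a large prefix needs pagination before this check.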

4. Slack output was hard to read

Issue:
The publisher read counts in Slack were misaligned and difficult to follow.

Fix:
Wrapped the table in Slack-compatible triple backticks so it renders as a preformatted, monospaced block.
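
Alignment inside a preformatted block depends on padding each column to a fixed width. A minimal sketch of such a formatter follows; the function name, column headings, and title line are assumptions for illustration rather than the project's actual message format:

```python
def format_publisher_table(quarter, publisher_counts):
    """Render publisher read counts as an aligned table inside a Slack code block."""
    # Highest counts first; ties are broken alphabetically by publisher name
    rows = sorted(publisher_counts.items(), key=lambda item: (-item[1], item[0]))
    name_width = max([len("Publisher")] + [len(name) for name, _ in rows])
    count_width = max([len("Reads")] + [len(str(count)) for _, count in rows])
    lines = [
        f"{'Publisher':<{name_width}}  {'Reads':>{count_width}}",
        f"{'-' * name_width}  {'-' * count_width}",
    ]
    lines += [f"{name:<{name_width}}  {count:>{count_width}}" for name, count in rows]
    fence = "`" * 3  # Slack renders text between triple backticks in monospace
    return f"*Publisher reads for {quarter}*\n{fence}\n" + "\n".join(lines) + f"\n{fence}"
```

The returned string can be sent as the `text` field of a Slack incoming-webhook payload. Because every row is padded to the same width, the columns line up once Slack renders the block in a monospaced font.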

5. S3 bucket getting cluttered

Issue:
Temporary Athena result files were crowding the output bucket.

Fix:
Moved results to a 'temporary-athena-query-results/' folder prefix and added a lifecycle policy to auto-delete them after 3 days.
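
For reference, a lifecycle rule of this kind can be expressed as a small configuration document. The sketch below builds one and applies it with boto3; the rule ID and function names are hypothetical, and the project itself may define this rule elsewhere (for example in Terraform):

```python
def athena_temp_lifecycle(prefix="temporary-athena-query-results/", days=3):
    """Build an S3 lifecycle configuration that expires objects under `prefix`."""
    return {
        "Rules": [
            {
                "ID": "expire-temporary-athena-results",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_lifecycle(s3, bucket):
    """Apply the rule to `bucket`; `s3` is a boto3 S3 client.

    This call replaces any lifecycle configuration already on the bucket.
    """
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=athena_temp_lifecycle(),
    )
```

Scoping the rule to the prefix matters: the final reports live under other keys and are left untouched, while only the temporary Athena output is expired.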

Impact​

  • Eliminated all manual report generation
  • Improved team visibility into reader engagement
  • Ensured scalable, serverless infrastructure using AWS best practices
  • Automated alerts improved issue tracking and data consistency

This project was a great example of combining Athena, Lambda, S3, and EventBridge into a cost-efficient, automated reporting pipeline. If you're working with serverless data workflows, this pattern is easily adaptable to product analytics, user activity tracking, sales dashboards, and more.
