n8n daily email insight generator

Thread starter: Matheus D. Santos
n8n Daily Email Insight Generator: Real-Time Triage & Bright Data Enrichment


Submission for the AI Agents Challenge powered by n8n and Bright Data
Owner: Matheus Santos, Version 0.2
Timeline: Aug 24, 2025 → Aug 31, 2025
Status: Completed (prototype)

Overview


I built an n8n workflow that automatically ingests new emails, classifies and prioritizes them with an LLM (Gemini), enriches the results using Bright Data Marketplace datasets, and writes concise, actionable entries into a Google Sheet. The workflow runs on a schedule (three times a day) and produces an "email insights" sheet that teams can use for triage, tracking, and follow-up.

What I built


A production-style automation that:

  • pulls emails via IMAP,
  • cleans and normalizes message payloads,
  • classifies topic & priority with a chat model,
  • enriches classification with Bright Data Marketplace dataset lookups,
  • merges and finalizes the enriched output with an LLM pass,
  • and writes tasks, dataset evidence, and actions into Google Sheets (create/update/append).

Key requirement satisfied: the workflow uses n8n's AI/Chat model nodes and the Bright Data Verified Node(s), both as part of the enrichment toolchain.

Live demo & workflow

If you want to try it quickly: import the workflow JSON into your n8n instance, configure IMAP, Bright Data, Gemini, and Google Sheets credentials, then run a test email.

Why this is useful

  • Saves time by turning inbox noise into action items and evidence.
  • Surfaces business-critical mail (investment, billing, client requests) with a consistent triage policy.
  • Provides provenance: Bright Data dataset matches are logged so you can verify sender/company context.
  • Easily extensible: add more enrichment sources, task integrations (Asana/Trello), or dashboards.

Architecture (high level)


Code:
[Email Provider/IMAP Trigger]
        ↓
[Email Cleaning Function]  (extract snippet, from, to, date, subject)
        ↓
[Chat Model: Classify (Gemini)] → (validate & clean)
        ↓
[Branch A: Prepare Bright Data queries]
        ↓
[Bright Data: List Datasets / Marketplace Lookups] → [Clean results]
        ↓
[Merge: classification + enrichment]
        ↓
[Chat Model: Final enrichment / summarize (Gemini); may call Bright Data as a tool]
        ↓
[Final Cleaning function]
        ↓
[Chat Model β†’ Google Sheets op generator]
        ↓
[Google Sheets: Tasks, DatasetMatches, EmailActions]
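
As a rough illustration of the "Email Cleaning Function" box, here is a minimal Python sketch of that normalization step. It is not the workflow's actual node code: the input keys (text, html, from, to, subject, date) and the 500-character snippet length are assumptions made for the example, and only the output keys come from the post.

```python
import re
from html import unescape


def clean_email(raw: dict, snippet_len: int = 500) -> dict:
    """Normalize a raw email item into the fields the classifier needs.

    The input keys used here are assumed for illustration only.
    """
    # Prefer the plain-text body; otherwise fall back to HTML with tags removed.
    body = raw.get("text") or re.sub(r"<[^>]+>", " ", raw.get("html") or "")
    # Decode HTML entities and collapse whitespace runs into single spaces.
    body = re.sub(r"\s+", " ", unescape(body)).strip()
    return {
        "content": body[:snippet_len],  # short snippet keeps the prompt small
        "From": (raw.get("from") or "").strip(),
        "To": (raw.get("to") or "").strip(),
        "subject": (raw.get("subject") or "").strip(),
        "dates": raw.get("date") or "",
    }
```

Truncating the body to a snippet is one way to avoid the oversized prompts mentioned under Challenges & learnings.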

Technical implementation

  1. IMAP Email Trigger: watches the inbox for new messages.
  2. Function node (Email Content Cleaning): normalizes the raw payload to { content, From, To, subject, dates }.
  3. Chat model (Gemini), Categorize: the prompt classifies the email into one of the categories and assigns a priority (High / Medium / Low). The output is strict JSON, so downstream parsing is reliable.
  4. Output cleaning: a function removes code fences and extra characters from the model output.
  5. Decision & enrichment prep: an LLM checks whether the classification has enough signals; if not, it suggests Bright Data queries (domains, keywords).
  6. Bright Data Marketplace (list datasets): uses the suggested queries to retrieve datasets that validate or enrich the classification.
  7. Enrichment cleaning: a function strips unnecessary fields (IDs) and normalizes dataset objects.
  8. Merge: combines the classification JSON with the dataset results.
  9. Final summarization (Gemini): produces the final summary, recommended actions, and Google Sheets tasks.
  10. Google Sheets operations: a final chat model node or function maps the final JSON to append/update rows in Tasks, DatasetMatches, and EmailActions.
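
To make the output cleaning in step 4 concrete, here is a small Python sketch of one way to strip fences from a model reply and validate the result. The function name, the exact checks, and the error handling are my assumptions for illustration; the workflow's own cleaning node may work differently.

```python
import json
import re

FENCE = "`" * 3  # Markdown code-fence marker, built here to avoid a literal fence
ALLOWED_PRIORITIES = {"High", "Medium", "Low"}


def parse_model_json(text: str) -> dict:
    """Strip Markdown fences and stray prose, then parse and validate the JSON."""
    # Drop fence markers, including an opening fence tagged "json".
    text = re.sub(FENCE + r"(?:json)?", "", text, flags=re.IGNORECASE)
    # Keep only the outermost {...} span so leading or trailing chatter is ignored.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(text[start:end + 1])
    # Reject replies whose priority is outside the three allowed levels.
    if data.get("priority") not in ALLOWED_PRIORITIES:
        raise ValueError(f"unexpected priority: {data.get('priority')!r}")
    return data


reply = FENCE + 'json\n{"category": "Investment", "priority": "High"}\n' + FENCE
print(parse_model_json(reply))
```

Failing loudly on a bad reply, instead of passing it downstream, keeps malformed rows out of the sheet.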

Example of what the workflow writes (example JSON)


Code:
{
  "category": "Investment",
  "priority": "High",
  "confidence": 0.92,
  "summary": "OLITEF 2025 registration closing soon: investment education program from Tesouro Direto; consider applying before deadline.",
  "reasons": ["sender domain: tesourodireto.com", "contains 'Inscrições' and deadline-like phrase"],
  "suggested_enrichments": [
    { "type": "domain_lookup", "query": "tesourodireto.com", "purpose": "verify official sender" },
    { "type": "web_search", "query": "OLITEF 2025 Tesouro Direto inscrições", "purpose": "confirm deadline" }
  ],
  "actions": [
    { "type": "label", "detail": "Investment,High", "urgency": "immediate" },
    { "type": "create_task", "detail": "Review OLITEF registration page", "urgency": "routine" }
  ]
}
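
For a sense of how an object like this could be turned into sheet rows, here is a hedged Python sketch. The column order, the message_id argument, and the rule that create_task actions go to Tasks while everything else goes to EmailActions are assumptions for illustration; the post does not specify the actual sheet layout.

```python
def to_sheet_rows(insight: dict, message_id: str) -> dict:
    """Split one insight object into rows keyed by target sheet name.

    The column order and the routing rule are illustrative assumptions.
    """
    rows = {"Tasks": [], "EmailActions": []}
    for action in insight.get("actions", []):
        row = [
            message_id,
            insight["category"],
            insight["priority"],
            action["type"],
            action["detail"],
            action["urgency"],
        ]
        # create_task actions become tasks; everything else is logged as an action.
        target = "Tasks" if action["type"] == "create_task" else "EmailActions"
        rows[target].append(row)
    return rows
```

Each list in the result could then be passed to a Google Sheets append operation for the matching tab.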

Bright Data usage (what I did)

  • Marketplace / list datasets: query for company profiles, news, and web snapshots that match inferred domains/keywords from emails.
  • Tool usage inside LLM: the LLM is allowed to request Bright Data as a verification tool for low-confidence or Investment-class emails.
  • Why it matters: Bright Data adds real-time web signals (company pages, public announcements) to reduce false positives and raise model confidence.
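
The enrichment trigger and the domain-first query order described above can be sketched in Python roughly like this. The 0.8 confidence threshold and both function names are assumptions of mine; the query shape mirrors the suggested_enrichments entries in the example JSON.

```python
from email.utils import parseaddr


def needs_enrichment(result: dict, threshold: float = 0.8) -> bool:
    """Enrich Investment-class emails and anything below the confidence threshold.

    The threshold value is an assumption, not a number from the workflow.
    """
    return (
        result.get("category") == "Investment"
        or result.get("confidence", 0.0) < threshold
    )


def build_queries(sender: str, keywords: list) -> list:
    """Build lookup queries, putting the sender domain first."""
    queries = []
    # parseaddr handles forms like "Name <user@example.com>".
    _, address = parseaddr(sender)
    domain = address.rpartition("@")[2].lower() if "@" in address else ""
    if domain:
        queries.append({"type": "domain_lookup", "query": domain})
    # Keyword searches follow the domain lookup.
    queries.extend({"type": "web_search", "query": kw} for kw in keywords)
    return queries
```

Putting the domain lookup first reflects the domain-first approach that worked best when filtering Marketplace datasets.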

How to reproduce / run locally

  1. Clone repo and import /workflows/n8n Email categorize automation.json.
  2. Add credentials: IMAP, Gemini (or other LLM), Bright Data (verified node), Google Sheets, Slack (optional).
  3. Adjust Priority Rules (JSON/Set node) to include your high-priority senders & keywords.
  4. Run the workflow manually or enable the schedule (every 8 hours → 3×/day) and test with sample messages.
  5. Export workflow JSON & Gist when submitting.
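
For step 3, here is a hypothetical Python sketch of what a priority rules object and its use might look like. The key names, the sample sender and keywords, and the choice to override the model's priority (as opposed to feeding the rules into the prompt) are all assumptions for illustration, not the workflow's actual configuration.

```python
# Hypothetical rules object; in n8n this would live in a JSON/Set node.
PRIORITY_RULES = {
    "high_priority_senders": ["tesourodireto.com"],
    "high_priority_keywords": ["invoice", "deadline"],
}


def apply_priority_rules(email: dict, model_priority: str,
                         rules: dict = PRIORITY_RULES) -> str:
    """Return "High" when a rule matches, otherwise keep the model's priority."""
    sender = email.get("From", "").lower()
    text = f"{email.get('subject', '')} {email.get('content', '')}".lower()
    # A sender rule matches when the configured domain appears in the From field.
    if any(s.lower() in sender for s in rules["high_priority_senders"]):
        return "High"
    # A keyword rule matches on a case-insensitive substring of subject or body.
    if any(k.lower() in text for k in rules["high_priority_keywords"]):
        return "High"
    return model_priority
```

Substring matching is deliberately simple here; stricter matching (exact domains, whole words) would reduce accidental promotions.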

Challenges & learnings

  • Tuning prompts is critical: very large prompts or mixed content sometimes returned noisy output; I solved that by enforcing strict JSON outputs and adding cleaning nodes.
  • Bright Data Marketplace required trial and exploration: choosing the right dataset filters matters (a domain-first approach helps).
  • Doing this challenge was incredible. It was my first time using n8n and Bright Data, and I couldn't be more impressed with the power of these tools together. One challenge was the Chat Model nodes: sometimes I asked for too much or sent a big prompt, and they didn't return useful insights or output. Another was learning Bright Data; as I said, I had never used it before, so I didn't know what to do at first. :) I am really happy with my project. Of course there are things to improve, but it is a great start. I hope this project helps someone else, so feel free to reach out and talk about it. I love to learn, and this is a great topic to talk about!

Links & contact

  • Repo: repository
  • Contact: Matheus Santos. Feel free to reach out with questions or for collaboration!
