IdeaBot: a YouTube-driven Viral Topic Generator.

Posted by Shah Pourazadi (Guest)
This is a submission for the AI Agents Challenge powered by n8n and Bright Data

What I Built


IdeaBot: a YouTube-driven Viral Topic Generator.
Give it any topic (e.g., "AI & Automation") and it will:

1) find trending YouTube results;
2) pull transcripts and top comments;
3) analyze patterns;
4) generate suggested titles, short-form ideas, mini-scripts (3–5 lines), and social post drafts.

Problem it solves: creators waste time guessing what will resonate. IdeaBot grounds ideation in what audiences are already engaging with (comments) and what's working now (recent videos), then turns that signal into publish-ready prompts and snippets.

Demo


Public Chat (n8n Chat Trigger): https://flow.wemakeflow.com/webhook/ac48b635-40d1-4c83-aa5a-fbf2cb5ba546/chat

n8n Workflow


Workflow JSON (Gist): https://gist.github.com/azadishahab/677424d5a84f570ebbf2fb83544119b6

Technical Implementation


Agent & Model Setup
Chat entrypoint: When chat message received.

Models: Google Gemini Chat Model nodes power the agents.

Google Gemini Chat Model1 → explicitly set to models/gemini-2.5-flash-lite (drives the URL-builder agent).

Google Gemini Chat Model → default Gemini chat model (drives the parsing/summarization/repurposing agents).

Agents (system instructions):

AI Agent1 – SERP URL Builder. Generates a Google video search URL:
https://www.google.com/search?q=<query>&tbm=vid&gl=<country>
<query> comes from the user prompt; <country> is a 2-letter country code (defaults to us if not specified).
Output contract: URL only (no extra text).
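For illustration, the same contract could be satisfied deterministically in an n8n Code node instead of an LLM call; a minimal sketch, assuming the incoming item carries topic and country fields (names not taken from the export):

```javascript
// Hypothetical Code-node stand-in for the URL Builder agent's contract.
// `topic` and `country` are assumed input field names.
const { topic, country } = $input.first().json;
const gl = (country ?? 'us').toLowerCase(); // gl defaults to us
const url = `https://www.google.com/search?q=${encodeURIComponent(topic ?? '')}&tbm=vid&gl=${gl}`;
return [{ json: { output: url } }]; // URL only, matching the agent's output contract
```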

AI Agent2 – SERP Result Parser. Input: raw SERP payload. Task: extract YouTube video URLs and return them as an array.
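If the LLM parser ever misbehaves, a regex over the raw SERP payload is a plausible deterministic fallback; a sketch (the exact payload shape returned by Bright Data is an assumption, so the whole item is stringified first):

```javascript
// Hypothetical non-LLM fallback for AI Agent2: regex-extract YouTube watch URLs.
// Stringify the whole payload so the regex sees every field, whatever the shape.
const raw = JSON.stringify($input.first().json);
const matches = raw.match(/https:\/\/www\.youtube\.com\/watch\?v=[\w-]{11}/g) ?? [];
return [{ json: { output: [...new Set(matches)] } }]; // de-duplicated URL array
```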

AI Agent – Transcript Summarizer. Input: video metadata/transcripts. Task: summarize each transcript into key notes for downstream repurposing.

AI Agent3 – Content Repurposer. Input: transcript summaries + high-signal comments. Task: generate new, original ideas (publish-ready JSON in the final responders).

Bright Data usage (nodes & flow)
Search (SERP):

Node: Access and extract data from a specific URL (Bright Data Verified).

serp_api1, country: us (default), url: {{$json.output}} (the URL built by AI Agent1).

Responds via Respond to Chat ("Done searching Google...") to keep the chat user informed.

Video transcripts & metadata (YouTube – Video Posts dataset):

Node: Extract structured data from a single URL2 → dataset "Youtube - Videos posts" (dataset_id: e.g., gd_lk56epmy2i5g7lzu0k).

Input URLs: {{ $('Respond to Chat1').item.json.output.toJsonString() }} (the array of video URLs extracted earlier).

Sort by views (desc), then likes (desc) → Limit to the top 2 videos.

Code node wraps those into { output: [{ url: ... }] } for consistent downstream shape.
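A minimal sketch of that shaping step, with Sort and Limit folded into the same Code node for readability (views, likes, and url are the dataset field names as used above):

```javascript
// Illustrative single Code node: sort videos, keep the top 2, and wrap them
// into the { output: [{ url }] } shape the downstream Bright Data node expects.
const videos = $input.all().map(i => i.json);
videos.sort((a, b) => (b.views - a.views) || (b.likes - a.likes));
const top = videos.slice(0, 2).map(v => ({ url: v.url }));
return [{ json: { output: top } }];
```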

YouTube comments (Comment Collector dataset) with polling:

Node: Extract structured data from a single URL1 → dataset "Youtube - Comments" (dataset_id: e.g., gd_lk9q0ew71spt1mxywf).

Snapshot polling loop: Edit Fields1 (capture snapshot_id) → Download the snapshot content → If status == "running" → Wait 6s → loop back to Download until done.
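Conceptually, the loop implements the polling pattern below, shown here as plain JavaScript for clarity; the snapshot endpoint and response shape are assumptions about Bright Data's datasets API, not values from the workflow export:

```javascript
// Conceptual equivalent of the Wait + If loop (the workflow builds this from nodes).
async function waitForSnapshot(snapshotId, apiToken) {
  for (;;) {
    const res = await fetch(
      `https://api.brightdata.com/datasets/v3/snapshot/${snapshotId}`, // assumed endpoint
      { headers: { Authorization: `Bearer ${apiToken}` } }
    );
    const body = await res.json();
    if (body.status !== 'running') return body; // snapshot finished (or failed)
    await new Promise(r => setTimeout(r, 6000)); // Wait 6s, then download again
  }
}
```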

Filter1: keep only comments with likes > 60 (noise reduction).

Aggregate: consolidate high-signal comment_text for analysis.
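The gate and the merge can also be expressed as one Code node; a sketch using the likes and comment_text fields named above:

```javascript
// Illustrative one-node version of Filter1 + Aggregate: keep high-signal
// comments (likes > 60) and collect their text for the analysis prompt.
const highSignal = $input.all()
  .map(i => i.json)
  .filter(c => (c.likes ?? 0) > 60)
  .map(c => c.comment_text);
return [{ json: { comment_text: highSignal } }];
```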

Data shaping & analysis pipeline
SERP URL Builder (AI Agent1) → Bright Data SERP fetch → AI Agent2 extracts an array of YouTube URLs → Respond to Chat1 acknowledges URL collection.

Video Posts dataset (transcripts/metadata) → Sort → Limit (2) → Code packaging → Respond to Chat3 (status update) → Comments dataset (with polling) → Filter1 (likes > 60) → Aggregate (comment texts).

Summarization branch: the Limit node also feeds AI Agent (Summarizer) to create concise transcript summaries.

Merge:

Aggregate1 collects summarizer outputs;

Aggregate (comments) merges via Merge → Aggregate2 (aggregateAllItemData) into a single payload.

Content generation: AI Agent3 (Repurposer) transforms summaries + comments into the final JSON ideas package.

Final reply: Respond to Chat2 returns the JSON object to the user.

Prompting & contracts (highlights from node configs)

URL Builder (Agent1): strict instruction to output only the correctly formed SERP URL with tbm=vid and default gl=us.

Parser (Agent2): extract YouTube URLs array from SERP results (no prose).

Summarizer: "Summarize the video transcription, keep all important notes ... used for content repurpose."

Repurposer (Agent3): "You are the Content Repurposer Agent... generate fresh, original content ideas based on video summaries + top comments."

Final schema: returned by Respond to Chat as a JSON payload (titles, short-form ideas, mini-scripts, post drafts).
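To make that contract concrete, a hypothetical response might look like the following (field names and values are illustrative; the authoritative schema lives in the Repurposer prompt):

```javascript
// Hypothetical example of the final ideas package returned to the chat user.
return [{ json: {
  titles: ["5 AI Automations Your Audience Keeps Asking For"],
  short_form_ideas: ["React to the most-liked comment and demo the fix in 30 seconds"],
  mini_scripts: [{
    idea: "Automation myths",
    lines: ["Hook: everyone gets this wrong...", "Point: ...", "CTA: ..."]
  }],
  post_drafts: ["Most creators guess what will land. Here's how to mine comments instead."]
} }];
```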

Memory / conversation behavior
Workflow uses Chat Trigger (public) with responseNodes mode and several Respond to Chat status messages.

There is no dedicated memory node in this export; each run is effectively stateless (refinements re-enter the flow). You can add an n8n Chat Memory Manager / Window Buffer Memory later if you want multi-turn refinement without re-scraping.

Notable safeguards & heuristics
Comment quality gate: likes > 60 to boost signal.

Top-video cap: Limit 2 (fast, token-efficient).

Polling loop: waits for Bright Data comment snapshots to complete before analysis.

Code shaping: wraps arrays into { output: [...] } so downstream Bright Data nodes accept uniform input.

Bright Data Verified Node


How it's used end-to-end:

SERP (video) fetch
Node: Access and extract data from a specific URL

serp_api1; gl defaults to us; URL pattern https://www.google.com/search?q=<query>&tbm=vid&gl=<country> generated upstream.

Output is handed to AI Agent2 to extract YouTube links.

Video Post (transcripts & metadata)
Node: Extract structured data from a single URL

Dataset: e.g., gd_lk56epmy2i5g7lzu0k ("Youtube - Videos posts")

Flow: Sort (views, likes) → Limit top 2..5 → Code to produce {output:[{url:...}]} for downstream.

Comment Collector
Node: Extract structured data from a single URL

Dataset: e.g., gd_lk9q0ew71spt1mxywf ("Youtube - Comments")

Snapshot poll: Edit Fields → Wait → Download snapshot content → If (status == "running") loop back → else continue.

Quality: Filter (e.g., likes > 60) → Aggregate to merge comment text for analysis.

This pairing (SERP → Video Post → Comment Collector) yields fresh, structured inputs resilient to blocking, enabling reliable analysis and ideation.

Journey


Process:
Started from a clear target: ideas tied to real audience demand.

Built a prompt→URL Builder so users can stay free-form while the system enforces SERP correctness (tbm=vid, gl default).

Split data collection into videos (transcripts) and comments, then layered agents: summarize, pattern-find, repurpose, respond.

Challenges & Solutions:
SERP parsing reliability: Solved by chaining a Bright Data SERP fetch with an LLM Structured Output Parser to normalize video URLs.

Snapshot polling for comments: Implemented a Wait + If loop to poll until snapshot completion, then filtered by likes for signal.

Token/length limits: the Summarizer truncates transcripts and comments are filtered before aggregation; a minimal truncation guard is sketched after this list.

Keeping outputs actionable: A dedicated Repurposer prompt that forces new ideas to be inspired by (not copied from) summaries + comments, then formats to a strict JSON schema.
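The truncation guard mentioned above could be as small as a single Code node placed before the Summarizer; a sketch (MAX_CHARS and the transcript field name are assumptions, not values from the export):

```javascript
// Hypothetical pre-summarization guard: cap transcript length before the LLM call.
const MAX_CHARS = 12000; // illustrative budget, tune to the model's context window
const item = $input.first().json;
const transcript = (item.transcript ?? '').slice(0, MAX_CHARS);
return [{ json: { ...item, transcript } }];
```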

What I learned:
Enforcing tool contracts (I/O shapes per agent) makes multi-agent flows robust.

Bright Data's datasets + polling patterns are a clean fit for n8n; pairing them with lightweight LLM parsing yields dependable, real-time pipelines.

A small amount of structure (sorting by views/likes, comment like-thresholds) dramatically improves idea quality and virality potential.
