cutieyunny-tech
This is a submission for the Google AI Studio Multimodal Challenge.
What I Built
I built "Whispurr: The Ghost Diner," an interactive mini-game that leverages the multimodal capabilities of the Gemini API. The project aims to create a dynamic narrative experience where the story changes based on the player's actions. The player controls a ghost that must navigate between different zones, with each zone triggering a unique AI response.
Demo

How I Used Google AI Studio
I used Google AI Studio to interface with the Gemini API, which acts as the "brain" behind the game. Instead of following a static, pre-written script, Gemini generates narratives and hints in real time. As a result, each playthrough is different, which shows how generative AI can drive dynamic content creation.
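As a rough illustration of generating narrative text on demand rather than reading it from a script, the sketch below builds and sends a `generateContent` request to the Gemini REST endpoint using only Python's standard library. The prompt wording, the function names, and the choice of Python are assumptions made for illustration; the game's actual code is not shown in this post.

```python
import json
import urllib.request

# Model name taken from the post; the URL is the public REST endpoint for generateContent.
MODEL = "gemini-2.5-flash"
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:generateContent"
)


def build_narrative_request(player_event: str) -> dict:
    """Build a request body asking for a short narrative beat (hypothetical prompt)."""
    prompt = (
        "You are the narrator of a ghost-diner mini-game. "
        "In two sentences, describe what happens next. "
        f"Player action: {player_event}"
    )
    return {"contents": [{"parts": [{"text": prompt}]}]}


def generate_narrative(player_event: str, api_key: str) -> str:
    """Send the request and return the first candidate's text (requires network access)."""
    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_narrative_request(player_event)).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["candidates"][0]["content"]["parts"][0]["text"]
```

Calling `generate_narrative("the ghost drifts toward the counter", api_key)` with a valid key would return a freshly generated passage, so two runs with the same input can yield different text.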
Multimodal Features
I used the multimodal capabilities of Gemini 2.5 Flash to create a richer user experience.
- Text & Location-Based Narrative Understanding:
  - When the player enters the Playful Zone, the AI generates a cheerful, humorous narrative, sometimes including reactions from patient "customers."
  - When the player moves to the Dark Zone, the AI generates a scary narrative with mysterious whispers, completely changing the game's atmosphere. This demonstrates Gemini's ability to adapt tone and context based on user input.
- Image Understanding for Hints:
  - In the Hint Zone, I demonstrate multimodal capability by sending both text and an image to Gemini. Although the image is simulated, the process shows how the AI can "see" a visual input (e.g., "a key on a table") and generate a relevant narrative hint to help the player progress. This makes the interaction more meaningful and more closely tied to the in-game visuals.
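The zone behaviour described above could be sketched as a lookup from zone to tone instruction, with an image attached only for the Hint Zone. This is a minimal sketch under stated assumptions: the zone keys, the instruction wording, and the helper name `build_zone_request` are hypothetical, and only the `inline_data` part layout follows the Gemini REST request format.

```python
import base64
from typing import Optional

# Hypothetical zone keys and tone instructions; the game's real prompts are not shown in the post.
ZONE_TONES = {
    "playful": "Write a cheerful, humorous two-sentence scene, "
               "optionally with a diner customer's reaction.",
    "dark": "Write a scary two-sentence scene with mysterious whispers.",
    "hint": "Look at the attached image and give a one-sentence hint "
            "that helps the player progress.",
}


def build_zone_request(
    zone: str,
    image_bytes: Optional[bytes] = None,
    mime_type: str = "image/png",
) -> dict:
    """Build a generateContent body for a zone; attach an image only when one is given."""
    if zone not in ZONE_TONES:
        raise ValueError(f"unknown zone: {zone}")
    parts = [{"text": ZONE_TONES[zone]}]
    if image_bytes is not None:
        # Images travel as base64 text inside an inline_data part.
        parts.append({
            "inline_data": {
                "mime_type": mime_type,
                "data": base64.b64encode(image_bytes).decode("ascii"),
            }
        })
    return {"contents": [{"parts": parts}]}


# Text-only request for the Dark Zone, and a text-plus-image request for the Hint Zone.
dark_request = build_zone_request("dark")
hint_request = build_zone_request("hint", image_bytes=b"placeholder image bytes")
```

Keeping the tone instructions in one table means adding a new zone is a one-line change, and the same request builder serves both the text-only and the multimodal case.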