Would You Rather: A 24/7 AI-Driven Twitch Game
I built a Twitch channel that runs a Would You Rather game show by itself. A local LLM generates the questions, TTS reads them out, viewers vote in chat, and an OBS overlay updates in real time. It runs unattended, around the clock, with no input from me.
01 The Idea
I wanted a Twitch channel that did not depend on me being awake, on camera, or even in the country. The format had to be simple enough that an LLM could generate it indefinitely without going stale. The delivery had to be hands-off, so the TTS reads the show, the OBS overlay renders the visuals, and the chat does the rest.
Would You Rather fits. The structure is rigid (two options, viewers pick one), the content is open-ended (the LLM can invent new categories forever), and the loop is short enough that the model can keep up. A question is generated, narrated, voted on, scored, and then the next one starts.
02 How It Works
The whole thing runs in one Python process, with a clean separation between the components.
Question generation. A scheduler rotates between three hand-written content categories (CursedSuperpowers, MoralGrey, UKChaos), each with its own prompt template. The local LLM (Ollama, any small model will do) gets a prompt asking for two options of comparable weight, with a short rationale that can be read aloud if the question needs context. If the LLM times out or returns malformed JSON, the scheduler falls back to a hard-coded bank of pre-written questions, so the show never goes silent.
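The fallback path described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `a`/`b` JSON keys, the `FALLBACK_BANK` contents, the 10-second timeout, and the `generate` callable (standing in for the Ollama call) are all assumptions.

```python
import asyncio
import json
import random

# Hypothetical fallback bank; a real one would hold many more entries.
FALLBACK_BANK = [
    {"a": "Always speak in rhyme", "b": "Always speak in questions"},
    {"a": "Never use a phone again", "b": "Never use a car again"},
]


def parse_question(raw: str) -> dict:
    """Parse the model's reply; raise ValueError if it is unusable."""
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    a, b = data.get("a"), data.get("b")
    if not (isinstance(a, str) and a.strip() and isinstance(b, str) and b.strip()):
        raise ValueError("missing option text")
    return {"a": a.strip(), "b": b.strip()}


async def next_question(generate, timeout_s: float = 10.0) -> dict:
    """Try the LLM once; on timeout or bad output, use the fallback bank.

    `generate` is any zero-argument coroutine function that returns the
    raw reply text (assumed here to wrap the Ollama request).
    """
    try:
        raw = await asyncio.wait_for(generate(), timeout=timeout_s)
        return parse_question(raw)
    except (asyncio.TimeoutError, ValueError):
        return random.choice(FALLBACK_BANK)
```

Because the caller always gets a question dict back, the rest of the show never has to know whether the model or the bank produced it.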
TTS. Once a question is locked in, the question text (and optional rationale) goes to the TTS client. The default is Pocket TTS, with Soprano as an alternative. The audio is written to output/tts/latest.wav, and a fresh question always overwrites the file, so any OBS audio source pointed at it gets a clean "new content" cue.
Chat voting. A Twitch bot reads the channel chat. While a question is live, it accepts votes as A/B, 1/2, or left/right, with a per-user lock so people cannot double-vote. The voting window is configurable and defaults to 90 seconds, which is roughly the window most "interactive" streamers aim for. When the window closes, the bot announces the winner in chat and the show moves on.
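A minimal sketch of the vote parsing and per-user lock, independent of any Twitch library. The class name matches the VoteTracker mentioned later in the post, but the method names and the rules here (a vote must be the whole message, and the first vote wins) are assumptions for illustration.

```python
from collections import Counter
from typing import Optional

# Accepted spellings, per the description above: A/B, 1/2, left/right.
ALIASES = {"a": "A", "1": "A", "left": "A", "b": "B", "2": "B", "right": "B"}


class VoteTracker:
    def __init__(self) -> None:
        self.votes = {}  # lowercased username -> "A" or "B"

    def handle_message(self, user: str, text: str) -> bool:
        """Record a vote; return True only if it was counted."""
        choice = ALIASES.get(text.strip().lower())
        if choice is None:
            return False  # ordinary chat message, not a vote
        key = user.lower()
        if key in self.votes:
            return False  # per-user lock: the first vote stands
        self.votes[key] = choice
        return True

    def tally(self) -> Counter:
        counts = Counter({"A": 0, "B": 0})
        counts.update(self.votes.values())
        return counts

    def winner(self) -> Optional[str]:
        """Return "A" or "B", or None on a tie."""
        counts = self.tally()
        if counts["A"] == counts["B"]:
            return None
        return "A" if counts["A"] > counts["B"] else "B"
```

A fresh tracker per question keeps the lock scoped to one voting window.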
OBS overlay. A small aiohttp web server runs alongside the bot and serves a single page that reads output/state.json on every refresh. The page is a 50/50 split: option A on the left, option B on the right, with live vote tallies, a question counter, and a "now playing" indicator. Drop it into OBS as a 1920x1080 browser source and you have a full-screen layout. Local voting is also supported: a /vote?choice=A&user=local endpoint lets you test without Twitch.
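To make the overlay contract concrete, here is one possible shape for the state.json payload. The field names and the percentage rounding are hypothetical, since the post does not show the real schema; the sketch only illustrates what a page needs in order to draw the split and the tallies.

```python
import json


def build_state(question_no: int, option_a: str, option_b: str,
                votes_a: int, votes_b: int, phase: str) -> dict:
    """Build a JSON-serialisable snapshot for the overlay (assumed schema)."""
    total = votes_a + votes_b
    # With no votes yet, show an even split rather than dividing by zero.
    pct_a = round(100 * votes_a / total) if total else 50
    return {
        "question_no": question_no,
        "phase": phase,
        "a": {"text": option_a, "votes": votes_a, "pct": pct_a},
        "b": {"text": option_b, "votes": votes_b, "pct": 100 - pct_a},
    }


snapshot = build_state(7, "Fly", "Swim", votes_a=1, votes_b=2, phase="voting")
payload = json.dumps(snapshot, indent=2)  # text that would go into state.json
```

Deriving the second percentage from the first keeps the two bars summing to exactly 100.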
03 The Architecture
Twelve small Python modules, each with one job. No 800-line catch-all file. No mystery state. If something breaks, you can read the relevant module and figure it out in a few minutes.
main.py: wires everything together and runs the asyncio event loop
config.py: loads env vars into a typed dataclass, no magic strings elsewhere
ollama_client.py: wraps the local Ollama server with timeouts and retries
question_bank.py: mixes LLM-generated questions with the fallback bank
scheduler.py: decides the next category, the next question, and when to switch
show_controller.py: owns the show state machine (idle, asking, voting, scoring, transitioning)
state_store.py: writes state.json and history.jsonl so the overlay and any external tools can see what is happening
tts_client.py: renders audio, writes to disk, retries on failure
twitch_bot.py and VoteTracker: parse chat and tally votes with a per-user lock
web_server.py: serves the OBS overlay and the local vote endpoint
scene_renderer.py: builds the overlay HTML from the current question state
app/categories/: holds the three category modules, each in its own file so you can add a new category by dropping in a new file
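As an illustration of the config.py approach, a typed dataclass loaded from the environment might look like this. The variable names and fields are invented for the example; only the 90-second default and Ollama's standard local port (11434) come from known facts.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    twitch_channel: str
    ollama_url: str
    vote_window_s: int
    tts_engine: str


def load_config(env=os.environ) -> Config:
    """Read settings once at startup; env var names are hypothetical."""
    return Config(
        twitch_channel=env.get("TWITCH_CHANNEL", ""),
        ollama_url=env.get("OLLAMA_URL", "http://localhost:11434"),
        vote_window_s=int(env.get("VOTE_WINDOW_S", "90")),
        tts_engine=env.get("TTS_ENGINE", "pocket"),
    )
```

Passing a plain dict as `env` makes the loader easy to exercise in tests without touching the real environment.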
The whole project is about 2,000 lines of Python. The state machine in show_controller.py is the heart of it. Once that is right, the rest is plumbing.
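A stripped-down sketch of such a state machine, using the five phases named above. The strictly linear transition order is an assumption, as is everything beyond the phase names; the real show_controller.py presumably also handles timers and side effects.

```python
from enum import Enum


class Phase(Enum):
    IDLE = "idle"
    ASKING = "asking"
    VOTING = "voting"
    SCORING = "scoring"
    TRANSITIONING = "transitioning"


# Assumed linear loop: after a transition, the next question is asked.
NEXT_PHASE = {
    Phase.IDLE: Phase.ASKING,
    Phase.ASKING: Phase.VOTING,
    Phase.VOTING: Phase.SCORING,
    Phase.SCORING: Phase.TRANSITIONING,
    Phase.TRANSITIONING: Phase.ASKING,
}


class ShowController:
    def __init__(self) -> None:
        self.phase = Phase.IDLE
        self.question_no = 0

    def advance(self) -> Phase:
        """Move to the next phase; count a new question on entering ASKING."""
        self.phase = NEXT_PHASE[self.phase]
        if self.phase is Phase.ASKING:
            self.question_no += 1
        return self.phase
```

Keeping the transitions in one table means the legal order of the show can be read at a glance.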
04 What I Learned About LLMs Running On A Loop
The single most useful thing I built was the fallback question bank. The local model is fast enough that it can usually keep up, but it goes off the rails about once every forty questions. The output will be malformed JSON, or one of the options will be weirdly longer than the other, or the two options will be the same thing rephrased. Without a fallback, the show just stops.
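The two non-JSON failure modes can be screened with cheap heuristics before a question goes live. The sketch below is one way to do it, with made-up thresholds. Note that the character-level similarity check only catches near-identical wording, not genuine paraphrases, which would need something heavier.

```python
from difflib import SequenceMatcher


def question_problems(a: str, b: str, max_len_ratio: float = 2.0,
                      max_similarity: float = 0.8) -> list:
    """Return a list of reasons to reject the pair (empty list means OK)."""
    a, b = a.strip(), b.strip()
    if not a or not b:
        return ["empty option"]
    problems = []
    longer, shorter = max(len(a), len(b)), min(len(a), len(b))
    if longer / shorter > max_len_ratio:
        problems.append("unbalanced length")
    # ratio() is 1.0 for identical strings and 0.0 for nothing in common.
    if SequenceMatcher(None, a.lower(), b.lower()).ratio() > max_similarity:
        problems.append("options too similar")
    return problems
```

Any non-empty result would be treated the same as malformed JSON: discard the output and pull from the fallback bank.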
With a fallback, the show keeps going and the bad outputs are masked. I learned to treat the LLM as a content source, not as the source of truth. The state machine always knows what question is live, regardless of whether the model produced something usable.
The second thing: the categories matter more than the prompt. I wrote three category templates (CursedSuperpowers for absurd dilemmas, MoralGrey for ethical edge cases, UKChaos for culturally specific ones) and that gave the show three distinct flavours to cycle through. The same prompt without the category framing produces flat, generic questions. The category is the spice.
The third thing: keep the prompt small. My first version asked the model for rationale, alternative phrasings, and difficulty scores. The output got long, the TTS got long, and the show got boring. Stripping the prompt down to "give me two options, keep them the same length, do not explain" was the best change I made.
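Putting the last two lessons together, a prompt builder can be as small as a category framing plus the one-line instruction. The category names are from this post; the framing strings, the prompt wording, and the JSON reply shape are placeholders, not the real templates.

```python
from itertools import cycle

# Framings paraphrase the category descriptions above; the real templates
# are hand-written and presumably longer.
CATEGORY_FRAMING = {
    "CursedSuperpowers": "absurd dilemmas",
    "MoralGrey": "ethical edge cases",
    "UKChaos": "culturally specific UK dilemmas",
}

BASE_INSTRUCTION = (
    "Give me two options, keep them the same length, do not explain. "
    'Reply as JSON: {"a": "...", "b": "..."}'
)


def build_prompt(category: str) -> str:
    """Combine the category framing with the short base instruction."""
    framing = CATEGORY_FRAMING[category]
    return f"Write a Would You Rather question. Theme: {framing}. {BASE_INSTRUCTION}"


# Simple round-robin over the categories, in definition order.
category_rotation = cycle(CATEGORY_FRAMING)
```

Calling `build_prompt(next(category_rotation))` once per question gives the three flavours in turn.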
05 The OBS Side
Most of the work on the production side was getting OBS to behave. The browser source polls state.json every second or so, which is fine for a vote tally but a bit slow for the "new question just dropped" transition. I added a small WebSocket-style push from the show controller to the overlay for the moment a new question goes live, so the visual transition is instant.
The audio side is just a Media Source in OBS pointed at output/tts/latest.wav. The TTS client writes the file atomically (write to a temp file, then rename) so OBS never reads a half-written file. That single trick fixed every audio glitch I had.
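The write-then-rename trick looks roughly like this in Python. It is a generic sketch rather than the project's tts_client.py; the function name is made up. The temp file is created in the destination directory so the rename stays on one filesystem.

```python
import os
import tempfile


def write_atomic(path: str, data: bytes) -> None:
    """Write data to path so readers see the old file or the new one, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure bytes hit disk before the swap
        os.replace(tmp_path, path)  # atomic rename over the old file
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

One caveat worth testing on your own setup: on Windows, `os.replace` can raise PermissionError if another process holds the destination file open at that moment, so a short retry may be needed there.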
I run the bot on a small box with OBS Studio in windowed mode, and I push the OBS output to a Twitch account via the standard streaming flow. The whole rig pulls about 80 watts, runs cool, and stays up for weeks at a time.
06 The Honest Limits
The model is small. Questions are short. The TTS is local, which means the voices are clear but not the polished "streamer" sound you would get from ElevenLabs. If you are trying to compete with a fully produced stream, this is not it. If you are trying to run a low-key, always-on channel that does not depend on a human, this is fine.
Chat voting depends on Twitch being up, the bot token being valid, and viewers actually being in chat. On a low-traffic channel the show is mostly the LLM and TTS going through the motions, with votes only when somebody wanders in. That is the design. The show does not need viewers to keep running, only to count votes.
The category set is hard-coded to three flavours. Adding a fourth is a one-file change, but I have not bothered. Three is enough to keep the show from feeling repetitive on a long run.
07 Why I Built It
Because I wanted to see if I could. The answer turned out to be yes, and the codebase is small enough that I can come back to it in six months and still understand it. That is the bar I try to hold myself to: if I cannot read my own project after a long break, I have overbuilt it.
The bonus is that every piece of this is reusable. The state machine works for any "show with phases" loop, the TTS integration is portable, the Twitch bot is a clean module you can drop into anything that needs chat interaction, and the OBS overlay pattern (state.json + browser source) is a useful default for any always-on visual. I have used pieces of this in other projects since.
If you want a similar always-on system for a different idea (a daily quiz, a poll-of-the-day, a creative-prompt stream, anything with a loop), I can build it. The pattern is general. Tell me what you want to run unattended and I will tell you what is realistic.