Would You Rather: A 24/7 AI-Driven Twitch Game

01 The Idea

I wanted a Twitch channel that did not depend on me being awake, on camera, or even in the country. The format had to be simple enough that an LLM could generate it indefinitely without going stale. The delivery had to be hands-off, so the TTS reads the show, the OBS overlay renders the visuals, and the chat does the rest.

Would You Rather fits. The structure is rigid (two options, viewers pick one), the content is open-ended (the LLM can invent new categories forever), and the loop is short enough that the model can keep up. A question is generated, narrated, voted on, scored, and then the next one starts.

02 How It Works

The whole thing runs in a single Python process, with a clean separation between the components.

Question generation. A scheduler rotates between three hand-written content categories (CursedSuperpowers, MoralGrey, UKChaos), each with its own prompt template. The local LLM (Ollama, any small model will do) gets a prompt asking for two options of comparable weight, with a short rationale that can be read aloud if the question needs context. If the LLM times out or returns malformed JSON, the scheduler falls back to a hard-coded bank of pre-written questions, so the show never goes silent.
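The fallback path can be sketched roughly like this. It is a minimal illustration, not the project's actual code: the `option_a`/`option_b` keys, the `next_question` helper, the 10-second timeout, and the contents of `FALLBACK_BANK` are all assumptions made for the example.

```python
import asyncio
import json
import random

# Hypothetical fallback bank; the real project keeps its own pre-written questions.
FALLBACK_BANK = [
    {"option_a": "Always speak in rhyme", "option_b": "Always sing instead of talking"},
    {"option_a": "Never queue again", "option_b": "Never hit traffic again"},
]


def parse_question(raw: str) -> dict:
    """Parse the model's reply; raise ValueError if it is not usable."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key in ("option_a", "option_b"):  # assumed key names
        value = data.get(key)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"missing or empty {key}")
    return data


async def next_question(generate, timeout_s: float = 10.0) -> dict:
    """Try the LLM once; on timeout or bad output, use the fallback bank.

    `generate` is any coroutine function returning the model's raw text.
    """
    try:
        raw = await asyncio.wait_for(generate(), timeout=timeout_s)
        return {**parse_question(raw), "source": "llm"}
    except (asyncio.TimeoutError, ValueError):
        return {**random.choice(FALLBACK_BANK), "source": "fallback"}
```

Tagging each question with its `source` is optional, but it makes it easy to see later how often the fallback actually fired.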

TTS. Once a question is locked in, the question text (and optional rationale) goes to the TTS client. The default is Pocket TTS, with Soprano as an alternative. The audio is written to output/tts/latest.wav, and a fresh question always overwrites the file, so any OBS audio source pointed at it gets a clean "new content" cue.

Chat voting. A Twitch bot reads the channel chat. While a question is live, it accepts votes as A/B, 1/2, or left/right, with a per-user lock so people cannot double-vote. The voting window is configurable and defaults to 90 seconds, roughly the window length many interactive streams use. When the window closes, the bot announces the winner in chat and the show moves on.
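A vote tracker along those lines might look like the sketch below. The class name matches the one mentioned later in this post, but the method names, the alias table, and the "first vote wins" reading of the per-user lock are assumptions for illustration.

```python
from collections import Counter
from typing import Optional

# Accepted spellings from the text, mapped onto the two options (assumed mapping).
ALIASES = {"a": "A", "1": "A", "left": "A", "b": "B", "2": "B", "right": "B"}


class VoteTracker:
    def __init__(self) -> None:
        self.votes = {}  # lowercased username -> "A" or "B"

    def record(self, user: str, message: str) -> bool:
        """Return True if the message counted as a new vote."""
        choice = ALIASES.get(message.strip().lower())
        if choice is None:
            return False  # ordinary chat, not a vote
        key = user.lower()
        if key in self.votes:
            return False  # per-user lock: the first vote stands
        self.votes[key] = choice
        return True

    def tally(self) -> dict:
        counts = Counter(self.votes.values())
        return {"A": counts["A"], "B": counts["B"]}

    def winner(self) -> Optional[str]:
        """Return "A" or "B", or None on a tie."""
        totals = self.tally()
        if totals["A"] == totals["B"]:
            return None
        return "A" if totals["A"] > totals["B"] else "B"
```

Creating a fresh tracker for each question keeps the lock scoped to one voting window.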

OBS overlay. A small aiohttp web server runs alongside the bot and serves a single page that reads output/state.json on every refresh. The page is a 50/50 split: option A on the left, option B on the right, with live vote tallies, a question counter, and a "now playing" indicator. Drop it into OBS as a 1920x1080 browser source and it fills the whole scene. The server also exposes a /vote?choice=A&user=local endpoint, so you can cast votes locally and test the show without Twitch.
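The local vote endpoint could be wired up roughly as follows. This is a sketch under assumptions: the `handle_vote` and `build_app` names, the status codes, and the in-memory `votes` dict are hypothetical, and only the /vote path and its `choice` and `user` parameters come from the description above. The vote logic is kept as a plain function so it can be exercised without a running server.

```python
def handle_vote(query, votes: dict):
    """Validate a vote request and record it; returns (status, payload)."""
    choice = query.get("choice", "").upper()
    user = query.get("user", "local")
    if choice not in ("A", "B"):
        return 400, {"ok": False, "error": "choice must be A or B"}
    if user in votes:
        return 409, {"ok": False, "error": "already voted"}
    votes[user] = choice
    return 200, {"ok": True, "choice": choice}


def build_app(votes: dict):
    """Build an aiohttp application exposing GET /vote."""
    # Imported here so the logic above has no third-party dependency.
    from aiohttp import web

    async def vote(request: web.Request) -> web.Response:
        status, payload = handle_vote(request.query, votes)
        return web.json_response(payload, status=status)

    app = web.Application()
    app.router.add_get("/vote", vote)
    return app
```

Running it with `web.run_app(build_app({}), port=8080)` (the port is an arbitrary choice) and opening `/vote?choice=A&user=local` in a browser is enough to watch a vote land.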

03 The Architecture

Twelve small Python modules, each with one job. No 800-line catch-all file. No mystery state. If something breaks, you can read the relevant module and figure it out in a few minutes.

main.py wires everything together and runs the asyncio event loop
config.py loads env vars into a typed dataclass, no magic strings elsewhere
ollama_client.py wraps the local Ollama server with timeouts and retries
question_bank.py mixes LLM-generated questions with the fallback bank
scheduler.py decides the next category, the next question, and when to switch
show_controller.py owns the show state machine (idle, asking, voting, scoring, transitioning)
state_store.py writes state.json and history.jsonl so the overlay and any external tools can see what is happening
tts_client.py renders audio, writes to disk, retries on failure
twitch_bot.py and VoteTracker parse chat and tally votes with a per-user lock
web_server.py serves the OBS overlay and the local vote endpoint
scene_renderer.py builds the overlay HTML from the current question state
app/categories/ holds the three category modules, each in its own file so you can add a new category by dropping in a new file
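As an illustration of the config.py idea (env vars loaded once into a typed dataclass), here is a minimal sketch. The field names and environment variable names are invented for the example. The 90-second default and the Pocket TTS default come from earlier in this post, and 11434 is Ollama's standard local port.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    twitch_channel: str
    ollama_url: str
    vote_window_s: int
    tts_engine: str


def load_config(env=os.environ) -> Config:
    """Read settings once at startup; variable names here are hypothetical."""
    return Config(
        twitch_channel=env.get("TWITCH_CHANNEL", ""),
        ollama_url=env.get("OLLAMA_URL", "http://localhost:11434"),
        vote_window_s=int(env.get("VOTE_WINDOW_SECONDS", "90")),
        tts_engine=env.get("TTS_ENGINE", "pocket"),
    )
```

Passing `env` as a parameter means tests can hand in a plain dict instead of patching the real environment.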

The whole project is about 2,000 lines of Python. The state machine in show_controller.py is the heart of it. Once that is right, the rest is plumbing.
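A stripped-down version of such a state machine might look like this. The phase names come from the module list above. The strictly linear transition order and the `advance` method are assumptions, and the real controller presumably also handles timers and errors.

```python
from enum import Enum


class Phase(Enum):
    IDLE = "idle"
    ASKING = "asking"
    VOTING = "voting"
    SCORING = "scoring"
    TRANSITIONING = "transitioning"


# Assumed cycle: idle only at startup, then the four show phases repeat.
NEXT_PHASE = {
    Phase.IDLE: Phase.ASKING,
    Phase.ASKING: Phase.VOTING,
    Phase.VOTING: Phase.SCORING,
    Phase.SCORING: Phase.TRANSITIONING,
    Phase.TRANSITIONING: Phase.ASKING,
}


class ShowController:
    def __init__(self) -> None:
        self.phase = Phase.IDLE
        self.question_number = 0

    def advance(self) -> Phase:
        """Move to the next phase; entering ASKING starts a new question."""
        self.phase = NEXT_PHASE[self.phase]
        if self.phase is Phase.ASKING:
            self.question_number += 1
        return self.phase
```

Keeping the transitions in one table means an illegal jump (say, from asking straight to scoring) cannot happen by accident elsewhere in the code.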

04 What I Learned About LLMs Running On A Loop

The single most useful thing I built was the fallback question bank. The local model is fast enough that it can usually keep up, but it goes off the rails about once every forty questions. The output will be malformed JSON, or one of the options will be weirdly longer than the other, or the two options will be the same thing rephrased. Without a fallback, the show just stops.
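The last two failure modes can be caught with a cheap check before a question goes live. The sketch below is one possible approach, not the project's own validator: the `is_usable` name and both thresholds are assumptions, and `difflib` only measures surface similarity, so a rephrasing that shares few characters with the other option would still slip through.

```python
from difflib import SequenceMatcher


def is_usable(option_a: str, option_b: str,
              max_length_ratio: float = 2.0,
              max_similarity: float = 0.8) -> bool:
    """Reject empty, lopsided, or near-identical option pairs."""
    a, b = option_a.strip(), option_b.strip()
    if not a or not b:
        return False
    # One option much longer than the other reads badly on screen and in TTS.
    longer, shorter = max(len(a), len(b)), min(len(a), len(b))
    if longer / shorter > max_length_ratio:
        return False
    # Character-level similarity; 1.0 means the strings are identical.
    similarity = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return similarity <= max_similarity
```

A pair that fails the check can be treated exactly like malformed JSON: discard it and pull from the fallback bank.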

With a fallback, the show keeps going and the bad outputs are masked. I learned to treat the LLM as a content source, not as the source of truth. The state machine always knows what question is live, regardless of whether the model produced something usable.

The second thing: the categories matter more than the prompt. I wrote three category templates (CursedSuperpowers for absurd dilemmas, MoralGrey for ethical edge cases, UKChaos for culturally specific ones) and that gave the show three distinct flavours to cycle through. The same prompt without the category framing produces flat, generic questions. The category is the spice.
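One way to picture the category framing is a shared base instruction with a short per-category line in front of it. The category names are the real ones. The framing text, the base prompt wording, and the helper names are made up for this sketch.

```python
from itertools import cycle

# Assumed wording; the project's actual templates are not shown in this post.
BASE_PROMPT = (
    'Write one "Would You Rather" question as JSON with keys '
    "option_a and option_b. Keep both options about the same length."
)

CATEGORY_FRAMING = {
    "CursedSuperpowers": "Theme: superpowers that come with an absurd catch.",
    "MoralGrey": "Theme: ethical edge cases with no clean answer.",
    "UKChaos": "Theme: everyday British life going slightly wrong.",
}


def build_prompt(category: str) -> str:
    """Put the category framing in front of the shared instruction."""
    return f"{CATEGORY_FRAMING[category]}\n{BASE_PROMPT}"


def category_rotation():
    """Endless round-robin over the categories, in definition order."""
    return cycle(CATEGORY_FRAMING)
```

Calling `next()` on the rotation before each question gives the scheduler its category, and `build_prompt` turns that into the text sent to the model.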

The third thing: keep the prompt small. My first version asked the model for rationale, alternative phrasings, and difficulty scores. The output got long, the TTS got long, and the show got boring. Stripping the prompt down to "give me two options, keep them the same length, do not explain" was the best change I made.

05 The OBS Side

Most of the work on the production side was getting OBS to behave. The browser source polls state.json every second or so, which is fine for a vote tally but a bit slow for the "new question just dropped" transition. I added a small WebSocket-style push from the show controller to the overlay for the moment a new question goes live, so the visual transition is instant.

The audio side is just a Media Source in OBS pointed at output/tts/latest.wav. The TTS client writes the file atomically (write to a temp file, then rename) so OBS never reads a half-written file. That single trick fixed every audio glitch I had.
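The write-then-rename trick looks roughly like this in Python. The `write_atomic` name is hypothetical. The temp file is created in the same directory as the target so the final rename stays on one filesystem, and `os.replace` overwrites an existing file on both Windows and POSIX.

```python
import os
import tempfile
from pathlib import Path


def write_atomic(path, data: bytes) -> None:
    """Write data to a temp file, then swap it into place in one step."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk first
        os.replace(tmp_name, path)  # readers see the old file or the new one
    except BaseException:
        # Clean up the temp file if anything went wrong, then re-raise.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

One caveat worth knowing: on Windows the replace can raise `PermissionError` if another program has the target open at that instant, so a short retry around the call may be needed in practice.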

I run the bot on a small box with OBS Studio in a windowed mode, and I push the OBS output to a Twitch account via the standard streaming flow. The whole rig pulls about 80 watts, runs cool, and survives week-long uptime.

06 The Honest Limits

The model is small. Questions are short. The TTS is local, which means the voices are clear but not the polished "streamer" sound you would get from ElevenLabs. If you are trying to compete with a fully produced stream, this is not it. If you are trying to run a low-key, always-on channel that does not depend on a human, this is fine.

Chat voting depends on Twitch being up, the bot token being valid, and viewers actually being in chat. On a low-traffic channel the show is mostly the LLM and TTS going through the motions, with votes only when somebody wanders in. That is the design. The show keeps running without viewers; it only needs them for the votes.

The category set is hard-coded to three flavours. Adding a fourth is a one-file change, but I have not bothered. Three is enough to keep the show from feeling repetitive on a long run.

07 Why I Built It

Because I wanted to see if I could. The answer turned out to be yes, and the codebase is small enough that I can come back to it in six months and still understand it. That is the bar I try to hold myself to: if I cannot read my own project after a long break, I have overbuilt it.

The bonus is that every piece of this is reusable. The state machine works for any "show with phases" loop, the TTS integration is portable, the Twitch bot is a clean module you can drop into anything that needs chat interaction, and the OBS overlay pattern (state.json + browser source) is a useful default for any always-on visual. I have used pieces of this in other projects since.

If you want a similar always-on system for a different idea (a daily quiz, a poll-of-the-day, a creative-prompt stream, anything with a loop), I can build it. The pattern is general. Tell me what you want to run unattended and I will tell you what is realistic.

08 Related: My Windows Python Workflow

This bot runs 24/7 on a Windows machine. The asyncio plumbing, the OBS integration, the always-on service management: it is all done with the same Python workflow I have refined across years of similar projects. I wrote it up as a guide for anyone running their own local services: AJTheDev Windows Python Workflow Guide.