memorydial

I decided to clone a ship's computer.

Holly is the shipboard AI on Red Dwarf. IQ of 6,000, self-reported. Three million years of solitary runtime left him a bit off. He announces the crew is about to die with the same tone he'd use to read out a shopping list. Technically correct about most things. Practically useless about all of them.

The Scraping Problem

The plan was straightforward. Grab every Red Dwarf script from Seasons 1 through 8, extract Holly's dialogue, and use it to build a chatbot that captures his personality. Then add Norman Lovett's voice. Then a talking head. Baby steps toward something deeply unnecessary and therefore worth doing.

Step one: get the scripts. A fan site called ladyofthecake.com has transcripts for all 51 episodes. I wrote a Python scraper expecting clean, consistent formatting. I got the opposite.

Seasons 1 through 5 use colon format. `HOLLY: dialogue goes here`. Readable, parseable, fine. Then Season 6 switches to a centered screenplay layout with character names floating 20 spaces from the left margin, dialogue indented beneath. Seasons 7 and 8 use a third variation: the character name sits alone on its own line, dialogue indented on the next.

Three formats across 51 scripts, written by different fans over the course of a decade. No standard. No schema. Just people who loved a show typing it up however felt right.

So one parser became three. The scraper auto-detects which format each script uses by counting structural patterns: colons for the first style, leading whitespace for the screenplay layout, name-then-indent for the third. It picks the format with the highest match count and routes to the right parser. Simple heuristic, surprisingly reliable.
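Here's the shape of that heuristic, as a minimal sketch. The regexes and thresholds are illustrative stand-ins, not the exact ones in the scraper:

```python
import re

def detect_format(script_text: str) -> str:
    """Guess which of the three transcript formats a script uses.
    Patterns and thresholds here are illustrative, not the scraper's exact ones."""
    lines = script_text.splitlines()
    scores = {"colon": 0, "screenplay": 0, "name_above": 0}

    for i, line in enumerate(lines):
        # Style 1: "HOLLY: dialogue goes here"
        if re.match(r"^[A-Z][A-Za-z ]+:\s+\S", line):
            scores["colon"] += 1
        # Style 2: a character name floating deep in the left margin
        if re.match(r"^\s{15,}[A-Z][A-Z ]+\s*$", line):
            scores["screenplay"] += 1
        # Style 3: name alone on its own line, dialogue indented beneath
        if re.match(r"^[A-Z][A-Z ]+\s*$", line) and i + 1 < len(lines) \
                and re.match(r"^\s+\S", lines[i + 1]):
            scores["name_above"] += 1

    # Route to the parser with the highest match count
    return max(scores, key=scores.get)
```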

Violets in Spring

Halfway through building the scraper, I found a post on r/RedDwarf. Someone had already built a Holly interface. A talking head on a screen, wired into their home automation system. Holly announcing the weather, controlling the lights, telling you it's time for lunch. It looked great.

My first reaction was the predictable one. That sinking feeling when you realize your original idea wasn't.

Then I remembered Elizabeth Gilbert's argument in Big Magic: ideas are living things. They circulate, looking for collaborators. If you're not available, they move on. Scientists call it multiple discovery. Newton and Leibniz both arrived at calculus. Darwin and Wallace both landed on natural selection. Bolyai and Lobachevsky both cracked non-Euclidean geometry on separate continents. Farkas Bolyai put it well: “When the time is ripe for certain things, they appear at different places, in the manner of violets coming to light in early spring.”

Don't rage, don't spiral. Grieve efficiently. Then get moving.

In this case, there wasn't much to grieve. The Reddit Holly is a skin. A UI layer over existing home automation, with Holly's face and voice doing what Alexa already does. What I'm building is different: a personality extraction pipeline. 704 lines of dialogue run through parsers, structured for fine-tuning, aimed at capturing the specific way Holly thinks. Not what Holly looks like on a screen, but how he constructs a sentence, misreads a room, and delivers catastrophe like it's a weather report. The Reddit version is Holly's face. Mine is Holly's brain.

Two violets, same spring, different gardens.

What 704 Lines Teach You

Once the parsers worked, I ran the full extraction. 704 Holly lines. 186 conversation snippets with surrounding context, capturing what the crew said before and after Holly spoke, because timing matters as much as words when you're modeling deadpan.

The output lands in four formats. A structured JSON with full conversation context. A training-ready JSON for fine-tuning. A readable text file with Holly's lines marked by `>>>` arrows. And a raw file of just Holly, 704 lines of uninterrupted monotone genius.
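For the training-ready format, each record pairs the surrounding lines with Holly's reply. The field names below are my paraphrase, not the pipeline's actual schema:

```python
# One training record, roughly. Placeholder text stands in for real dialogue.
example_record = {
    "context": [
        {"speaker": "LISTER", "text": "<crew line before Holly speaks>"},
        {"speaker": "RIMMER", "text": "<crew line before Holly speaks>"},
    ],
    "response": {"speaker": "HOLLY", "text": "<Holly's line>"},
    "episode": "<series and episode>",
}
```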

After 704 lines in sequence, the personality crystallizes. Holly almost always opens with “Well” or “Look.” He delivers catastrophic news as if commenting on the weather. He gets things wrong in ways that are technically defensible. He occasionally says something profound, then immediately undermines it.

Comedic timing isn't in the words. It's in the gap between what the situation demands and what Holly actually delivers. A system prompt can capture vocabulary and sentence structure. Capturing the principle that Holly should always slightly mismatch the emotional register of the conversation, that's harder.

The Pipeline

From an ML perspective, this is a personality modeling data pipeline. Web scraping with automatic format detection. Structured extraction that preserves conversational context. Multiple output formats for different downstream uses.

The format detection is the interesting bit. Each parser handles its own quirks. The colon parser deals with mixed-case character names and multi-line dialogue that wraps without a new character tag. The screenplay parser counts leading whitespace to distinguish character names from stage directions. The name-above parser has to separate dialogue indentation from scene description blocks. Each one strips stage directions, normalizes character names against a known roster of 30+ characters, and cleans the text.
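For flavor, here's roughly what the colon parser does. A simplified sketch: the real roster has 30+ names and the cleanup is fussier:

```python
import re

# Simplified roster; the real pipeline normalizes against 30+ characters
KNOWN_CHARACTERS = {"HOLLY", "LISTER", "RIMMER", "CAT", "KRYTEN"}

def parse_colon_format(script_text: str):
    entries = []
    current = None
    for raw in script_text.splitlines():
        m = re.match(r"^([A-Za-z][A-Za-z ]+):\s*(.*)$", raw.strip())
        if m and m.group(1).upper() in KNOWN_CHARACTERS:
            if current:
                entries.append(current)
            current = {"speaker": m.group(1).upper(), "text": m.group(2)}
        elif raw.strip().startswith("["):
            continue  # stage direction on its own line, stripped
        elif current and raw.strip():
            # Wrapped dialogue continues the previous speaker's line
            current["text"] += " " + raw.strip()
    if current:
        entries.append(current)
    # Strip inline stage directions like [sighs]
    for entry in entries:
        entry["text"] = re.sub(r"\[.*?\]", "", entry["text"]).strip()
    return entries
```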

None of this is novel. But it's the kind of unglamorous data work that sits beneath every interesting model. The quality of what comes out depends entirely on the care put into extraction. Feed garbage dialogue into a personality model and you get a chatbot that sounds like a blender full of scripts.

What's Next

Phase 1 gave me clean data. Phase 2 is the personality prompt. I need to analyze those 704 lines for speech patterns, verbal tics, and the specific ways Holly constructs (and deconstructs) meaning. The goal is a system prompt that doesn't just sound like Holly but thinks like Holly, defaulting to gentle confusion and accidental wisdom.

Phase 3 is Norman Lovett's voice. Series 1, 2, and 8 are his Holly, the original deadpan, with Hattie Hayridge's run in between. A voice clone from those series should nail the flat, unimpressed delivery that makes Holly work.

Phase 4 is the talking head. An animated face that moves when Holly speaks. The visual component that turns a chatbot into a character.

I have no commercial reason to build this. Nobody asked for it. The entire project exists because I watched Red Dwarf as a kid in Ireland and Holly made me laugh harder than any other fictional computer. Now I have the tools to bring him to life, so I'm doing it.

Sometimes the best reason to build something is that it would be funny if it existed.

Holly would probably tell me I'm wasting my time. He'd be right, technically. And completely wrong in the way that matters.

The first morning it worked, I stood in my kitchen in Kitsilano holding a coffee I'd forgotten to drink. A voice I hadn't recorded was reading me the weather, the tides, and how many people were currently orbiting the Earth. It told me Liverpool play at noon, that I should bring a jacket, and that the sunset would hit 8:14 PM. Then it stopped. Forty-seven seconds of audio. I played it again.

I built a radio station for one listener. It runs on a Raspberry Pi, costs $1.29 a month, and broadcasts every morning at 7:00 AM to nobody but me.

The Problem With Smart Speakers

I used to ask Alexa for my morning briefing. If you've tried this, you know the drill. You get a news summary you didn't ask for, a weather report for the wrong part of the city, and a “fun fact” about pangolins. Optimized for engagement, not utility.

What I wanted was simple. Weather, tides, calendar. Delivered like a BBC Radio 4 newsreader who respects my time and doesn't try to make me smile. No jingles. No “and here's something interesting.” Just the facts, read well, in under two minutes.

So I built it.

The Stack Nobody Asked For

RaspberryFM is a Python script that wakes up via cron at 7:00 AM Pacific. It reaches out to a handful of sources, assembles a script, voices it, and emails me the transcript with a cost breakdown. The whole pipeline runs in about forty seconds.

Weather comes from Open-Meteo, which is free and accurate. Sunset times I calculate locally. Google Calendar events pull through the API. Sports fixtures for Liverpool, Leinster Rugby, and UFC numbered events get fetched and converted from GMT to Pacific.
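The Open-Meteo call is about as simple as weather gets. A sketch with roughly-Kitsilano coordinates; which daily variables I actually pull is a detail I'm glossing here:

```python
import requests

def fetch_weather():
    """Pull today's forecast from Open-Meteo (free, no API key)."""
    params = {
        "latitude": 49.27,   # Kitsilano, give or take
        "longitude": -123.17,
        "daily": "temperature_2m_max,temperature_2m_min,precipitation_sum",
        "timezone": "America/Vancouver",
    }
    r = requests.get("https://api.open-meteo.com/v1/forecast",
                     params=params, timeout=10)
    r.raise_for_status()
    daily = r.json()["daily"]
    return {
        "high": daily["temperature_2m_max"][0],
        "low": daily["temperature_2m_min"][0],
        "rain_mm": daily["precipitation_sum"][0],
    }
```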

Then there are the weird ones.

For tides, I scrape tideschart.com. For the space crew count, I scrape whoisinspace.com. Neither of these sites has an API. And this is where the project got interesting.

AI as a Parser

The conventional approach to web scraping is brittle. You inspect the page, find the CSS selectors, write a parser, and pray the site never changes its markup. I've written dozens of these. They break constantly. A single class name change and your pipeline dies at 6:58 AM while you're still asleep.

I tried something different. I feed the raw HTML to GPT-4o and ask it to extract what I need.

For tides, the prompt is straightforward: here's the HTML for Kitsilano Beach, tell me when the tide crosses 2 meters today and whether it's rising or falling. GPT reads the table, understands the context, and returns structured data. If the site changes its layout tomorrow, the model will still understand a tide table.
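The whole parser is one API call. A sketch, with the prompt paraphrased; JSON mode keeps the reply machine-readable:

```python
import json

from openai import OpenAI

client = OpenAI()

def parse_tides(raw_html: str) -> dict:
    """Ask GPT-4o to read the tide table instead of writing a CSS parser."""
    resp = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system",
             "content": "You read tide tables. Reply with JSON only."},
            {"role": "user",
             "content": ("Here is the HTML for Kitsilano Beach. When does the "
                         "tide cross 2 meters today, and is it rising or "
                         'falling? Reply as {"time": "HH:MM", "direction": '
                         '"rising" or "falling"}.\n\n' + raw_html)},
        ],
    )
    return json.loads(resp.choices[0].message.content)
```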

I originally used the WorldTides API for this. It was fine until I noticed the times were consistently fifteen minutes off from tideschart.com, the source I'd been trusting for years as a swimmer. Fifteen minutes matters when you're timing an ocean dip around a tidal window. So I cut the API, scraped the source I trusted, and let GPT parse it.

The space crew count works the same way. I send the HTML from whoisinspace.com to GPT-4o-mini and ask it to count the humans currently in orbit. This one is less reliable. The model sometimes miscounts, especially when the page lists crew members across multiple vehicles. I've accepted this. If the count is off by one, the briefing still works. And it's a good reminder that these models are probabilistic, not precise. Knowing where your system can afford to be wrong is half the engineering.

The insight here is small but useful: you don't always need an API. When a website is your source of truth, an LLM can parse it with a resilience that CSS selectors can't match. The model understands the meaning of the content, not just its position in the DOM.

Prompt Engineering as Tone Control

The script generation is where the personality lives. GPT-4o-mini writes a 1-2 minute radio script from the assembled data, and the prompt is mostly about what not to do.

No puns. No enthusiasm. No exclamation marks. No “here's something exciting.” The tone is BBC World Service at 6 AM: calm, precise, slightly dry. I spent more time tuning this prompt than writing the actual data pipeline. Getting an LLM to suppress its instinct toward cheerfulness is harder than getting it to be accurate.

The voice is OpenAI's TTS model using the “shimmer” voice, which has the right cadence for a news read. Clean, neutral, no vocal fry.
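Script and voice together are maybe fifteen lines. The prompt below captures the spirit of mine rather than the verbatim text:

```python
from openai import OpenAI

client = OpenAI()

# The constraints are the spirit of my prompt, not the verbatim wording.
TONE_PROMPT = (
    "Write a 1-2 minute radio news script from the data provided. "
    "Tone: BBC World Service at 6 AM. Calm, precise, slightly dry. "
    "No puns. No enthusiasm. No exclamation marks. "
    "Never say 'here's something exciting' or anything like it. "
    "Just the facts, read well."
)

def make_briefing(data_summary: str, out_path: str = "briefing.mp3") -> str:
    script = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": TONE_PROMPT},
            {"role": "user", "content": data_summary},
        ],
    ).choices[0].message.content

    # Voice the script with the "shimmer" voice
    speech = client.audio.speech.create(model="tts-1", voice="shimmer",
                                        input=script)
    speech.write_to_file(out_path)
    return script
```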

Graceful Degradation

The one architectural decision I'm proudest of is the failure handling. Every data source is wrapped in a try/except that returns a sensible default. If the weather API is down, you get “weather data unavailable.” If Google Calendar times out, the briefing skips your schedule. If every single source fails simultaneously, you still get a briefing. It'll be short and a bit empty, but it'll be there.
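The pattern is one tiny wrapper, applied everywhere. The fetcher names here are illustrative:

```python
def safe(fetch, default):
    """Wrap a data source so one failure can't kill the briefing."""
    try:
        return fetch()
    except Exception:
        return default

# Illustrative names; each real source gets a fallback it can live with.
weather  = safe(fetch_weather,  "Weather data unavailable.")
tides    = safe(fetch_tides,    "Tide data unavailable.")
schedule = safe(fetch_calendar, "")   # empty: the briefing just skips it
```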

The whole point of a morning briefing is reliability. It runs at 7:00 AM whether I'm awake to check on it or not. If one failure in one source kills the entire pipeline, the system is worse than useless because now I have to build monitoring for my morning alarm clock. Graceful degradation means I trust it and forget about it.

The Economics of One

The total cost is $0.043 per day. That's GPT-4o for the tide parsing, GPT-4o-mini for the crew count and script writing, and OpenAI TTS for the voice. Every transcript email includes a cost line so I can watch for drift.

$1.29 a month. Less than a coffee. For a personalized radio station that knows my calendar, my beach, and my teams.

Three things that stuck

First, LLMs are underused as parsers. Everyone's building chatbots and agents. Almost nobody is using these models to replace the fragile glue code that holds data pipelines together. A model that understands HTML semantically is more robust than a hundred lines of BeautifulSoup.

Second, prompt engineering is real engineering. The BBC tone didn't happen by accident. It took iteration, failed attempts, and a clear understanding of what the model defaults to when you don't constrain it. The prompt is a specification, and like any spec, precision matters.

Third, the best personal projects are ones you use every morning. RaspberryFM isn't a portfolio piece I built and abandoned. It's running right now. It ran this morning. It'll run tomorrow. That changes how you build it. You optimize for silence over noise, simplicity over cleverness, reliability over features.

The Pi sits on a shelf in my apartment, a small black box with a green light. Every morning at 7:00 AM it wakes up, looks at the world, and tells me what I need to know. Then it goes back to sleep.

I said I'd build it. Last post I wrote, “That's the next build.” The Eat Watch on my wrist worked, but it was deaf. I logged meals in MyFatnessPal, then tapped the same calories into the watch by hand. I was the middleware. Two systems, no bridge, me in the middle pressing buttons.

Then I built DogWatch.

DogWatch was supposed to be about counting dogs on the walk to daycare. It was. But it taught me plumbing. Data flowing from wrist to phone to server. A Garmin app that talked to Django. By the time the first walk synced, zero dogs and all, I had a pipeline.

If I could sync dog counts, I could sync calories.

The Build

The architecture is simple because the watch is stupid. On purpose.

Every five minutes, the Garmin sends one request to the server: give me today's numbers. The server checks what I've logged in MyFatnessPal, does the maths, and sends back three numbers. Goal. Consumed. Remaining.
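The server side is a single view. A sketch: the model and field names are hypothetical, only the three-number contract is real:

```python
# views.py -- the endpoint the watch polls every five minutes.
from datetime import date

from django.db.models import Sum
from django.http import JsonResponse

from .models import FoodLog  # hypothetical model behind MyFatnessPal

DAILY_GOAL = 2000  # illustrative budget

def today(request):
    consumed = (
        FoodLog.objects.filter(logged_on=date.today())
        .aggregate(total=Sum("calories"))["total"] or 0
    )
    return JsonResponse({
        "goal": DAILY_GOAL,
        "consumed": consumed,
        "remaining": DAILY_GOAL - consumed,
    })
```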

The watch stores nothing. Calculates nothing. Decides nothing. It asks one question and displays the answer. Green means eat. Red means stop.

When I log a burrito at lunch, the server knows within five minutes. I don't open anything. I glance at my wrist. The number moved.

Midnight comes, the count starts fresh, and the watch goes green again. The first morning it worked, I just stood there looking at it. A zero I hadn't typed.

Walker called it a fuel gauge. The gauge doesn't know how the engine works. It just reads the tank.

The Skin

Walker never built the Eat Watch. But he drew one. In The Hacker's Diet he mocked up a watch face: square, black, a red-bordered LCD screen with “Marinchip Eat Watch” in italic script across the top. It looked like a Casio from 1985. A “Turbo Digital” badge sat at the bottom like a maker's mark on a thing that never existed.

I wanted mine to look like that. The problem was shape. Walker drew a rectangle. Garmin makes circles. So I redrew it: same bezels, same script, same badge, bent around a round face. The LCD tan, the red border, the italic branding. All of it, just curved.

Now it sits on my wrist. Green text, “EAT,” the remaining calories underneath. A relic from a future that never shipped, finally running on real hardware.

The Arc

A calorie counter. Then a Garmin app. Then a system to connect them. Each build was the logical next step, each question a little harder than the last. Could I build something useful? Could I build for hardware? Could I wire it all together?

The answer kept being yes.

The calorie counter talks to the watch. Loop closed.

I look at my wrist. Green. I can eat.

Walker imagined this in 1991. He never had the watch. I do.

---

If you want to try this yourself:

FatWatch is a Garmin watch face that connects to MyFatnessPal. If there's enough interest I'll make both available. MyFatnessPal is the calorie counter that started all of this. You can read about it in the first post in this series.

The artwork came first. Lemmy Kilmister at a medieval loom, threading glowing data streams through wooden warp and weft, the Motörhead Snaggletooth materializing in the weave. I commissioned it before I'd written a line of code. Sometimes you name the thing before you build it, and the name tells you what it wants to be.

The name is a triple pun. Lemmy the software: a federated Reddit alternative. Lemmy the man: the Motörhead frontman who never compromised. And a loom: a machine that weaves raw threads into something you can use. I wanted to weave my own feed. On my own hardware. No algorithm deciding what I'd see.

I was spending too much time on Reddit. Not learning. Not connecting. Scrolling. The thumb moves, the feed refreshes, the dopamine drips. You close the app and can't name a single thing you read. This is the product working as designed. Reddit wants time on site. I wanted my time back.

The usual advice is discipline. Screen time limits. App blockers. Willpower. I've tried all of them. They treat the symptom. The problem is the feed itself: someone else decides what's in it, and their incentives aren't yours.

So I decided to build my own.

Lemmy is open source, federated, self-hostable. You can run your own instance, create your own communities, populate them however you want. My plan was simple. Spin up Lemmy on my Raspberry Pi. Write a Django bot that monitors RSS feeds, checks for new items, and auto-posts them to my communities. Lex Fridman drops a new episode, it appears in `/c/podcasts`. A blog I follow publishes, it lands in `/c/reading`. The feed fills itself. I scroll my own instance instead of Reddit, and everything in it is something I chose.

The bot was the easy part. Python, RSS parsing, the Lemmy API, a SQLite database to track what's been posted. Fifty lines of real logic. Django admin for managing feeds. The architecture fit on a napkin.
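The core loop really is about this small. A sketch: Lemmy's v3 API takes new posts at `/api/v3/post`, though the auth and community handling here are simplified:

```python
import sqlite3

import feedparser
import requests

LEMMY = "http://localhost:8536"  # the backend's default port

def run_once(feed_url: str, community_id: int, jwt: str):
    db = sqlite3.connect("posted.db")
    db.execute("CREATE TABLE IF NOT EXISTS posted (guid TEXT PRIMARY KEY)")

    for entry in feedparser.parse(feed_url).entries:
        guid = entry.get("id", entry.link)
        if db.execute("SELECT 1 FROM posted WHERE guid = ?",
                      (guid,)).fetchone():
            continue  # already posted, skip the duplicate
        requests.post(f"{LEMMY}/api/v3/post", json={
            "name": entry.title,
            "url": entry.link,
            "community_id": community_id,
            "auth": jwt,
        }, timeout=10).raise_for_status()
        db.execute("INSERT INTO posted (guid) VALUES (?)", (guid,))
        db.commit()
```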

The infrastructure was the hard part.

I tried the Asustor NAS first. ARM-based, 824 megabytes of RAM per container. I got all six Lemmy containers running: nginx, lemmy-ui, the Lemmy backend, Postgres, pictrs for images, postfix for email. Docker reported them all healthy. Green lights across the board.

Then I opened a browser. Connection timed out.

The nginx container was listening on port 8536 internally. Docker mapped it to 10633. The lemmy-ui was configured to talk to `lemmy.ml` with HTTPS enabled. I was running on `localhost` with no certificate. Three layers of misconfiguration, each one invisible until you traced the whole chain.

I tried port 80. Already in use. I tried 8080. Connection timed out. I ran `docker logs` on every container. All healthy. All running. All unreachable.

The NAS was BusyBox-based. Half the diagnostic tools I needed didn't exist. `free -h` threw a syntax error. I was debugging container networking on a system where I couldn't check memory usage. I didn't know how to edit the nginx config because I didn't know where Docker had mounted it. I didn't know vi commands. I was learning infrastructure in production, on hardware that wasn't designed for it.

Then I tried the Raspberry Pi. Fresh start. Pi 4, 8 gigs of RAM, Debian underneath. Installed Docker, installed Tailscale, installed byobu. Got Docker Compose running. Downloaded the official Lemmy docker-compose from GitHub.

The config file URL returned 404. The repo had restructured since the docs were written. The lemmy.hjson I needed didn't exist at the path the documentation specified.

I rebuilt the configs from the official install guide, cross-referencing three different sources. Got the containers up. Got nginx configured. Got the ports mapped. And then it was midnight and I had work in the morning and the browser still wasn't loading the login page.

The RSS bot, by contrast, took an afternoon.

Monitor feeds on a schedule. Parse new items. Check the database for duplicates. Post to the Lemmy API. Track what's been posted. Django admin to add and remove feeds. The workflow was four steps: monitor, compare, post, track.

I had the posting mechanism working before I had a Lemmy instance to post to. The irony wasn't lost on me. The application logic was trivial. The platform it needed to run on was the entire problem.

The vision went beyond RSS. I wanted to run Fabric prompts against incoming content. Daniel Miessler's library of 200+ AI analysis patterns, applied to every article that landed in my feed. Extract the wisdom from a long essay. Summarize a paper. Analyze the claims in an opinion piece. The feed wouldn't just collect content. It would process it.

A personal Lemmy instance, populated by bots, analyzed by AI, running on a Pi under my desk. My own social network for an audience of one. No ads, no algorithm, no engagement metrics. Just the things I chose to follow, filtered through the lenses I chose to apply.

The gap between “I'll just self-host it” and a working system is where most projects like this die. Not because the idea is wrong. Because the infrastructure asks more of you than the application does. Docker networking, reverse proxies, port mapping, ARM compatibility, config file archaeology. None of this is the thing you set out to build. All of it stands between you and the thing working.

I learned more about nginx in three nights than in ten years of professional work. I learned that Docker containers can all report healthy while nothing is reachable. I learned that documentation rots faster than code. I learned that vi starts in normal mode and you press `i` to type.

The loom isn't finished. The threads are there: the bot works, the containers run, the vision is clear. What's missing is the last stretch of infrastructure that turns six healthy containers into a page that loads in a browser. It's close. It's been close for a while.

Lemmy would approve. He never did anything the easy way either.

Father and son walking to daycare, counting dogs along the way

Every morning my son and I walk to daycare through quiet streets.

We have one rule, his rule: when we see a dog, we stop and we look.

Some days we see the man with three St. Bernards. A woman with two Bernese Mountain Dogs. Last Friday she only had one. We looked worried.

“Tiberius is sick today. He will be around next week.”

When a big dog appears, my son kicks his legs and looks at me with an expression that says are you seeing this.

We do not care for small dogs.

We started keeping count. Three or more was a good start to the day.

But the question we could not shake: which route has the most dogs?

The Grid

The grid: six streets north-south, four avenues east-west. Street E is the main road. We can only cross it at the lights. Everything else is fair game.

Twelve valid routes from home to daycare, each shaped by where we can cross.

We had no data.

I needed a way to pick a random route, count dogs as I walked, and see the results later. The Garmin on my wrist could do all of it.

The Build

He was napping.

I wanted a Garmin app that would tell me which way to go, count dogs with up and down buttons (the down button for false positives or dogs deemed too small), and sync to a dashboard on my laptop.

Garmin apps are written in Monkey C; the SDK's monkeydo tool runs them in the simulator. I chose Python and Django for the backend. The watch would sync through the Garmin Connect app.

The watch needed to pick a random route. We can only cross Street E at the lights on 4th and 2nd; that constraint is what shapes the twelve valid routes I mapped.

I worked on the spec and tried to catch the edge cases: what happens when there's no connection? How do I know which leg of the route I'm on?

I fed it into Claude Code's planning mode. Reviewed. Then enabled yolo mode and let it run.

The app worked in the simulator on the first try. I adjusted fonts and configured ngrok so data could flow from watch to server.

I synced a test walk. The terminal lit up. The dashboard updated.

One kilometer. Ten dogs. Ten seconds.

Test data. Ten dogs in ten seconds would be a miracle, a stampede, or the best day ever.

The Walk

I deployed it to the Raspberry Pi that weekend and tested the sync one more time. Then we went on with our weekend.

Monday morning. First real test.

My son in the stroller. Watch on my wrist showing Route 1. Left for 200 metres.

We walked. I watched for dogs. He pointed at trees, birds, a crack in the sidewalk.

No dogs.

Not one.

The entire route, start to finish, and not a single dog. No St. Bernards. No Tiberius. No doodles. Not even the small ones we ignore.

I ended the walk. The watch synced. The dashboard updated.

Total walks: one. Total dogs: zero.


I forgot the character for “horse” three times in one week.

Not a complex character. Not some obscure radical buried in a classical text. 马. Three strokes. One of the first hundred you learn. I'd studied it, reviewed it, written it out by hand. And still, on Wednesday morning, staring at a flashcard, my brain served up nothing. Just white space where a horse should have been.

This is the central problem of learning Mandarin as an adult. The characters don't stick. You learn thirty in a week, forget twenty by Friday. Anki helps, but Anki is a blunt instrument. It knows when to show you a card again. It doesn't know how you remembered it, or why you forgot it, or what confused you. It treats every character like every other character: a fact to be drilled until it survives.

The Mandarin Blueprint method works differently. Every character becomes a scene in a movie playing inside a building you already know. An actor performs an action with a prop in a specific room. 马 isn't an abstract shape. It's Sean Connery kicking a toy horse across the kitchen of your childhood home. The pinyin maps to the actor, the tone maps to the room, the meaning maps to the action. You don't memorize the character. You watch the scene and the character reassembles itself in your mind.

The problem: no existing SRS tool understood any of this.

Anki showed me flashcards. It had no concept of actors, sets, rooms, or props. I couldn't search for “every character where Brendan Lee is the actor” or “every story set in my parents' gaff.” The mnemonic system lived in my head and on Traverse.link. The review system lived in Anki. The two never talked to each other.

So I built the bridge.

Mandarin Scaffold runs on Django and HTMX. It holds 1,959 stories across 67 levels, each one a complete movie scene: the character, its pinyin, its keyword, the actor who performs the action, the set where it happens, the room within that set, the prop that carries the meaning, and the story text that stitches it all together. I scraped the stories from Traverse, parsed the JSON, mapped actors to phonetic components, mapped rooms to tones, and loaded the whole system into a SQLite database that fits in my pocket.
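The record behind each scene looks roughly like this. Field names are my paraphrase of the description above, not the actual schema:

```python
# models.py -- a sketch of the story record.
from django.db import models

class Story(models.Model):
    hanzi = models.CharField(max_length=8)       # the character itself
    pinyin = models.CharField(max_length=16)
    keyword = models.CharField(max_length=64)    # the English meaning hook
    actor = models.CharField(max_length=64)      # maps to the phonetic component
    film_set = models.CharField(max_length=64)   # the location of the scene
    room = models.CharField(max_length=64)       # maps to the tone
    prop = models.CharField(max_length=64)       # carries the meaning
    story_text = models.TextField()              # the scene that stitches it together
    level = models.PositiveSmallIntegerField()   # 1 through 67
```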

The heart of it is SM-2, the same spaced repetition algorithm that powers Anki. But wrapped in context. When I review a card, I don't just see the character and guess the meaning. I see the full scene: the actor, the set, the room, the prop. I can rate my recall from Again to Easy, and the algorithm adjusts the interval. A card I nail gets pushed out four days, then ten, then twenty-six. A card I fumble comes back in the same session.
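The update step fits in a dozen lines. A sketch of SM-2 with the four grading buttons mapped onto its 0-5 scale; the 4-day first interval is chosen to match the 4, 10, 26 progression above (classic SM-2 starts at 1 and 6):

```python
def sm2_update(grade: int, reps: int, interval: float, ease: float):
    """One SM-2 step. grade: 0=Again, 1=Hard, 2=Good, 3=Easy.
    Returns the new (reps, interval_in_days, ease)."""
    q = grade + 2                 # map the four buttons onto SM-2's 0-5 scale
    if q < 3:                     # Again: the card comes back this session
        return 0, 0.0, ease
    interval = 4.0 if reps == 0 else interval * ease
    # Standard SM-2 ease adjustment, floored at 1.3
    ease = max(1.3, ease + (0.1 - (5 - q) * (0.08 + (5 - q) * 0.02)))
    return reps + 1, round(interval, 1), ease
```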

The keyboard shortcuts make sessions fast. Space to reveal, 1 through 4 to grade, N for next, C for confused. No mouse. No clicking through menus. Just the character, the scene, and a number.

Then the features started accumulating. Sentence cards with cloze deletion: a full Chinese sentence with the target character blanked out, forcing recall in context rather than isolation. A deep dive mode where I can drill into a character, see similar characters that confuse me, generate example sentences through the OpenAI API, and save the ones that help. Level paragraphs that compose sentences using only the characters from a single level, so I can read something coherent instead of reviewing isolated fragments.

The AI integration started small. Claude Haiku generates example sentences for each character, three per hanzi, graded by difficulty. The system tracks token usage and cost per call, with hard limits, because I've seen what happens when you let an API run unmonitored.
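The metering is a running total and a hard stop. A sketch; the per-token rates are Claude 3 Haiku's published pricing at the time of writing, so check them before trusting the arithmetic:

```python
import anthropic

client = anthropic.Anthropic()
DAILY_LIMIT_USD = 0.50
spent_today = 0.0

def example_sentences(hanzi: str) -> str:
    global spent_today
    if spent_today >= DAILY_LIMIT_USD:
        raise RuntimeError("Hard limit hit: no more API calls today")
    msg = client.messages.create(
        model="claude-3-haiku-20240307",
        max_tokens=300,
        messages=[{"role": "user", "content":
                   f"Write three example sentences using {hanzi}, "
                   "graded easy, medium, hard."}],
    )
    # Track token usage and cost per call ($0.25/M in, $1.25/M out)
    cost = (msg.usage.input_tokens * 0.25e-6
            + msg.usage.output_tokens * 1.25e-6)
    spent_today += cost
    return msg.content[0].text
```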

589 hanzi have props now. 550 have the actual Mandarin Blueprint mnemonic props mapped. The data is getting dense enough that patterns emerge. I can see which actors I confuse, which sets share too many similar stories, which rooms need stronger visual anchors. The system is starting to teach me things about my own memory that I couldn't see from inside the learning process.

There's a statistics dashboard. Review history. Ease factor trends. Cards due by day. The kind of data that turns language learning from a feeling into a measurement.

I still forget characters. That hasn't changed. But now when I forget 马, I know why. The scene wasn't vivid enough. The prop was generic. The room was crowded with other characters that share the same actor. Scaffold shows me these patterns. It shows me the last time I reviewed, how I graded it, how the ease factor has drifted over time. Forgetting stops being a failure and starts being data.

Building your own learning tool is an odd kind of discipline. You spend as much time on the tool as on the thing you're learning. Some nights I'd catch myself optimizing a database query when I should have been reviewing level 14. But the tool compounds. Every hour I spend on Scaffold saves me minutes across thousands of future reviews. And the engineering itself reinforces the learning: I think about character structure while designing data models, about memory while writing algorithms, about cognition while building interfaces.

1,959 stories. 67 levels. One SQLite database. The characters are in there, waiting. Every morning I open the dashboard, see what's due, and start reviewing. The scenes play. The actors perform. The props do their work. And slowly, one review at a time, the characters stop being shapes and start being words.

The horse hasn't come back to haunt me in months.

I built MemoryDial in 2016 because the web kept lying to me.

Not maliciously. Pages just changed. A headline rewrote itself between morning coffee and lunch. A price shifted overnight. A paragraph vanished like it never existed. I wanted a system that would remember what the web looked like yesterday, even when the web itself refused to.

The concept was simple. You give it a URL and a CSS selector. It watches. Every five minutes, Celery workers fan out across fourteen websites, pull the HTML, parse it with lxml and BeautifulSoup, compare it to what they found last time. If something changed, the system logs it. I called these monitored items “Cogs” because that's what they were: small, reliable, mechanical things that turned without supervision.

Django 1.7 ran the first version. SQLite held the data. RabbitMQ kept the task queue honest. The whole thing fit on a single server and cost almost nothing to run.
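A Cog check, sketched. The model fields are assumptions; the fetch, select, compare, log flow is as described:

```python
# tasks.py -- one Cog turn, fired every five minutes by Celery.
import requests
from bs4 import BeautifulSoup
from celery import shared_task

from .models import Cog, ContentChange  # hypothetical models

@shared_task
def check_cog(cog_id: int):
    cog = Cog.objects.get(pk=cog_id)
    html = requests.get(cog.url, timeout=15).text
    element = BeautifulSoup(html, "lxml").select_one(cog.css_selector)
    # If the selector misses, element is None and the Cog simply goes
    # quiet -- exactly the silent failure mode described below.
    current = element.get_text(strip=True) if element else None

    if current is not None and current != cog.last_seen:
        ContentChange.objects.create(cog=cog, old=cog.last_seen, new=current)
        cog.last_seen = current
        cog.save()
```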

That was ten years ago.

The system is still running. Three thousand, seven hundred and ninety-eight content changes captured since I first deployed it. Nine user accounts. Fourteen monitored sites. The Cogs keep turning.

I've upgraded the Django version three times underneath it: 1.7 to 3.2, then to 5.2.5. Each migration was its own archaeology project. South migrations became Django migrations. Python 2 idioms became Python 3. Template tags changed syntax. But the core data survived every transition. Every content change from 2016 sits in the same SQLite database that holds yesterday's scrape results.

Two assumptions held up. The selectors didn't.

A Tuesday in 2019. I checked the dashboard and found a flatline. A news site had redesigned overnight, moved their headline from `div.story-header h1` to `article header .headline-text`. The Cog didn't error. It just went silent. The element it watched no longer existed at that address. Three days of changes, gone.

I cracked open DevTools, found the new selector, updated the config. Twenty minutes. The Cog started turning again.

Six weeks later the same site changed layouts. Same drill. DevTools, new selector, twenty minutes. Then another site redesigned. Then another. I spent more time fixing selectors than I spent building the original system.

This is the part where I'm supposed to say I solved the problem elegantly. I didn't. I just kept fixing them. For years. The monitoring worked because I was stubborn, not because the approach was sound.

But the experience left a mark.

Then GPT arrived. And I understood immediately what it meant for this kind of work.

A model doesn't need a CSS path. It reads a page the way a human does: by context, not by address. The headline is the headline because it reads like one, not because it sits inside a particular div. Point it at a page, describe what you want, and it extracts it. No selectors. No breakage when layouts change. No flatlines.

That shift, from “tell the machine where to look” to “let the machine see what it's looking at,” is the arc I lived through. I wrote the selectors. I fixed them when they broke. I maintained the system that needed me to understand the DOM of every site I monitored. Now I build systems that don't need any of that.

But I keep MemoryDial running. Partly because it still works. Partly because those 3,798 changes represent something I care about: the idea that the web has a memory problem, and software can fix it. The pages themselves forget. Someone should remember.

That's why this blog carries the name. MemoryDial isn't a product. It's a belief that got encoded into a Django app one weekend in 2016 and never stopped running. The web changes constantly. Most of those changes disappear. I built a small, stubborn machine that catches them before they're gone.

The Cogs are still turning. They'll turn tomorrow too.

The watch tells you to stop eating. No willpower required. Just a signal, like a fuel gauge.

John Walker called it the Eat Watch. The AutoCAD founder wrote The Hacker's Diet in 1991. An engineer's approach to the body: treat it as a system. We are bags of water. Calories go in, calories go out. Weight is the integral of the difference. Why not fat?

Walker proposed a thought experiment. You set a calorie budget for the day. The watch counts down as you eat. When you hit zero, it tells you to stop.

He never built it. This was 1991. He ran the numbers on paper, then spreadsheets.

So I built it. I'd never touched Garmin development or heard of Monkey C before. I opened Claude, started the spec, and three hours later I had an app on my wrist.

The app: a daily calorie budget, a reset hour. The watch stores both. One word at the top: EAT in green. One number below: calories remaining. When the number hits zero, the text changes to STOP in red. At your reset hour, it starts fresh.

I look at my wrist. Green. I can eat.

It's simple. Tap to add or subtract calories. The watch isn't connected to anything. It only knows what I tell it.

MyFatnessPal has the real numbers. Every meal logged, every snack timestamped. If I can sync that data to my wrist, the Eat Watch becomes what Walker imagined: a closed loop.

That's the next build.


The Eat Watch app on a Garmin watch showing green EAT text and remaining calories

A white onion almost broke me.

I was cooking dinner, logging as I went. I weighed the onion, chopped it, threw it into the pan, then opened MyFitnessPal. The first result had more fat than a stick of butter.

I knew that was wrong. I spent the next five minutes hunting through a database of user-submitted shite, guessing which entry was accurate. For an onion.

Why am I still using this crap?

Right then I decided to build my own.

It came together with Claude Code pretty quickly: a Django web app, a SQLite database, and a cheap GPT model, deployed onto a Raspberry Pi and made accessible over the web via Tailscale.

You describe food in plain English. It returns reasonable nutrition data. Just type what you ate.

When I'm cooking, I know what's going in. Last week I made chili. I pasted the ingredients straight from a YouTube description: ground beef, kidney beans, tomatoes, onion, half a packet of American cheese. Told it six portions. It came back: 340 calories, 26g protein per bowl.

When I'm out, I don't know what's in the burrito. I don't need to. I type “chipotle chicken burrito, no sour cream” and it comes back: 650 calories, 42g protein, 80% confidence. Good enough. Logged and done.
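The lookup is one structured call to a cheap model. A sketch: the schema mirrors the examples above, and the prompt wording is illustrative:

```python
import json

from openai import OpenAI

client = OpenAI()

def estimate(description: str) -> dict:
    """Plain-English food in, structured estimate out."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",              # "a cheap GPT model"
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content":
                "Estimate nutrition for the food described. Reply as JSON: "
                '{"calories": int, "protein_g": int, "confidence": 0-100}.'},
            {"role": "user", "content": description},
        ],
    )
    return json.loads(resp.choices[0].message.content)

# estimate("chipotle chicken burrito, no sour cream")
# -> {"calories": 650, "protein_g": 42, "confidence": 80} or thereabouts
```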

Here's what the big apps miss: a rough estimate you record beats a precise measurement you don't. Tracking is a habit. Habits need to be easy.

Last week I wanted to re-log yesterday's breakfast with one click. Took an hour. Now it's there, because it's mine.

And mine doesn't store passwords in plain text.