3,798 Changes

I built MemoryDial in 2016 because the web kept lying to me.

Not maliciously. Pages just changed. A headline rewrote itself between morning coffee and lunch. A price shifted overnight. A paragraph vanished like it never existed. I wanted a system that would remember what the web looked like yesterday, even when the web itself refused to.

The concept was simple. You give it a URL and a CSS selector. It watches. Every five minutes, Celery workers fan out across fourteen websites, pull the HTML, parse it with lxml and BeautifulSoup, and compare it to what they found last time. If something changed, the system logs it. I called these monitored items “Cogs” because that's what they were: small, reliable, mechanical things that turned without supervision.
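The compare-and-log step can be sketched roughly like this. It is a minimal illustration under stated assumptions, not MemoryDial's actual code: the `changes` table, its columns, and the `init_db` and `check_cog` functions are hypothetical, and the fetch-and-select step (the lxml and BeautifulSoup part) is stood in for by a plain string argument.

```python
import sqlite3
from datetime import datetime, timezone


def init_db(conn):
    # Hypothetical schema: one row per observed version of a watched element.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changes ("
        "id INTEGER PRIMARY KEY, url TEXT, selector TEXT, "
        "content TEXT, seen_at TEXT)"
    )


def check_cog(conn, url, selector, extracted_text):
    """Store extracted_text if it differs from the last stored value.

    Returns True when a new row was written. The first sighting of a
    url/selector pair is stored as the baseline and also returns True.
    """
    row = conn.execute(
        "SELECT content FROM changes WHERE url = ? AND selector = ? "
        "ORDER BY id DESC LIMIT 1",
        (url, selector),
    ).fetchone()
    if row is not None and row[0] == extracted_text:
        return False  # same as last time, nothing to log
    conn.execute(
        "INSERT INTO changes (url, selector, content, seen_at) "
        "VALUES (?, ?, ?, ?)",
        (url, selector, extracted_text,
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    return True


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    for text in ["Old headline", "Old headline", "New headline"]:
        print(check_cog(conn, "https://example.com/", "h1", text))
```

In a setup like the one described, a scheduled task would call something like `check_cog` once per Cog on each run, passing in whatever text the selector matched.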

Django 1.7 ran the first version. SQLite held the data. RabbitMQ kept the task queue honest. The whole thing fit on a single server and cost almost nothing to run.

That was ten years ago.

The system is still running. Three thousand, seven hundred and ninety-eight content changes captured since I first deployed it. Nine user accounts. Fourteen monitored sites. The Cogs keep turning.

I've upgraded the Django version twice underneath it: 1.7 to 3.2, then to 5.2.5. Each migration was its own archaeology project. South migrations became Django migrations. Python 2 idioms became Python 3. Template tags changed syntax. But the core data survived every transition. Every content change from 2016 sits in the same SQLite database that holds yesterday's scrape results.

The data held up. The selectors didn't.

A Tuesday in 2019. I checked the dashboard and found a flatline. A news site had redesigned overnight and moved its headline from `div.story-header h1` to `article header .headline-text`. The Cog didn't error. It just went silent. The element it watched no longer existed at that address. Three days of changes, gone.
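The trap here is that "the selector matched nothing" and "nothing changed" look the same from the outside. A minimal sketch of how the two could be told apart, offered as hindsight and not as a description of MemoryDial's code: `CogState`, `record_run`, and the `alert_after` threshold are all hypothetical, and `extracted` is assumed to be `None` whenever the selector finds no element.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CogState:
    last_text: Optional[str] = None
    misses: int = 0  # consecutive runs where the selector matched nothing


def record_run(state: CogState, extracted: Optional[str],
               alert_after: int = 3) -> str:
    """Classify one run as 'changed', 'unchanged', 'missing', or 'alert'."""
    if extracted is None:
        # The element is gone: count it instead of treating it as "no change".
        state.misses += 1
        return "alert" if state.misses >= alert_after else "missing"
    state.misses = 0
    if extracted == state.last_text:
        return "unchanged"
    state.last_text = extracted
    return "changed"


if __name__ == "__main__":
    state = CogState()
    for text in ["Headline A", "Headline A", None, None, None]:
        print(record_run(state, text))
```

With a five-minute schedule and `alert_after=3`, a redesign like this one would surface after roughly fifteen minutes of misses.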

I cracked open DevTools, found the new selector, updated the config. Twenty minutes. The Cog started turning again.

Six weeks later the same site changed layouts. Same drill. DevTools, new selector, twenty minutes. Then another site redesigned. Then another. I spent more time fixing selectors than I spent building the original system.

This is the part where I'm supposed to say I solved the problem elegantly. I didn't. I just kept fixing them. For years. The monitoring worked because I was stubborn, not because the approach was sound.

But the experience left a mark.

Then GPT arrived. And I understood immediately what it meant for this kind of work.

A model doesn't need a CSS path. It reads a page the way a human does: by context, not by address. The headline is the headline because it reads like one, not because it sits inside a particular div. Point it at a page, describe what you want, and it extracts it. No selectors. No breakage when layouts change. No flatlines.

That shift, from “tell the machine where to look” to “let the machine see what it's looking at,” is the arc I lived through. I wrote the selectors. I fixed them when they broke. I maintained the system that needed me to understand the DOM of every site I monitored. Now I build systems that don't need any of that.

But I keep MemoryDial running. Partly because it still works. Partly because those 3,798 changes represent something I care about: the idea that the web has a memory problem, and software can fix it. The pages themselves forget. Someone should remember.

That's why this blog carries the name. MemoryDial isn't a product. It's a belief that got encoded into a Django app one weekend in 2016 and never stopped running. The web changes constantly. Most of those changes disappear. I built a small, stubborn machine that catches them before they're gone.

The Cogs are still turning. They'll turn tomorrow too.