William Smith
CONVERSATIONS WITH CODE

How one wrong domain constant broke four internal links at once

The optimizer is the unglamorous part of the AI pipeline — and where the most teachable bug lived. Here's the break, the fix, and why one constant solved it.

When I wrote about the AI system behind this site, I described a piece of the pipeline called the optimizer. It's the pass that weaves internal links into a finished draft and handles the on-page SEO stuff — meta descriptions, alt text, that kind of thing. Not the exciting part of the system. Nobody's writing breathless posts about their internal linker.

But it's where one of the most teachable bugs in the whole setup lived, and the fix is the kind of thing that's useful way beyond this specific project.

The Content Optimizer workbench — every shipped title and meta change gets scored, with a verdict once Search Console catches up.

What the optimizer actually does

After a draft clears the writing and review stages, the optimizer gets it. Its main job is linking: it looks at the finished article, finds noun phrases that could anchor a link, and matches them against a list of real published posts on the site.

That last part matters more than it sounds. The linker only ever pulls from a canonical index of posts that actually exist — a JSON file that rebuilds from the live site every time something publishes. So it can't invent a URL. It can't link to a post that doesn't exist yet, or hallucinate a slug it thinks should be there. The index is the only source it's allowed to use.
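A minimal sketch of that "index only" rule, assuming the index is an array of objects with a `slug` field. The shape of the file, the sample entries, and the `resolveSlug` name are all made up for illustration; the real index may look different.

```javascript
// Hypothetical sample of the canonical index: one entry per published post.
const linkIndex = [
  { slug: "ai-workflow-overview", title: "Sample post A" },
  { slug: "scheduler-deep-dive", title: "Sample post B" },
];

// Build a lookup once so membership checks are cheap.
const publishedSlugs = new Set(linkIndex.map((post) => post.slug));

// Return the slug only if it is in the index; otherwise return null.
// The linker never gets a chance to guess at a slug that "should" exist.
function resolveSlug(candidateSlug) {
  return publishedSlugs.has(candidateSlug) ? candidateSlug : null;
}
```

Anything the lookup returns as `null` simply never becomes a link, which is the whole point: the index decides, not the model.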

Anchor quality gets filtered too.

Single-word anchors — "build," "use," "make" — are blocked. The system prefers multi-word noun phrases, something like "your AI workflow" or "the scheduling pass." The reasoning is straightforward: a one-word anchor tells a reader almost nothing about where they're going.
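The single-word block can be sketched in a few lines. This is only the word-count half of the filter, under the assumption that an anchor arrives as a plain string; `isAcceptableAnchor` is a hypothetical name, and deciding whether a phrase is really a noun phrase is not shown here.

```javascript
// Accept an anchor only if it contains at least two words.
function isAcceptableAnchor(phrase) {
  const words = phrase.trim().split(/\s+/).filter(Boolean);
  return words.length >= 2;
}

const candidates = ["build", "your AI workflow", "use", "the scheduling pass"];
const usable = candidates.filter(isAcceptableAnchor);
// usable is ["your AI workflow", "the scheduling pass"]
```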

There's also a separate autonomous fixer that handles on-page metadata: a script called site-health.js. It fills in missing meta descriptions, image alt text, and format tags. Crucially, it only ever adds missing metadata; it never overwrites content that's already there. It caps how many fixes it runs per session, throttles itself to once a day, and prioritizes high-traffic pages first. Boring, responsible, exactly right.
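Here is a rough sketch of those guardrails, not the actual contents of site-health.js. The field names (`metaDescription`, `monthlyViews`), the cap value, and both function names are assumptions made for the example, and the once-a-day throttle is left out.

```javascript
const MAX_FIXES_PER_RUN = 2; // hypothetical cap; the real number isn't stated

// Pick which pages to fix this run: missing metadata only, busiest pages first.
function planMetadataFixes(pages, maxFixes = MAX_FIXES_PER_RUN) {
  return pages
    .filter((page) => !page.metaDescription) // skip pages that already have one
    .sort((a, b) => b.monthlyViews - a.monthlyViews) // high traffic first
    .slice(0, maxFixes) // respect the per-run cap
    .map((page) => page.slug);
}

// Add a description only when none exists; an existing value is left alone.
function applyMetaDescription(page, generated) {
  return page.metaDescription ? page : { ...page, metaDescription: generated };
}
```

The add-only check lives in the function that writes, not just the one that plans, so a planning mistake still can't overwrite something a human wrote.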

Here's where it gets useful.

At some point before I had the domain handling locked down, the internal linker was pointing at the wrong domain variant. Not a totally wrong URL — just the wrong form of the right URL. Think thedaringcreatives.com versus www.thedaringcreatives.com — a canonical URL handling problem, basically.
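To make "the wrong form of the right URL" concrete, here is a small sketch that flags links whose host differs from the canonical one only by a leading `www.`. It uses the standard `URL` class. The `example.com` host is a placeholder, and which form counts as canonical is an assumption for the example.

```javascript
const CANONICAL_HOST = "www.example.com"; // placeholder; the canonical form is assumed

function stripWww(host) {
  return host.replace(/^www\./, "");
}

// True when the link points at the same site but the non-canonical variant.
function isWrongVariant(link) {
  const host = new URL(link).hostname;
  return host !== CANONICAL_HOST && stripWww(host) === stripWww(CANONICAL_HOST);
}
```

A link to an unrelated site returns `false` here; the check only cares about near-misses on your own domain, which are the ones that are easy to overlook.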

The result: four broken internal links shipped in a published revision before I caught them.

Four. In one pass. Which is the kind of thing that's embarrassing but also clarifying, because it forces you to ask: where exactly is the domain being set, and how many places does it live?

The answer, before the fix, was: too many.

One constant. That's it.

The fix is almost annoyingly simple.

There's now a single constant — SITE_BASE_URL in the linker file — that is the one and only place the domain is defined. Every internal link the system builds is constructed as ${SITE_BASE_URL}/${slug}/. Nothing else in the pipeline inlines the domain directly. The project rule is now "never inline the domain anywhere."
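In code, the pattern looks roughly like this. `SITE_BASE_URL` and the `${SITE_BASE_URL}/${slug}/` shape come from the rule above; the placeholder value, the `buildInternalUrl` helper name, and the slash trimming are assumptions added for the sketch.

```javascript
// The one and only place the domain is defined (placeholder value).
const SITE_BASE_URL = "https://www.example.com";

// Every internal link goes through this helper; nothing else inlines the domain.
function buildInternalUrl(slug) {
  // Trim stray slashes so the result never contains "//" after the host.
  const cleanSlug = slug.replace(/^\/+|\/+$/g, "");
  return `${SITE_BASE_URL}/${cleanSlug}/`;
}

const url = buildInternalUrl("my-post");
// url is "https://www.example.com/my-post/"
```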

What this means practically: if the domain ever changes — migration, rebranding, whatever — it's one edit. Not thirty. Not "find and replace and hope you got them all." One line.

I understand that for seasoned developers, this stuff is a no-brainer. But if you're reading this and you aren't a seasoned developer? Not as obvious.

I find this kind of solution satisfying in a way that's hard to explain. It's not clever. It doesn't require a new tool or a new model. It's just the discipline of centralizing the thing that was scattered. The bug existed because the domain was being assumed in multiple places, and assumptions compound.

Why this is worth thinking about if you're building anything automated

Most creators and freelancers I talk to are somewhere on the spectrum of AI use — from "I use ChatGPT to help draft things" to "I'm starting to string tools together into something that runs on its own." I'm somewhere in the middle of that, honestly. I don't think I'm operating at a level that's above most of the people reading this.

But the broken-links bug is a good example of something that shows up at every level of automation: the more a system runs without you watching, the more expensive scattered assumptions become.

When you're doing everything manually, a wrong domain is something you notice immediately. You paste the link, you see it's wrong, you fix it. When a system is generating and inserting links on its own, that same wrong assumption can propagate across multiple articles before you catch it. The blast radius of a bad assumption scales with how automated the system is.

The answer isn't to automate less. If you're building an AI workflow, the move is to find the assumptions and centralize them. One constant. One index. One source of truth per thing that matters.

The index rebuild is the other piece worth stealing

The canonical link index — the JSON file that lists every real published post — rebuilds immediately after a publish. Same-day links work because of this. If I publish a post at 10am and the optimizer runs on a new draft at noon, the noon draft can link to the 10am post.
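One way to picture "rebuilds immediately after a publish" is to make the rebuild part of the publish step itself. This sketch uses in-memory stand-ins for the live site and the JSON file; every name in it is hypothetical, and the real pipeline presumably reads from the site and writes to disk.

```javascript
// In-memory stand-ins for the live site and the JSON index file.
const publishedPosts = [];
let linkIndexJson = "[]";

// Regenerate the index from whatever is actually published right now.
function rebuildIndex() {
  const entries = publishedPosts.map(({ slug, title }) => ({ slug, title }));
  linkIndexJson = JSON.stringify(entries);
}

// Publishing and rebuilding happen together, so the index can't lag behind.
function publish(post) {
  publishedPosts.push(post);
  rebuildIndex();
}

// What the linker sees when it runs later in the day.
function linkableSlugs() {
  return JSON.parse(linkIndexJson).map((entry) => entry.slug);
}
```

Because `publish` calls `rebuildIndex` directly, there is no window where a post is live but missing from the list the linker reads.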

That sounds like a minor detail. It's actually the thing that makes the whole linking system trustworthy.

If the index were stale (rebuilt once a week, say, or by hand), you'd end up with a linker that either misses recent posts or, worse, tries to link to posts it thinks exist based on an outdated list. The freshness of the index is what keeps the linker honest.
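If you wanted a guard against staleness, a simple version compares two timestamps: when the index was last built and when the most recent post went live. This is an illustrative check, not something the pipeline is known to include; `isIndexStale` and the sample timestamps are made up.

```javascript
// Stale means something was published after the index was last rebuilt.
function isIndexStale(indexBuiltAt, lastPublishedAt) {
  return new Date(lastPublishedAt).getTime() > new Date(indexBuiltAt).getTime();
}

// Index built at noon, last publish at 10am: fresh.
const fresh = isIndexStale("2024-01-01T12:00:00Z", "2024-01-01T10:00:00Z"); // false
// Index built at 10am, a post published at noon: stale.
const stale = isIndexStale("2024-01-01T10:00:00Z", "2024-01-01T12:00:00Z"); // true
```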

For anyone building something similar: whatever your "source of truth" is for your content, your tools, your client list, your product catalog — the update frequency of that source matters as much as its existence. A source of truth that's six weeks out of date is just a different kind of wrong assumption.

What this series is actually about

I want to be clear about why I'm writing these deep dives, because it's not to show off a system that works perfectly. It shipped four broken links. There are probably other bugs I haven't found yet.

The reason I'm documenting this stuff is that most of the writing about AI automation for small operators is either very high-level ("AI can help your business!") or very technical in a way that assumes you're already a developer. There's not much in the middle for someone who's building something real, running into real problems, and figuring it out as they go.

I'm in that middle. And the broken-links bug is more useful to you than a polished success story, because it shows the actual shape of the problem and the actual shape of the fix.

The optimizer is unglamorous. It's link weaving and metadata filling and domain constants. But it's also where the system either earns trust or loses it — because broken links on a published post are visible to readers in a way that a slightly off meta description isn't. The unglamorous parts are often the load-bearing ones.

If you want the full picture of how this pipeline is structured, the hub article is the place to start. The other deep dives in this series get into the research and drafting side, and the scheduler — which has its own story worth telling separately.
