Deep Work for Programmers in the Age of AI

There is a particular kind of morning every working programmer knows. You sit down at nine with a clear problem and a clean head. By noon you have touched eleven browser tabs, three Slack threads, a model that wrote half a function for you, a half-finished refactor, and a bug that is somehow worse than the one you started with. You were busy the entire time. You were rarely working.

The folklore of our profession blames this on time — too many meetings, not enough hours, the wrong sprint length. That diagnosis is comfortable because time is somebody else's fault. But time is not the resource that ran out. Attention is. And unlike time, attention is something you partly govern, something the modern internet is engineered to take from you, and something the current generation of AI tooling quietly teaches you to spend in the worst possible way.

This is an essay about treating attention as a budget: a finite, depletable, defensible resource that deserves the same engineering rigor you would give to a memory allocator or a requests-per-second limit.

Attention is not time, and the difference is the whole point

Start with the unit error. We schedule our days in hours because hours are easy to put on a calendar, but the work of building software does not happen in hours. It happens in the state of holding a system in your head.

There is a useful, slightly cheating analogy here, and it is close enough to the truth to keep. Your working memory behaves like a very small, very fast cache sitting on top of a slow, enormous store. When you are deep in a problem, the relevant pieces — the call graph, the invariant you are trying not to break, the three edge cases you noticed twenty minutes ago — are all resident in that cache. Reads are instant. You make connections you could not have made cold. This is the state in which real engineering gets done.

An interruption is a cache flush. It does not matter that the Slack message took eleven seconds to read. The cost is not the eleven seconds; the cost is the cold reload afterward — the slow, irritating process of pulling all of that working set back into fast memory, hoping you did not lose the one thread that was about to pay off. Often you did lose it. Anyone who has been pulled out of a hard debugging session for a "quick question" knows that the question was not quick. It cost you the next forty minutes, most of which you spent rebuilding what you already had.

So the right way to budget a programmer's day is not in hours but in uninterrupted blocks long enough for the cache to fill and stay warm. Two ninety-minute blocks in which nothing flushes the cache are worth more than eight hours sliced into fifteen-minute confetti. This is not a productivity-influencer flourish; it is the most consequential scheduling fact about our work, and almost every workplace ignores it.
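The arithmetic behind that comparison can be sketched with a toy model. Everything here is an assumption made for illustration: the function name `warm_minutes`, the idea that each block pays one fixed reload cost before the cache is warm, and the 20-minute figure itself, which is a placeholder rather than a measurement.

```python
def warm_minutes(block_lengths, reload_cost):
    """Total minutes spent with the working set resident.

    Toy model: every uninterrupted block first pays a fixed reload cost
    (in minutes), and only the remainder counts as warm time. A block
    shorter than the reload cost contributes nothing.
    """
    return sum(max(0, length - reload_cost) for length in block_lengths)


RELOAD_COST = 20  # assumed minutes to refill the cache; illustrative only

two_long_blocks = [90, 90]   # three hours, interrupted once
confetti_day = [15] * 32     # eight hours in fifteen-minute slices

deep = warm_minutes(two_long_blocks, RELOAD_COST)
fragmented = warm_minutes(confetti_day, RELOAD_COST)

print(f"two 90-minute blocks:  {deep} warm minutes")
print(f"eight sliced hours:    {fragmented} warm minutes")
```

Under these assumed numbers the two long blocks yield 140 warm minutes and the sliced day yields none, because no slice outlasts its own reload. The exact reload cost matters less than the shape: any fixed per-block cost punishes short blocks disproportionately.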

The web you build is the web that is eating you

Here is the uncomfortable part, and the reason this topic belongs on an engineering blog rather than a wellness one.

The fragmentation of your attention is not an accident or a personal failing. It is a deliverable. Somewhere, a team of competent engineers shipped the infinite scroll you cannot stop, the variable-reward notification that arrives at exactly the wrong cadence, the autoplay that removes the moment of decision, the red badge whose only job is to create a small open loop in your mind that you will close by tapping. These are not bugs. They are the product, and they were built by people who do the same work you do.

The mechanism is old and well understood: intermittent, unpredictable rewards are the most powerful schedule of reinforcement we know of, which is why a slot machine and a social feed feel, at the level of the nervous system, like cousins. The feed does not need to be enjoyable. It needs only to be uncertain — the next swipe might be the good one — and the loop sustains itself. Online entertainment, in its current engagement-optimized form, is a machine for converting your attention into someone else's revenue, and it is extraordinarily good at its job.

That behavioral architecture now runs through the whole browser economy: social platforms, streaming services, gamified productivity tools, and interactive entertainment sites such as Spinboss all compete for the same finite cognitive resource, sustained human attention.

You do not get to opt out of understanding this by being technical. If anything, knowing how the trick works grants a false confidence. The engineer who built the dopamine loop is no more immune to the next platform's dopamine loop than a locksmith is immune to being locked out. The defense is not understanding; it is architecture. You do not out-discipline a system designed by hundreds of people to defeat your discipline. You change the environment so the loop never starts.

What AI did to the shape of the work

The arrival of capable coding models added a second, subtler drain — and this one is harder to see because it disguises itself as productivity.

Used well, an AI assistant is a genuine accelerant: it writes the boilerplate you have written a thousand times, recalls an API signature you half-remember, drafts the test scaffold, explains an unfamiliar error. None of that is in dispute, and pretending otherwise is nostalgia, not rigor.

But notice what it does to the shape of your attention. The old loop was: hold the problem, think, write, run, observe, think again. Long arcs of sustained concentration, punctuated by feedback. The new loop is: describe, wait, skim, accept or reject, describe again. It is fast, it is shallow, and it is addictive in precisely the same way the feed is addictive — a variable reward on a short cycle. Sometimes the model nails it. Sometimes it produces confident garbage. You never quite know which, so you keep pulling the lever.

The failure mode is specific and worth naming. When you generate code instead of writing it, you can ship something that works without ever building the mental model of why it works. The cache never filled, because you outsourced the part of the job that fills it. The function passes its tests, the PR merges, and three weeks later — when the edge case shows up in production — nobody on the team holds the model in their head, including the person who "wrote" it. You have traded a small amount of present effort for a large amount of future confusion, and confusion is the most expensive thing in software.

This is the AI-era version of an old craftsmanship principle. The reason senior engineers were always told to read the code they copied from Stack Overflow was never about attribution. It was that pasted code you do not understand is a liability with a delayed fuse. The model has not changed the principle. It has only made the paste button irresistible and the volume enormous.

A working discipline, not twenty tips

What follows is not a listicle, because a list of twenty tips is itself a fragmented-attention artifact — something to skim and feel productive about without changing anything. These are a small number of rules, chosen because they are load-bearing. Adopt one fully rather than all of them weakly.

Protect one long block before the day's entropy arrives. For most people the first block after they sit down is the only one the world has not yet contaminated. Spend it on the hardest thing on your plate, before the meetings and messages start flushing your cache. Not email. Not "warming up" by triaging tickets. The hard thing, cold, first. If you do nothing else here, do this.

Treat every notification as an interrupt with a cost model. An operating system does not service interrupts the instant they fire; it masks them, batches them, and handles them on its own schedule. Do the same. Notifications off by default. Messages checked on a cadence you set — twice a morning, say — not on the cadence the sender chose for you. The objection is always "but what if it's urgent?" Genuinely urgent things have a second channel, and almost nothing is genuinely urgent. The feeling of urgency is, more often than not, the open loop doing its job.
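The mask-and-batch idea can be sketched in a few lines. This is a minimal illustration, not a real notification API: `Notification`, `MaskedInbox`, and the `urgent` flag (standing in for the "second channel") are all hypothetical names invented for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Notification:
    sender: str
    text: str
    urgent: bool = False  # hypothetical stand-in for the second channel


@dataclass
class MaskedInbox:
    """Holds notifications until the reader decides to look."""

    pending: list = field(default_factory=list)

    def deliver(self, note: Notification) -> Optional[Notification]:
        """Let urgent notes through immediately; queue everything else."""
        if note.urgent:
            return note
        self.pending.append(note)
        return None

    def check(self) -> list:
        """Drain the queue in one batch, on the reader's own cadence."""
        batch, self.pending = self.pending, []
        return batch


inbox = MaskedInbox()
inbox.deliver(Notification("ci-bot", "build passed"))
inbox.deliver(Notification("teammate", "quick question?"))
page = inbox.deliver(Notification("pager", "prod is down", urgent=True))

morning_batch = inbox.check()  # the only moment the queued notes cost attention
```

The design choice worth noticing is that `deliver` never interrupts for ordinary messages; the sender controls when a note arrives, but only `check` controls when it is read.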

One screen, one problem. The single most effective environmental change is brutally low-tech: close the tabs. Do not minimize them; close them. An open tab is an open loop, and a screen full of them is a screen full of invitations to flush your cache. The browser is where most engineers lose the afternoon, and they lose it one "quick check" at a time.

Batch the model; do not converse with it. Decide what you want, write it down as you would a small spec, and ask once. Read the whole output as code, the way you would review a colleague's PR — looking for the misunderstanding, not skimming for green. Then close the chat. The trap is the open-ended back-and-forth, the lever-pulling. A model is a power tool, and you do not idle a circular saw and wave it around while you think about your next cut.

Anchor on the next failing test. When you do get pulled out (and you will), the cheapest way back into the cache is a concrete, mechanical re-entry point. "What is the next thing that should fail, and why?" is a better anchor than "where was I?" because it forces the working set back into memory rather than letting you drift into triage. Leave yourself that anchor before you step away: a comment such as "// TODO: this assertion should now pass", placed at the exact line you were thinking about.
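As one possible shape for such an anchor, here is a small sketch. The function `merge_intervals` and its bug are hypothetical, invented only to give the anchor something to point at; in a real project the check would live in whatever test runner the codebase already uses.

```python
def merge_intervals(intervals):
    """Hypothetical work in progress: merge overlapping (start, end) pairs."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start < merged[-1][1]:
            # Overlaps the previous interval, so extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def anchor_touching_intervals_should_merge():
    # RE-ENTRY ANCHOR, written before stepping away.
    # Next thing that should fail, and why: (1, 2) and (2, 3) touch but are
    # left unmerged, because the comparison above is `<` and probably needs
    # to be `<=`. Start here.
    assert merge_intervals([(1, 2), (2, 3)]) == [(1, 3)]
```

Calling `anchor_touching_intervals_should_merge()` currently raises an `AssertionError`, which is the point: the failure and the comment together say what you were about to do and why, so the first minute back is spent on the problem rather than on reconstruction.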

Read source code that nobody is paying you to read. This is the long game, and it is the same advice this site has given before because it remains the best advice there is. Thirty minutes a week inside a real system you did not write — a database, an interpreter, a window manager — is sustained-attention training disguised as learning. It rebuilds the exact muscle the feed and the prompt loop atrophy: the patience to let a structure reveal itself slowly. You cannot buy that muscle and a model cannot lend it to you.

Refusing to be confused, applied to yourself

The strongest engineers share a quality that is easy to recognize and hard to name: they refuse to be confused. When reality and expectation diverge, they treat the gap as the most interesting thing in the room and they do not look away until it closes.

The argument of this essay is simply that your own attention deserves the same refusal. When you finish a day that felt busy and produced nothing, that is a gap between expectation and reality, and the correct response is not to absorb it silently and blame the calendar. It is to treat the lost afternoon as a bug — to ask what flushed the cache, what loop you got caught in, which lever you kept pulling — and then to change the architecture so it cannot happen the same way tomorrow.

Time will keep arriving whether you manage it or not; the calendar fills itself. Attention is the part you actually own, the part the whole rest of the internet is competing to spend on your behalf, and the part on which every piece of real work you will ever ship quietly depends. Budget it like it is scarce, because it is the scarcest thing you have.

Frequently asked questions

Is "deep work" just a productivity trend, or does it matter specifically for programmers?

It matters more for programmers than for almost any other profession, because the core of the work — holding a system's structure and invariants in working memory — collapses the moment that working set is flushed by an interruption. The cost of a context switch in software is not the seconds the interruption took, but the slow reload of everything you had assembled. That makes long, uninterrupted blocks the single most valuable scheduling unit for engineers, regardless of what the surrounding methodology is called.

Does using AI coding tools hurt focus, or help it?

Both, depending on how you use them. As a power tool — invoked deliberately, given a clear spec, its output reviewed as code — an assistant removes drudgery and protects your attention for the parts that need it. As an open-ended chat you keep pulling on for variable rewards, it fragments concentration in the same way an infinite feed does, and it can let you ship code whose underlying model you never actually built. The discipline is to batch your requests and read the output critically rather than converse with the tool continuously.

Why is online entertainment so hard to resist even when you understand how it works?

Because the engagement loop targets a mechanism — intermittent, unpredictable rewards — that understanding does not switch off. Knowing that a feed uses variable reinforcement does not make you immune to variable reinforcement, just as a locksmith is not immune to being locked out. The effective defense is environmental rather than cognitive: remove the loops from your working environment so they never start, instead of relying on willpower to win a contest engineered to be unwinnable.

What is the single most effective change a developer can make?

Protect one long, uninterrupted block at the start of the day and spend it on the hardest problem before meetings and messages begin. Most people's first block is the only stretch the day has not yet contaminated, and using it for triage or email wastes the one period when sustained attention is freely available.

How do I get back into a hard problem after I have been interrupted?

Re-enter through a concrete, mechanical anchor rather than a vague "where was I?" The cheapest anchor is a note about the next thing that should fail and why — ideally left at the exact line you were thinking about before you stepped away. Phrasing the re-entry as a specific failing test forces your working set back into memory instead of letting you drift into shallow triage.