There are two ways to be wrong about AI in professional services, and almost every firm is wrong in one of them. The first is to treat AI as a discontinuity — to assume it is about to remake the profession, displace the practitioners, and reward the firms that bet aggressively on rebuilding themselves around it. The second is to treat AI as a fad — to assume it is hype, that the existing way of doing things will reassert itself, and that any investment in it is a tax on a profession that has worked fine for a hundred years. Both views are reassuring in their certainty. Both are wrong. The reality is messier and more interesting, and getting it right requires resisting the urge to be certain about something that is still in motion.
Every conversation about AI in professional services eventually arrives at the same set of questions. Will AI replace attorneys, accountants, bookkeepers? Will small firms lose to large firms with better technology? Will technology-first competitors disrupt incumbents? These are the wrong questions, or at least they are the wrong first questions. The right first question is far more boring: where, specifically, in the work that this firm does every day, can AI make the work better?
The honest answer for most small firms today is “a few places, narrowly, with careful supervision.” That is less exciting than the broader claims, but it is what we actually see when we deploy these tools inside our firms. The places where AI works are surprising. The places where it does not are also surprising. The difference between the two has almost nothing to do with the underlying model and almost everything to do with the structure of the work — which is the part that gets the least attention in the AI discourse and that, in our experience, matters the most.
Where AI Earns Its Keep Today
Document review. Not the final review by an attorney, but the first-pass triage. Finding the relevant clauses in a hundred-page contract, surfacing the unusual provisions, comparing against a known-good template. The attorney still does the legal judgment, but she does it on a curated and annotated document instead of a raw one. The time savings are real. The accuracy improvement is also real — the AI does not get tired on page sixty, the way a human reviewer does, which means the unusual clause that hides on page sixty-two no longer gets missed.
Drafting. Standard letters, standard motions, standard engagement letters, standard responses to common client questions. The output is never publishable as-is, but it is far better than starting from a blank page. The skill is in writing the prompt correctly and in editing the output rigorously. The associate who knows how to do both does the same work in half the time. The skill of editing AI output is, importantly, not the same as the skill of drafting from scratch. It requires a different cognitive posture — a critical, suspicious, line-by-line read rather than a generative one. Firms that train their associates explicitly in this skill get more out of AI drafting than firms that simply hand the tools to the team and hope.
Research. Tax research, case research, regulatory research. AI search is good at finding the relevant authority. It is not yet good at synthesizing the authority into a defensible answer. So we use it to find what to read, not to decide what to do. The distinction is operational: AI is a research assistant, not a research conclusion. The firm that treats it as a research conclusion will eventually issue an opinion that is wrong, lose a client over it, and discover that AI hallucination is not a theoretical risk — it is a malpractice risk hiding inside a productivity tool.
Bookkeeping categorization. The marginal AI improvement here is enormous because the work is repetitive, the categories are well-defined, and the corrections are easy to learn from. The bookkeeper goes from coding every transaction to reviewing the AI’s codings. Throughput doubles. Accuracy goes up. This is the canonical example of AI fitting the structure of the work — a high-volume, well-defined, correctable task with clear feedback loops. Where the structure of the work matches the strength of the model, the value is unambiguous. Where the structure does not match, no amount of model improvement helps.
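The review loop described above can be sketched in a few lines. This is a hypothetical illustration, not a real deployment: the classifier here is a stand-in keyword matcher rather than a model, and the names (`suggest_category`, `triage`, `REVIEW_THRESHOLD`) are invented for the example. The shape is the point — the tool proposes, low-confidence items route to the bookkeeper, and her corrections are remembered.

```python
# Hypothetical sketch of an AI-assisted categorization loop: a classifier
# proposes a category with a confidence score; anything below threshold is
# routed to the bookkeeper for review, and her past corrections win outright.
# The "model" below is a toy keyword matcher standing in for a real one.

REVIEW_THRESHOLD = 0.85

def suggest_category(description, learned):
    """Return (category, confidence) for one transaction description."""
    desc = description.lower()
    for vendor, category in learned.items():
        if vendor in desc:
            return category, 1.0          # prior human correction: trust it
    rules = {"uber": ("Travel", 0.9), "staples": ("Office Supplies", 0.9)}
    for keyword, (category, conf) in rules.items():
        if keyword in desc:
            return category, conf
    return "Uncategorized", 0.0           # unknown vendor: force review

def triage(transactions, learned):
    """Split transactions into auto-coded and needs-review piles."""
    auto, review = [], []
    for txn in transactions:
        category, conf = suggest_category(txn, learned)
        (auto if conf >= REVIEW_THRESHOLD else review).append((txn, category))
    return auto, review

auto, review = triage(["Uber trip 04/12", "ACME Industrial invoice"], {})
# The bookkeeper corrects the unknown vendor once; the system learns it.
learned = {"acme industrial": "Cost of Goods Sold"}
auto2, review2 = triage(["ACME Industrial invoice"], learned)
```

The bookkeeper's job shifts from coding every line to reviewing the short pile that falls below the threshold, which is exactly the throughput change the paragraph describes.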
Where AI Quietly Fails
Anything that requires the model to understand who the client actually is, what they actually want, and what they have actually agreed to. AI does not know your client. It cannot tell you whether the answer that is technically correct is also the answer your client should hear, in the way your client should hear it, given the relationship you have with them.
Anything that involves novel judgment. The first time a fact pattern looks like X but is actually Y, AI will get it wrong, because it is averaging across cases it has seen. The exceptions are where the practitioner earns her living. AI cannot replace the practitioner there and probably should not try.
Anything that introduces material risk. We do not let AI send anything to clients without human review. We do not let AI sign anything. We do not let AI make decisions that we would not let a first-year associate make on her own. The standard is the same one we have always used for first-year work: useful, but always reviewed.
The “quiet failure” framing in this section’s title is deliberate. AI does not usually fail loudly. It fails by producing output that looks plausible, that is wrong in ways that are subtle, and that requires a knowledgeable reviewer to catch. The firms that get hurt by AI are not the firms whose AI tools crashed. They are the firms whose AI tools worked just well enough to be trusted by people who did not have the skill to verify the output. The reviewer-skill problem is the actual problem. The model-quality problem is secondary, and one the vendors will solve faster than firms will solve the reviewer-skill problem. The firms that invest in their reviewers’ AI literacy are the firms that will use AI well over the next decade. The firms that invest only in tools are buying half the answer.
The Deterministic Layer Is the Point
We have written elsewhere about the line between deterministic systems and nondeterministic ones. AI is nondeterministic by nature. The work in our firms is mostly deterministic by nature — the same kinds of matters, the same kinds of documents, the same kinds of decisions, with the same kinds of safeguards. The way we use AI is to put nondeterministic steps inside deterministic workflows, with deterministic checks on the output. This is unglamorous and it is also what works.
A modern tax controversy practice looks the same as it did five years ago from the client’s perspective. The forms are the same. The deadlines are the same. The IRS is the same. What has changed is that the steps in between — pulling transcripts, summarizing notices, drafting responses, calculating projections — happen faster and with fewer errors. The practitioner spends more time on the substantive judgment and less time on the mechanical work. That is the entire promise of AI in this kind of practice, and it is enough.
The architectural insight here is worth stating explicitly. The job of the firm is to deliver deterministic outcomes — the right legal advice, the right return, the right book of accounts. The job of the workflow is to deliver those outcomes reliably. AI is a tool that can do some of the intermediate steps faster, but it cannot be allowed to compromise the determinism of the outcome. So we wrap the nondeterministic AI steps in deterministic scaffolding: a structured input, a known-good template, a human reviewer, a checklist verification. The scaffolding is the firm’s promise to the client. The AI is the productivity multiplier inside the scaffolding. Firms that get this layering right move faster without losing reliability. Firms that get it wrong move faster and lose reliability at the same time, and the loss of reliability is not visible until the first time it matters, at which point it is too late.
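The layering described above — structured input, known-good template, nondeterministic draft, checklist verification, human gate — can be sketched as follows. Everything here is a hypothetical shape, not a real system: `draft_with_ai` stands in for a model call, and `CHECKLIST` is an invented example of deterministic output checks.

```python
# A minimal sketch of deterministic scaffolding around a nondeterministic
# step. The nondeterministic part (the model) runs inside; the checks and
# the human-review gate outside it are fully deterministic. All names are
# hypothetical stand-ins for illustration.

def draft_with_ai(template, facts):
    # Stand-in for the nondeterministic step (a model call in practice).
    return template.format(**facts)

CHECKLIST = [
    ("has client name", lambda d, f: f["client"] in d),
    ("has deadline",    lambda d, f: f["deadline"] in d),
    ("no placeholder",  lambda d, f: "{" not in d and "}" not in d),
]

def produce(template, facts):
    """Deterministic wrapper: draft, verify, then gate on human review."""
    draft = draft_with_ai(template, facts)
    failures = [name for name, check in CHECKLIST if not check(draft, facts)]
    if failures:
        return {"status": "needs_human", "failures": failures, "draft": draft}
    # Even a clean draft is only ever 'ready for review', never 'sent'.
    return {"status": "ready_for_review", "failures": [], "draft": draft}

result = produce(
    "Dear {client}: your response is due {deadline}.",
    {"client": "Acme LLC", "deadline": "June 1"},
)
```

Note that the best possible outcome of the pipeline is `ready_for_review`, not `sent` — the human reviewer is part of the scaffolding, not an optional step after it.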
The Talent Implication Most Firms Miss
The popular narrative is that AI will reduce the need for junior associates. The narrative is partially right and mostly misleading. The mechanical work that junior associates used to do — first-pass document review, basic research, template drafting — is exactly the work AI is best at. So firms will indeed need fewer hours of that work from juniors. But the firms that will thrive are the ones that take the time they used to spend supervising juniors on mechanical work and reinvest it in training juniors on the judgment work that AI cannot do. The result is the same headcount on a different developmental curve — juniors doing harder, more cognitively demanding work earlier in their careers, and reaching senior judgment maturity faster than their predecessors did.
The firms that get this wrong will hollow out their talent pipeline. They will keep the same supervision model — juniors doing mechanical work, seniors reviewing it — but with AI in the middle, which means the juniors are not actually doing the mechanical work, which means they are not building the muscle that the mechanical work used to build. Five years later, those firms will have senior associates who have never had to read a hundred-page contract from cover to cover, and who therefore cannot reliably catch the things that the AI missed. The talent risk of AI is not that it will replace the juniors. The talent risk is that it will produce a generation of seniors who never developed the underlying skill that AI is now imperfectly performing. The cure is to be deliberate about what juniors do learn, given what they no longer have to do.
What We Are Building Toward
Over the next several years we expect AI to keep moving from optional to assumed inside the firms we own. The associates we hire will use it because it makes their work better. The clients will benefit because the work will be faster, cheaper, and more accurate. The competitive advantage will accrue to firms that integrate AI carefully into their existing workflows, not to firms that try to rebuild themselves around it. Quiet integration beats loud rebranding every time.
The firms that will struggle are not the ones that are slow to adopt AI. They are the ones whose underlying processes were so undocumented and so ad-hoc that they cannot tell where AI would fit. The pre-condition for using AI well is having a real process to begin with. That has always been the pre-condition for everything else in a professional services firm, too. AI is not a way to skip the work of building a real firm. It is a multiplier that rewards firms that have already done the work. The multiplier on zero is still zero, and a lot of firms are about to discover that their AI investments are multiplying the wrong thing.
What to Do Monday Morning
Pick three tasks in your firm that are repetitive, well-defined, and currently consume meaningful associate time. Document those tasks. Then pilot AI on them, with explicit human review at every step. Measure the time savings, the accuracy delta, and the reviewer experience. The pilot is the data. Once you have the data, you can decide whether to expand the use of AI on that task — and whether to expand it to other tasks of similar shape.
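The measurement step above needs only a spreadsheet's worth of arithmetic. A hypothetical sketch of the shape of the pilot data and the two numbers worth computing — the per-task figures below are illustrative, not real measurements:

```python
# Illustrative pilot data: per-task minutes and reviewer-caught errors,
# before (baseline) and during (pilot) the AI-assisted workflow.
# The numbers are invented for the example.

baseline = {"minutes": [62, 58, 71], "errors": [2, 1, 3]}
pilot    = {"minutes": [31, 29, 40], "errors": [1, 0, 1]}

def summarize(before, after):
    """Return time saved (%) and errors avoided per task."""
    t0, t1 = sum(before["minutes"]), sum(after["minutes"])
    e0, e1 = sum(before["errors"]), sum(after["errors"])
    n = len(before["minutes"])
    return {
        "time_saved_pct": round(100 * (t0 - t1) / t0, 1),
        "errors_avoided_per_task": round((e0 - e1) / n, 2),
    }

summary = summarize(baseline, pilot)
```

Two numbers, tracked honestly across a handful of matters, are enough to decide whether the pilot earns an expansion — and they are far more persuasive inside a partnership than any vendor claim.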
Resist the temptation to deploy AI across the firm at once. The deployment that scales is the one that is preceded by a documented process and followed by measured results. The deployment that fails is the one that is announced before it is tested. There is enormous pressure inside firms right now to be seen to be using AI. The pressure is mostly cultural, not commercial, and it is causing firms to make commitments faster than their actual experience with the tools supports.
And finally, decide explicitly what you will not let AI do. Write it down. Tell the team. That list is as important as the list of what you will let AI do, because the boundary is what protects the firm from the quiet failure mode. Firms with clear lines about what AI does and does not do produce reliable work with AI in the mix. Firms with fuzzy lines produce variable work and eventually a malpractice claim. The clear-line firm is the durable firm, and the durability is the point.