Why "ChatGPT for Tax Research" Is the Wrong Frame

A persistent meme inside accounting firms is that ChatGPT is the modern equivalent of a tax research subscription - a faster, cheaper way to get to the answer on a tricky technical point. The framing is intuitive, but on close inspection it is wrong on three different levels at once. The model is not designed for the task, the failure mode is unusually dangerous in tax work, and there are better-suited alternatives that are not significantly more expensive.

This piece argues that the right frame is not "ChatGPT for tax research" but "AI for the structure around tax research" - the framing of the question, the production of the research note skeleton, the summarisation of the source materials the researcher has already retrieved. The technical answer itself remains the work of a qualified human reading primary sources. The distinction is not pedantic; it is the difference between a useful productivity gain and a foreseeable claim.

What "tax research" actually is

A tax research exercise in a UK accounting firm typically involves identifying the legislative provision that applies (a section of CTA 2009, ITTOIA 2005, VATA 1994, ITEPA 2003, TCGA 1992, or another of the principal taxing acts), reading the HMRC manual guidance on that provision, considering any leading cases, working through the facts of the client's position, and producing a written note that documents the analysis and the conclusion. For UAE work the comparable cycle covers the Corporate Tax Law, the Cabinet and Ministerial Decisions that implement it, and the FTA's public clarifications.

The output is a document that may end up being relied on by the client, scrutinised by a regulator, or produced in litigation. It needs to be defensible against the question "where did you get that?" with a citation to a primary source the questioner can independently verify.

What ChatGPT actually is

ChatGPT and its consumer-grade peers are general-purpose language models trained on a broad corpus that includes some tax material but is not curated, current, or jurisdiction-specific. The model produces plausible-sounding answers from statistical patterns in its training data. It is not connected by default to the current text of the legislation, the up-to-date HMRC manuals, the latest case law, or the FTA's recent public clarifications. It does not know what was amended in the last Finance Act unless that information happened to be in its training data, and even then it may blend the amended rule with older versions of the same provision.

This matters because tax law changes constantly. A model trained even six months ago may give a confident answer that has since been overtaken by amendment, by a tribunal decision, or by a published HMRC change of practice. The model will not flag this; it will simply give the older answer with full confidence.

The dangerous failure mode

The most studied failure mode of consumer chatbots in legal and tax queries is the fabricated authority. The model produces a confident-sounding answer that cites a section number, a case name, or an HMRC manual paragraph that does not exist. Or it cites a real provision but for a proposition the provision does not support. Or it cites an authority that has been superseded.

A reader who is not a tax specialist cannot tell the difference between a real and a fabricated citation without going to check. A reader who is a tax specialist usually catches it but has now wasted time checking, which defeats the productivity claim. And in either case the answer was offered with no warning that it might be wrong - the model does not know what it does not know.
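One way to make the checking step systematic is to pull every citation-like string out of a draft and treat each one as unverified until a person has found it at source. The sketch below is illustrative only: the two patterns, the function name extract_citations, and the sample draft (whose citations are deliberately made up) are all assumptions, and real drafts cite authorities in many more forms than these patterns catch.

```python
import re

# Hypothetical patterns for illustration; they cover only two common citation shapes.
PATTERNS = {
    "statute": re.compile(
        r"\b(?:[Ss]ection|s)\.?\s*(\d+[A-Z]*)\s+"
        r"(CTA 2009|ITTOIA 2005|VATA 1994|ITEPA 2003|TCGA 1992)\b"
    ),
    "neutral_citation": re.compile(
        r"\[(\d{4})\]\s+(UKFTT|UKUT|UKSC|EWCA Civ)\s+(\d+)"
    ),
}


def extract_citations(draft: str) -> list:
    """Return every citation-like string in the draft, each marked unverified."""
    found = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(draft):
            found.append({"kind": kind, "text": match.group(0), "verified": False})
    return found


if __name__ == "__main__":
    # Made-up citations, used only to exercise the patterns.
    sample = (
        "Relief is claimed under section 9999 CTA 2009; "
        "see Example v HMRC [2099] UKFTT 1 (TC)."
    )
    for item in extract_citations(sample):
        print(f"[ ] {item['kind']}: {item['text']}")
```

The output is a checklist, not a verdict: a string that matches a pattern may still refer to a provision or decision that does not exist, so each item stays unverified until someone has read it at source.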

Why this matters for a working paper file

A tax research note that ends up on the working paper file needs to be defensible. If the file shows that the conclusion was reached by asking a general-purpose chatbot a question and copying the answer, the firm has no defence to a regulator who asks where the analysis came from. The chatbot is not a source; it is a stylistic veneer over an unknown collection of training data.

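This is not a hypothetical concern. The First-tier Tribunal (Tax Chamber) and the various professional bodies have already begun to encounter cases where AI-generated analysis with hallucinated authorities has been put forward in support of a position. In Harber v HMRC (2023), for example, the tribunal found that the decisions a taxpayer relied on had been generated by an AI tool and did not exist. The reputational cost to the practitioner is meaningful.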

The right frame - AI for the structure

The framing that works is to use AI for the structure around the research, while the research itself remains the work of a qualified person reading primary sources. Concretely:

AI lists the provisions and manuals that are likely to be relevant - the researcher then reads them at source
AI produces the skeleton of the research note (facts, issues, applicable law, analysis, conclusion) - the researcher fills each section using primary sources
AI summarises a document the researcher has retrieved themselves (a manual page, a tribunal decision) - the researcher confirms the summary by reading the source
AI surfaces the questions the researcher should ask the client before the analysis is complete - the researcher then asks them
AI drafts the covering email to the client once the analysis is signed off - the partner reviews and sends
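The division of labour in the list above can be made explicit in the note itself. The following is a minimal sketch, assuming a firm wanted to track which sections of a note have been filled and signed off by a named person; the Section class, its field names, and the outstanding helper are hypothetical and not taken from any particular tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The five headings named in the note skeleton above.
SECTION_NAMES = ("facts", "issues", "applicable law", "analysis", "conclusion")


@dataclass
class Section:
    heading: str
    text: str = ""                                    # written by the researcher
    primary_sources: List[str] = field(default_factory=list)
    verified_by: Optional[str] = None                 # named human reviewer


def new_skeleton() -> List[Section]:
    """An empty note: structure only, no technical content."""
    return [Section(heading=name) for name in SECTION_NAMES]


def outstanding(note: List[Section]) -> List[str]:
    """Headings that still lack text or a named human reviewer."""
    return [
        s.heading
        for s in note
        if not s.text.strip() or s.verified_by is None
    ]
```

A note would be treated as ready for partner review only when outstanding(note) returns an empty list. The skeleton is the part an AI tool can produce; the text, the sources, and the reviewer's name are supplied by people.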

Where docs-only RAG changes the picture

A separate question is whether a docs-only AI tool, where the model has only the documents you uploaded, can be used for tax research. The honest answer is that it depends on what you upload. If the documents are the firm's own technical library - the curated set of HMRC manuals, the legislation, the firm's previous research notes - and the tool is forced to answer only from those documents with citations, the failure mode is much narrower. The model can still misinterpret a passage, but a fabricated authority becomes far less likely and far easier to catch, because every citation has to point to a passage in the uploaded set.

Even then, the output is best treated as a first-pass summary that the researcher verifies against the source, not as the answer. The discipline of going to the primary source remains. What changes is that the AI has saved fifteen minutes of locating the right passage.
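The "answer only from those documents, with citations" constraint can also be checked mechanically after the model has answered. The sketch below assumes the tool returns each citation as a pair of document identifier and quoted passage; that shape, the function name check_citations, and the sample library are assumptions for illustration, not a description of how any specific product works.

```python
from typing import Dict, List, Tuple


def _normalise(text: str) -> str:
    """Collapse whitespace and lower-case so line breaks do not defeat matching."""
    return " ".join(text.split()).lower()


def check_citations(
    library: Dict[str, str],
    citations: List[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return the citations whose quoted passage is NOT in the named document."""
    failures = []
    for doc_id, quote in citations:
        source = library.get(doc_id)
        if (
            source is None                      # cites a document nobody uploaded
            or not quote.strip()                # cites nothing at all
            or _normalise(quote) not in _normalise(source)
        ):
            failures.append((doc_id, quote))
    return failures


if __name__ == "__main__":
    # Placeholder text standing in for an uploaded manual page.
    library = {"manual-page": "Relief is available   where the conditions are met."}
    answer = [
        ("manual-page", "relief is available where the conditions"),
        ("manual-page", "relief is always available"),
    ]
    print(check_citations(library, answer))
```

A check like this confirms only that the quoted words exist in the uploaded document. Whether the passage supports the proposition it is cited for is still a question for the researcher reading the source.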

The professional duty point

The ICAEW Code of Ethics, the ACCA Rulebook, and the UAE auditor regime all require members to exercise professional competence and due care. Reliance on a tool that is known to fabricate authorities, without independent verification, is unlikely to meet that standard if the matter goes wrong. A partner who can show that the firm uses AI as part of a structured process - with a written policy, an approved tool, source citation, and human review - is in a defensible position. A partner who can only show that someone asked ChatGPT is not.

A note on cost

The cost argument for consumer chatbots over proper tax research subscriptions is weaker than it first appears. The marginal cost of a single research session in a proper subscription tool is small relative to the time it saves and the risk it avoids. The cost of a single wrongly advised position - in fees written off, professional indemnity exposure, and reputational damage - is many multiples of the annual subscription. The real appeal of the consumer chatbot is convenience rather than price: it feels easier to type a question into a chat box than to navigate a subscription tool. That is a workflow problem to solve, not a reason to take the risk.

How Accupe helps

Accupe's docs-only AI mode is built for exactly this kind of structured use. Upload the firm's technical library, the client's facts, and the relevant correspondence; ask the question in Planning mode for a structured note skeleton or in Ultra-Detailed mode for a more thorough first-pass analysis; each output cites its source passage. The tool refuses cleanly when the answer is not in the documents, which is the right behaviour for a research task that has to end with a citation. Per-firm pricing from £20/month.