9 Documents Accountants Should Never Paste Into a Public LLM

Most accountants now understand in principle that pasting client material into a consumer chatbot is risky. In practice the line between "general question" and "client material" gets crossed more often than partners realise, and the categories most likely to cause a regulatory problem are not always the obvious ones. A clear list of the documents that must not go into a public large language model under any circumstances is the most useful single piece of governance a firm can put in front of its team.

This piece sets out the nine document categories that should be on every firm's prohibited list, with the specific regulatory exposure each one carries in the UK and UAE. It is not a substitute for a full AI policy, but it is the practical filter every team member can apply in real time before pasting anything.

1. AML/KYC source documents

Identity documents, proof of address, beneficial ownership information, source of funds evidence, and PEP screening results are the highest-risk category. They contain highly sensitive personal data, which can include special-category data (for example, photographs on identity documents where they are used for biometric identification), and the firm holds them under specific regulatory obligations: the UK Money Laundering Regulations 2017 and the UAE's Cabinet Decision No. 10 of 2019.

Pasting any of this into a consumer chatbot is, on the standard analysis, an unlawful disclosure under data protection law, a breach of the firm's AML record-keeping obligations, and potentially a separate breach of confidentiality. The exposure is very high; the productivity gain is essentially zero because there is no useful question a consumer chatbot can answer about an identity document anyway.

2. Engagement letters and client contracts

Engagement letters identify the client, the scope of the work, the fee, and often commercially sensitive terms negotiated specifically with that client. Pasting an engagement letter into a public chatbot, even to ask "rewrite this clause" or "summarise this", is a confidentiality breach that the firm has no contractual cover for.

The right pattern is to use a docs-only AI tool inside the firm's platform for engagement letter work, or to redact the client-identifying material and put only a generic, non-client question to the consumer tool. The "redact and paste" pattern is workable but requires discipline that the team will not always apply under time pressure.

3. Client tax returns and computations

A corporation tax computation, a personal tax return, a partnership return, or a UAE Corporate Tax computation contains the client's entire financial position. Pasting it into a consumer chatbot for "a sense check" or "to spot anything obvious" is one of the most common ways firms end up in regulatory difficulty.

The data is subject to client confidentiality, taxpayer confidentiality, and data protection obligations. The chatbot vendor is not a processor under any agreement the firm holds; the data may be retained and may be used to train future models depending on the tier. This is exactly the use case for the firm's approved docs-only AI tool, not for a consumer product.

4. Statutory accounts and management accounts

Statutory accounts that have not yet been filed at Companies House are confidential to the client. Management accounts are confidential at all times. Pasting either into a consumer chatbot - for example, to ask the model to draft a commentary or to compute variances - is a disclosure that the client has not authorised and that the firm has no defence for.

The exposure is compounded by the fact that the numbers are often material non-public information for any client that is part of a larger group or that has external investors or lenders. The downside scenario includes regulatory action and contractual damages, not just data protection enforcement.

5. Payroll data

Payroll registers, payslip data, PAYE submissions, and pension contribution records contain personal data of the client's employees, which the firm processes on the client's behalf and under the client's authority. Pasting any of this into a consumer chatbot risks breaching the firm's processor agreement with the client, the data subjects' rights under UK GDPR or the UAE personal data protection framework, and the firm's professional confidentiality obligations.

There is also a practical risk that does not always get attention: payroll data often contains identifiers (National Insurance numbers, dates of birth, salary information) that uniquely identify individuals and that cannot be effectively anonymised before pasting.

6. Client correspondence and email chains

A six-week email chain with a client contains the client's name, sometimes other parties' names, the substance of the engagement, often financial figures, and frequently information the client shared in confidence. Pasting the chain into a consumer chatbot to ask for a summary is a disclosure of all of it.

This is one of the most common shadow uses we see in firms, because the chain feels like "context" rather than "client material". It is client material in every meaningful sense. The right pattern is the docs-only AI tool with the chain uploaded as a document, not the consumer chatbot with the chain pasted as text.

7. Internal working papers

Audit working papers, tax research notes that identify the client, file notes from client meetings, and partner review notes are confidential firm material that often contains the firm's own preliminary judgements on the client's position. Pasting them into a consumer chatbot exposes both the client's information and the firm's thinking.

The risk of exposing the firm's thinking is sometimes the more commercially significant one. A judgement that was clearly marked as preliminary inside a working paper becomes, once it has left the firm, just an opinion attributed to the firm, with all the reputational consequences that follow if it turns out to be wrong.

8. Director, shareholder, and beneficial owner schedules

Lists of directors, shareholders, PSCs, ultimate beneficial owners, and trust beneficiaries are personal data in their own right and are often combined with information about holdings, addresses, and financial interests that increases the sensitivity. Pasting them into a consumer chatbot to ask for a structure diagram or a summary is an unlawful disclosure of personal data and a breach of the trust under which the firm holds the information.

For UK companies this information is partly public via the Companies House register, which sometimes leads team members to assume it is fine to share - but the version held by the firm typically contains more than the public register, including residential addresses redacted from the public file. Treat the firm's version as the sensitive one.

9. Anything covered by a non-disclosure agreement

Any document the firm holds under an NDA - typically transaction documents in M&A work, draft commercial agreements, sensitive strategy papers, restructuring plans - is contractually prohibited from being disclosed to a third party, which the chatbot vendor is. Pasting any of this into a consumer chatbot is a breach of contract, separately from any data protection or confidentiality issue.

This category often catches the most experienced team members out, because the documents look like business material rather than personal data. The NDA is what matters; the format of the document is irrelevant.

Why "redacting first" is not as safe as it sounds

A common workaround is to paste a document into a consumer chatbot after replacing the client name with "ABC Ltd" and the figures with placeholders. This reduces the risk but does not eliminate it, for two reasons. First, real-world redaction is hard to do completely - secondary identifiers (counterparty names, specific dates, regional details, sector specifics) often survive and can re-identify the client. Second, even when properly redacted, the document is still being shared with a third-party vendor that has no processor agreement with the firm, which is a position regulators are increasingly unwilling to accept.

The pattern that works is to use the firm's approved docs-only tool for any real client work, and to reserve consumer chatbots for genuinely generic questions (no client identifiers, no client-specific facts, no specific transaction details). The redact-and-paste pattern should be the rare exception, not the default.
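The redaction gap described above can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the client, the counterparty, the place, the date, and the amount are invented, and naive_redact is an assumed helper that mimics the manual "swap the name and the figures" approach rather than any real redaction product.

```python
import re

def naive_redact(text: str, client_name: str) -> str:
    """Mimic manual redaction: swap the client name and pound amounts."""
    # Replace the client name wherever it appears, ignoring case.
    text = re.sub(re.escape(client_name), "ABC Ltd", text, flags=re.IGNORECASE)
    # Replace pound amounts such as £1,250,000 or £4.2m with a placeholder.
    text = re.sub(r"£\s?\d[\d,]*(?:\.\d+)?[mk]?\b", "£[AMOUNT]", text)
    return text

# Invented sample sentence; none of these parties exist.
sample = (
    "Harbourline Foods Ltd acquired its Grimsby cold store from "
    "Northwick Logistics on 14 March 2024 for £1,250,000."
)

redacted = naive_redact(sample, "Harbourline Foods Ltd")

# Secondary identifiers the naive pass never looked for.
secondary = ["Grimsby", "Northwick Logistics", "14 March 2024"]
survivors = [s for s in secondary if s in redacted]

print(redacted)
print("Still present:", survivors)
```

The name and the figure are gone, but the location, the counterparty, and the transaction date all survive, and together they may be enough to identify the client. Even a more thorough pass would not address the second problem, which is that the text still goes to a vendor with no processor agreement.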

The working filter the team can apply in real time

A simple test before pasting anything into a consumer chatbot: would I be comfortable showing this content to a stranger in a café? If the answer is no for any reason - client confidentiality, personal data, NDA, commercial sensitivity - the answer is also no for the chatbot, and the work should go through the firm's approved tool. The test is crude but it catches the vast majority of risk cases.
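A firm that wants a mechanical backstop to the café test could sketch something like the following in Python. This is an assumption-laden illustration, not a control: the function name flag_before_paste and the four patterns are hypothetical, the patterns catch only a few obvious identifier formats, and the sample text is invented.

```python
import re

# Illustrative patterns only. Real identifiers vary widely, and most
# confidential content (names, deal terms, NDA material) matches none of them.
PATTERNS = {
    "National Insurance number": re.compile(
        r"\b[A-Z]{2}\s?\d{2}\s?\d{2}\s?\d{2}\s?[A-D]\b", re.IGNORECASE
    ),
    "email address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "UK postcode": re.compile(
        r"\b[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}\b", re.IGNORECASE
    ),
    "monetary amount": re.compile(r"[£$€]\s?\d[\d,]*(?:\.\d+)?"),
}

def flag_before_paste(text: str) -> list[str]:
    """Return the labels of any patterns found in the text."""
    return [label for label, pattern in PATTERNS.items() if pattern.search(text)]

generic = "What is the filing deadline for a UK corporation tax return?"
risky = "Employee QQ 12 34 56 C, salary £42,000, lives at LS1 4AP"

print(flag_before_paste(generic))
print(flag_before_paste(risky))
```

A non-empty result is a reason to stop. An empty result is not a clearance: an email chain or a working paper can be entirely confidential without containing a single pattern match, which is why the human test remains the primary filter.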

How Accupe helps

Accupe's docs-only AI mode gives the team a safe place to do exactly the work they are tempted to take to a chatbot. Upload the engagement letter, the tax computation, the email chain, or the working paper and ask the question in Fast, Planning, or Ultra-Detailed mode. The answer is grounded in the uploaded documents with source citations, and the data does not leave the firm's processing boundary. The AI sits alongside AML/KYC screening via OpenSanctions, the encrypted client portal, and built-in e-signatures inside one practice management platform. Per-firm pricing from £20/month.