
Last month, the UK’s Financial Reporting Council (FRC) published a 60-page document titled "Generative and Agentic AI Guidance: Risks, mitigations and illustrative examples." It is formally addressed to the central technical teams within audit firms, i.e., the people building, procuring, and deploying AI tools for use in engagements. But if you're a senior auditor or audit manager whose firm has already started using these tools, or is weighing up whether to, this guidance sets the operational standard you will eventually be held to. Understanding what it actually says is not optional.
It is worth stating clearly that the guidance wasn’t developed because something had gone wrong. The profession's instinct is to treat any new regulatory publication as a response to a problem, but this one isn't. It's an attempt to define what good looks like before the problems emerge. That makes it more useful, and more demanding, than a standard inspection finding.
The FRC's framework works in two layers. The first identifies three categories of risk: deficient AI output, misuse of that output, and methodology that doesn't comply with auditing standards. The second identifies four categories of mitigation that firms should use to address those risks. Working through the mitigations tells you quite a lot about where most firms currently stand.
The first mitigation asks whether the AI tools being used in the audit were designed and developed in a way that supports audit quality. This isn't a question most auditors have historically been required to answer. The tool arrives, the training happens, the tool gets used.
But the FRC is asking firms to take a position on whether the system itself is fit for audit purposes, not just whether a particular output looks right. That requires either meaningful oversight of third-party tools being procured or in-house development processes where this question is built in from the start.
For most firms using third-party AI tools, this means procurement and quality control conversations that haven't typically been had. What is the training data? What are the documented failure modes? What testing has been done against audit-specific tasks? These are not questions vendors will answer unprompted.
The FRC also asks whether firms can demonstrate that the AI systems they use meet a defined quality bar. This is where the guidance starts to feel genuinely unfamiliar. Auditors are used to certifying their own work. Certifying the tools they used to do that work is a different kind of accountability.
The practical implication: firms need documentation not just of what the tool did on a particular engagement, but of whether the tool is appropriate for audit use at all. For most firms, that file doesn't exist yet.
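One way to picture that tool-level file is as a record kept separately from any engagement file, listing the evidence the firm holds about the tool itself. The sketch below is illustrative only: the class name, the tool name, the document reference, and the three evidence items (taken from the vendor questions above) are assumptions, not a structure the FRC prescribes.

```python
from dataclasses import dataclass, field

# Hypothetical evidence items, drawn from the vendor questions in the text.
# The FRC guidance does not prescribe this list.
REQUIRED_EVIDENCE = (
    "training_data_provenance",
    "documented_failure_modes",
    "audit_specific_testing",
)


@dataclass
class ToolFitnessFile:
    """Tool-level record, kept separately from any engagement file."""

    tool_name: str
    # Maps an evidence item to a reference for the supporting document.
    evidence: dict[str, str] = field(default_factory=dict)

    def missing_evidence(self) -> list[str]:
        """Return the evidence items with no document reference on file."""
        return [item for item in REQUIRED_EVIDENCE if not self.evidence.get(item)]


# Illustrative usage: only one of the three items has been obtained so far.
tool_file = ToolFitnessFile(
    tool_name="contract-review-assistant",  # made-up tool name
    evidence={"documented_failure_modes": "vendor-failure-modes.pdf"},
)
print(tool_file.missing_evidence())
# ['training_data_provenance', 'audit_specific_testing']
```

How much evidence is enough remains a matter of professional judgement; a list like this only makes the gaps visible.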
The mitigation covering training and governance looks straightforward on the surface. The message? Train your people. Have a governance framework. Most firms have done something in this area, even if informally.
But the FRC's version is more demanding than a lunchtime session and a one-pager of dos and don'ts. The question being asked is whether auditors using AI tools genuinely understand their limitations, including the non-obvious ones, such as the tendency to produce confident-sounding but subtly incorrect outputs in technical domains.
The guidance uses the example of a tool performing contract review for revenue recognition risk. An auditor who knows that AI can hallucinate is not the same as an auditor who knows when and why it hallucinates in the specific task of identifying performance obligations across a large contract portfolio. The FRC is asking for the latter.
Human review is the category that tends to generate the most nods in the room. “Yes, of course, there's a human involved. Yes, of course, we review the output.” But the guidance asks something more specific: is that review meaningful or nominal?
Consider the summarisation of board minutes, another use case that the FRC explicitly cites. The path of least resistance is to let the AI summary set the agenda: if the tool's output looks clean, the auditor moves on. The FRC's point is that this is not the same as the auditor independently considering what matters in those minutes and using the AI as a supporting tool.
Professional skepticism has always required auditors to maintain an independent mindset when assessing management assertions. The guidance extends that principle to AI outputs. If you would not accept a claim from management without testing it, the same should apply to a model's conclusion.
All four categories sit under one principle. The FRC states clearly that technology changes but accountability doesn't. Firms and the individuals within them remain accountable for audit conclusions, regardless of what tools were used to reach them.
And the FRC is careful about how it frames the four mitigations. They are not mandates with fixed requirements. The guidance says explicitly that "the nature and extent of activities in each category is a matter of professional judgement." What firms need to be able to demonstrate is that they have obtained appropriate confidence in their AI tools. The bar scales with the risk.
That framing matters because it changes what firms are being asked to do. This isn't a compliance checklist to be completed once. It's a framework for thinking about whether the tools you're using have been properly governed, and whether you could demonstrate that to an inspection team.
The guidance was written, as the FRC put it, to support adoption, not to slow it down. For firms already using generative or agentic AI in engagements, the question is whether current governance, documentation, and review processes would hold up against the four-part framework. For firms not yet deploying these tools, the question is whether the infrastructure is being built now, before deployment, rather than hastily after something goes wrong.
The audit trail of a firm that has genuinely engaged with this guidance looks different from that of one that hasn't. The first can demonstrate that AI was used in a controlled, documented, and reviewed way. The second has a lot of output and limited evidence that anyone was genuinely in charge of it. Inspectors will eventually be in a position to tell the difference. The firms building that evidence base now are the ones that will find it easier to answer when inspectors ask.
The firm remains accountable for the third-party tools it uses. Procurement of an AI tool does not transfer accountability for its fitness for audit purposes. The guidance is clear that firms must demonstrate confidence in the tools they use, which means asking vendors questions they're not currently answering. Training data provenance, documented failure modes, testing against audit-specific tasks: these are the firm's responsibility to obtain, not the vendor's to offer.
The guidance doesn't prescribe a checklist for what counts as meaningful review, but the distinction it draws is clear enough. A reviewer who independently forms a view and uses AI output as one input among several is exercising meaningful oversight. A reviewer who lets the AI output set the anchor and checks it briefly is not. The difference is whether professional judgement precedes the AI's conclusion or follows it. In most current workflows, it follows. The FRC is asking firms to examine that honestly.
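The "precedes or follows" distinction can be made checkable if a firm's workflow records when the reviewer documented their own view and when they first opened the AI output. The following is a minimal sketch under that assumption; the record fields and function name are hypothetical, and the guidance does not require timestamps of this kind.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ReviewRecord:
    """Hypothetical log entry for one reviewed AI output."""

    reviewer: str
    # When the reviewer documented their own view (None if they never did).
    own_view_recorded_at: Optional[datetime]
    # When the reviewer first opened the AI output.
    ai_output_opened_at: datetime


def judgement_preceded_ai(record: ReviewRecord) -> bool:
    """True only if an independent view was documented before the AI output was opened."""
    if record.own_view_recorded_at is None:
        return False
    return record.own_view_recorded_at < record.ai_output_opened_at


# Illustrative records with made-up timestamps.
independent = ReviewRecord(
    reviewer="A",
    own_view_recorded_at=datetime(2025, 1, 10, 9, 0),
    ai_output_opened_at=datetime(2025, 1, 10, 9, 30),
)
anchored = ReviewRecord(
    reviewer="B",
    own_view_recorded_at=None,
    ai_output_opened_at=datetime(2025, 1, 10, 9, 30),
)
print(judgement_preceded_ai(independent))  # True
print(judgement_preceded_ai(anchored))     # False
```

Ordering is only a proxy. A timestamp shows that a view was recorded first, not that the view was well reasoned, so a check like this supports evidence of meaningful review but cannot replace it.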
Firms that have not yet deployed these tools still need to act, in the sense that the time to build governance infrastructure is before deployment, not after. The FRC's framing throughout is about obtaining appropriate confidence before tools go into use. A firm that hasn't yet deployed generative or agentic AI is in a better position, but only if it uses that position to build the right foundations now rather than treat it as a reason to delay the conversation.
