29 May 20266 min readBy Olga Magomedova

AI stress tests and the trader at the desk

The Bank of England's Financial Policy Committee has asked supervisors to examine agentic AI in payments and markets. Olga Magomedova on why the same stress-testing logic that regulators now apply to the system belongs on every trader's desk.

A regulator starts naming the failure modes

In April, the Bank of England's Financial Policy Committee published its quarterly record. The headline was measured. Financial system participants have not yet adopted more advanced forms of AI, such as generative or agentic AI, in a manner that would present systemic risk, but risks are likely to increase, potentially rapidly, amid growing intent among financial firms to expand their deployment of advanced AI.

The Committee then did something more interesting than warn. It supported the Bank's and the Financial Conduct Authority's initiatives to continue monitoring the adoption of AI by regulated firms and asked them to undertake further work on agentic AI, focused on use cases in payments and financial markets. Behind the careful language sits an engineering instinct. Name the failure modes before they occur. Build the test before the load arrives.

What the Treasury Committee actually asked for

The earlier Treasury Select Committee report was sharper. It found that the Financial Conduct Authority, the Bank of England and HM Treasury are not doing enough to manage the risks presented by AI, and that by taking a wait-and-see approach to AI in financial services, the three authorities are exposing consumers and the financial system to potentially serious harm.

Its central recommendation reads like a directive borrowed from any safety-critical industry. To build firms' readiness for AI-driven market shocks, the Bank of England and the Financial Conduct Authority must conduct AI-specific stress testing. The same Committee also asked that the major AI and cloud providers be designated as critical third parties by the end of the year.

For Magomedova, this is the right vocabulary at last. Stress testing is not a slogan. It is the act of asking what a system does at the edge of its design envelope, and writing the answer down before the edge arrives.

The shape of the systemic risk

The risk regulators are circling has a specific shape. When many participants rely on similar models, trained on similar data, served by a small number of providers, the diversity that ordinarily keeps a market liquid begins to thin. The Bank's framework names this directly. Its existing risk framework highlights four main channels: core firm decisions, financial markets, AI service providers, and cyber risk.

The mechanism is familiar to any engineer who has worked on common-mode failures. Independent components are not independent if they share a single supplier, a single library, or a single training set. They fail together. The market equivalent is a sell order that arrives from a thousand desks in the same minute because a thousand models reached the same conclusion at the same time.

The same logic, scaled to one desk

The regulatory frame is the system. The trader's frame is the account. The logic is the same in both directions.

A trader who has integrated AI tools into a discretionary practice has, in miniature, the same architecture the regulators are now examining. There is a model whose internals are not fully visible. There is a vendor whose service may degrade. There is the possibility that the trader's positions correlate with the positions of every other user of the same tool. The trader who has not stress tested for these conditions is not running the practice. The vendor is.

An honest stress test at the level of a single account is not complicated. What happens if the AI signal disappears for a week. What happens if its confidence number is systematically miscalibrated. What happens if the platform reroutes through a different model overnight without telling its users. What does the trader's plan instruct in each case.

The CTA evidence and what it does not prove

There is a temptation to read the recent performance of machine-learning hedge funds as evidence that the question has been settled. Machine-learning hedge funds surged on the recent jump in precious metals prices, before sidestepping last week's sell-off, and the sector notched up one of its best months since 2000 in January. Those are real numbers from a real strategy.

They are not a verdict on AI in trading. They are a verdict on a particular set of medium-term trend-following programmes operating across a diversified set of markets. The strategy survives because it is diversified, sized, and reviewed. The model is one input. The discipline is the artefact that holds.

Returning to the principle

Magomedova's working position on AI in markets has not moved. The tool has become more capable. The responsibility has not transferred. A regulator asking for stress tests at the system level is, in effect, asking firms to behave the way a serious individual trader already behaves. Define the worst plausible week. Write down what the system does inside it. Test the answer in a quiet period rather than discovering it in a loud one.

AI is a powerful tool. But tools do not replace responsibility.

The line worth keeping is the one already in the plan. Every position must be one the trader can defend in their own words, with or without the model that suggested it. If the regulators succeed in writing AI-specific stress tests for the financial system, the work for the individual trader will not be reduced. It will simply be visible. The trader who built that work years before the regulator asked for it is the one who will not have to retrofit it in a hurry.