Prof. Fotis Liarokapis: AI Agent Safety Transparency Gap

26 February 2026

AI Agent Safety Transparency Gap

A new study from the University of Cambridge’s AI Agent Index project analysed 30 leading AI agents, including chat, browser, and workflow bots, and found that safety transparency is severely lacking. Only four agents had published formal “system cards” describing safety evaluations, while most developers publicly highlight capabilities but provide little evidence about risk testing or mitigation. In fact, 25 of the 30 agents disclosed no internal safety results and 23 showed no evidence of independent third-party testing, creating what researchers call a significant transparency gap as AI agents become integrated into everyday activities such as booking travel or managing finances.

The researchers warn that this lack of disclosure could hinder regulators, users, and scientists from understanding real-world risks, especially as agents grow more autonomous and capable of acting online. Security incidents and vulnerabilities (such as prompt-injection attacks) have rarely been reported publicly, and safety documentation is especially scarce among some regional developers. The study concludes that clearer standards, stronger reporting requirements, and independent testing are urgently needed so that society can properly evaluate and govern increasingly powerful AI agents before their widespread deployment.

More information:

https://www.cam.ac.uk/stories/ai-agent-index-safety

Pages

26 February 2026

AI Agent Safety Transparency Gap