Daily AI Agent News - January 23, 2026

AI Agent News Today

Friday, January 23, 2026

AI Agents Still Struggle With Real Work Tasks

A major new benchmark called Apex-Agents tested leading AI models on actual white-collar jobs in banking, consulting, and law. The results are sobering: the best performer, Google's Gemini 3 Flash, achieved only a 24% success rate. The core problem? AI agents can't handle information scattered across multiple tools like Slack and Google Drive the way humans do. This means workplace automation is moving slower than predicted.

Enterprise Leaders Prioritize Safety Over Speed

A Dynatrace report surveying 919 senior leaders helps explain why adoption is slow: 52% cite security and compliance concerns as the main barrier to deploying AI agents. Rather than rushing to automate, 69% of organizations still have humans verify AI decisions. The takeaway: reliability and governance matter more than raw capability right now.

New Testing Tool Makes AI More Trustworthy

Researchers released Detect, a framework that systematically tests deep learning models by manipulating features in their latent space. Unlike standard accuracy tests, it reveals hidden bugs and vulnerabilities, helping teams understand exactly how AI systems make decisions. This tool is crucial as enterprises scale agents responsibly.
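
To make the idea of latent-space testing concrete, here is a minimal sketch of the general technique, not Detect's actual implementation. It nudges one latent feature at a time and records which shifts change the model's prediction. The function latent_sensitivity and the toy decoder and classifier are hypothetical stand-ins invented for illustration.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def latent_sensitivity(
    decode: Callable[[Vector], Vector],
    predict: Callable[[Vector], int],
    z: Sequence[float],
    deltas: Sequence[float],
) -> Dict[int, List[float]]:
    """For each latent dimension, list the shifts that change the prediction."""
    baseline = predict(decode(list(z)))
    flips: Dict[int, List[float]] = {}
    for dim in range(len(z)):
        for delta in deltas:
            shifted = list(z)
            shifted[dim] += delta  # manipulate a single latent feature
            if predict(decode(shifted)) != baseline:
                flips.setdefault(dim, []).append(delta)
    return flips


# Hypothetical toy model: a 2-D latent vector decoded into two input features.
def toy_decode(z: Vector) -> Vector:
    return [z[0] + 0.5 * z[1], z[1]]


def toy_predict(x: Vector) -> int:
    # This toy classifier looks only at the first decoded feature.
    return 1 if x[0] > 1.0 else 0


report = latent_sensitivity(
    toy_decode, toy_predict, z=[0.5, 0.2], deltas=[-1.0, 0.25, 1.0]
)
print(report)  # {0: [1.0], 1: [1.0]}
```

In this toy case a shift of 1.0 in either latent dimension flips the prediction, while the smaller and negative shifts do not. A real framework would work with a trained model's learned latent space rather than hand-written functions.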

Bottom Line: Don't assume AI agents are workplace-ready. Focus first on governance, testing, and human oversight before deploying at scale.
