AI agents: science fiction no longer, but primetime still far away
Sources: https://www.theverge.com/the-stepback-newsletter/767376/ai-agents-jarvis-what-can-they-do, The Verge AI
TL;DR
- AI agents have progressed far beyond early chatbots but remain imperfect for everyday, consumer use.
- In 2024 the hype and deployment cycle accelerated, with Klarna claiming its AI assistant could replace 700 full-time agents and automate two-thirds of chats after one month.
- A core real-world use case remains AI coding, with up to 30% of code now written by AI agents at major companies; enterprise revenue for AI coding tools remains a key driver for players like OpenAI and Anthropic. The Verge AI
- In late 2024 and 2025, concrete tools began to ship: Anthropic’s Computer Use, then OpenAI’s Operator and Deep Research. OpenAI later combined Operator and Deep Research into ChatGPT Agent, signaling progress but also ongoing reliability challenges.
- Expect more incremental features, higher compute use, and broader enterprise/government experimentation, alongside continued hype and mergers and acquisitions.
Context and background
The concept of an AI agent sits a step beyond chatbots: an automated system that can perform multistep, complex tasks on your behalf with limited back-and-forth. In the public imagination the idea traces back to J.A.R.V.I.S. from the Iron Man universe, but practical discussions typically anchor on the ability to create a to-do list of subtasks and work through it toward a user’s end goal. The term gained serious traction in 2023 as executives debated how to realize an “agent” that can autonomously act across data and tools, not just respond to queries. By 2024, deployment began in earnest, and early results were often noisy and error-prone. The Verge AI
Historically, AI coding stood out as the most concrete, near-term use case for agentic AI. Engineers routinely used AI agents to assist with coding, and those tools became a meaningful source of enterprise revenue. Reports highlighted that up to 30% of code in environments like Microsoft and Google projects was being written by AI agents. This focus on coding helped sustain investment even as consumer-facing capabilities lagged behind the lofty, broad-application promises.
In 2024–2025, the field shifted from hype to tangible features. Anthropic introduced a “Computer Use” capability that let Claude browse, search, access platforms, and complete tasks on a user’s behalf. OpenAI followed with Operator, marketed for filling forms, ordering groceries, booking travel, and even creating memes, though users found it buggy or slow at times. Later, OpenAI released Deep Research to compile long research reports, which some found impressive in length but mixed in substance. In July 2025, OpenAI merged Deep Research and Operator into ChatGPT Agent, delivering a more integrated offering that still faced practical reliability hurdles. The path from concept to consumer-ready tools remains uneven, but progress is real. The Verge AI
What’s new
The last year has been marked by a sequence of product introductions and refinements aimed at moving from the laboratory toward real-world usability:
- Anthropic’s Computer Use enabled Claude to operate a computer in a human-like way, with browsing and multi-platform actions.
- OpenAI’s Operator promised end-to-end automation for routine tasks like form-filling and travel bookings, but early use exposed performance gaps.
- OpenAI’s Deep Research aimed to generate long-form research outputs and reports.
- OpenAI then combined Operator and Deep Research into ChatGPT Agent, offering a more integrated agent experience than either earlier tool.
- Industry momentum has led to heightened investment: Google has hired teams from Windsurf to advance AI agent projects, and Anthropic/OpenAI have continued to push enterprise and government-focused platforms. A Chrome extension from Anthropic expanded Claude’s browser reach. The Verge AI
A concise timeline (highlights)
| Year | Milestone | Notes |
|---|---|---|
| 2023 | AI agents enter the conversation | The idea gains traction but practical use remains limited. |
| 2024 | Deployment accelerates | Klarna reports dramatic reductions in human labor and chat load; hype persists. |
| 2025 | Feature rollouts | Anthropic’s Computer Use, OpenAI’s Operator, Deep Research mature into ChatGPT Agent, with enterprise and government interest growing. |
Why it matters (impact for developers/enterprises)
For developers, the central takeaway is that AI coding remains the most tangible, near-term application of agentic AI. It has proven to be a practical, if imperfect, workflow enhancer, and it underpins a significant portion of enterprise tool revenue. For enterprises and government users, the promise lies in automating routine tasks and data-heavy research workflows, while grappling with reliability, speed, and accuracy concerns. The industry response has included aggressive hiring, investments in compute and R&D, and a wave of product iterations designed to inch closer to a consumer-friendly “everyman” AI agent. The history also underscores a recurring pattern: many early efforts show promise but require careful evaluation of true usefulness in day-to-day life and business operations. The Verge AI
Technical details or Implementation
- AI coding as the leading real-world use demonstrates that agents can contribute to software development, with substantial portions of code being generated by AI agents in large tech organizations.
- Early consumer-facing tools emphasized autonomous capability across tasks: browsing, form filling, travel booking, and content generation, but users reported bugs, slowdowns, and inefficiencies.
- The 2025 wave moved toward integration: combining capabilities into a single agent product (ChatGPT Agent) that can perform a broader set of tasks with fewer handoffs, while still contending with reliability and user experience challenges.
- Enterprise and government-oriented AI platforms have emerged alongside consumer tools, reflecting a broader adoption curve and a focus on governance, compliance, and scale. The Verge AI
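The "to-do list of subtasks" idea described above can be illustrated with a minimal sketch. Everything here is hypothetical and for illustration only: the `Agent` class, the `plan` method, and the stub tools `search_flights` and `fill_form` are invented names, not the API of any product mentioned in this article. A real agent would ask a model to produce the plan and would drive a real browser or service; this sketch hard-codes both.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical tools: stand-ins for real actions such as browsing or form filling.
def search_flights(goal: str) -> str:
    return f"found 3 flight options for: {goal}"

def fill_form(goal: str) -> str:
    return f"filled booking form for: {goal}"

TOOLS: dict[str, Callable[[str], str]] = {
    "search": search_flights,
    "fill_form": fill_form,
}

@dataclass
class Agent:
    tools: dict[str, Callable[[str], str]]
    log: list[str] = field(default_factory=list)

    def plan(self, goal: str) -> list[str]:
        # Assumption: a real agent would have a model generate this list
        # from the goal; here it is fixed so the example stays runnable.
        return ["search", "fill_form", "confirm"]

    def run(self, goal: str) -> list[str]:
        # Work through the to-do list one subtask at a time.
        for step in self.plan(goal):
            tool = self.tools.get(step)
            if tool is None:
                # No tool for this step: defer to the user instead of guessing.
                self.log.append(f"{step}: needs human input")
                continue
            self.log.append(f"{step}: {tool(goal)}")
        return self.log

if __name__ == "__main__":
    for line in Agent(TOOLS).run("trip to Lisbon"):
        print(line)
```

The sketch defers to the user when a step has no matching tool, which mirrors the "limited back-and-forth" framing: the agent handles what it can and hands back only what it cannot.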
Key takeaways
- The idea of AI agents has shifted from pure hype toward measurable, if incremental, capabilities that are usable in certain workflows (notably coding and structured automation).
- The most concrete progress sits in coding assistance, with a sizable share of software development work now aided by AI agents.
- Early consumer tools delivered meaningful but imperfect experiences; performance gaps remain a core constraint.
- Investment momentum continues, including hiring, tooling, and new platforms aimed at enterprise and government, signaling a broader market push even as the consumer experience evolves.
- The central question remains: what should a practical AI agent be able to do for users, and how well should it handle both logistics and personal, human tasks? The current answer leans toward logistics and recurring tasks rather than deeply personal, nuanced support.
FAQ
- What is an AI agent, in simple terms?
  An AI agent is a system that can perform multistep, complex tasks on your behalf with limited back-and-forth, essentially creating its own to-do list to reach your goal.
- What were the notable 2024 milestones for AI agents?
  2024 saw a shift from hype to deployment, with early claims such as Klarna’s that its AI assistant could replace 700 full-time agents and automate two-thirds of chats after one month, followed by a wave of product announcements and incremental features from major players.
- What happened in 2025 in the AI agent space?
  Concrete tools reached users: Anthropic’s Computer Use and OpenAI’s Operator and Deep Research. OpenAI later integrated Operator and Deep Research into ChatGPT Agent, a more cohesive but still imperfect tool.
- What is the current limitation of AI agents?
  While progress is real, many tools remain buggy, slow, or inefficient for everyday use, and there is ongoing debate about how capable they should be in handling personal, human tasks.
- Why is this evolution important for enterprises?
  Enterprises and governments are adopting agent-like tools to automate forms, data handling, and research workflows, driving demand for more robust, scalable, and governed AI agent solutions.