Source: theverge.com

AI agents: science fiction meets reality, but not primetime yet

Sources: https://www.theverge.com/the-stepback-newsletter/767376/ai-agents-jarvis-what-can-they-do, The Verge AI

TL;DR

  • AI agents moved from a sci-fi concept to real products, but practical use remains uneven.
  • The 2024 deployment phase showed meaningful progress yet frequent errors and inefficiencies.
  • Klarna’s 2024 milestone highlighted the potential scale of AI agents in customer service.
  • By mid-2025, incremental feature progress continues, with ongoing bets from leading firms and new government-facing platforms.
  • The question remains: what should AI agents actually do for everyday users, beyond logistics?

Context and background

The idea of an AI agent sits a step beyond chatbots: a system that can perform multistep, complex tasks on your behalf without constant back-and-forth. It can set up a plan, create a to-do list of subtasks, and work toward a user’s end goal with minimal prompting. The J.A.R.V.I.S.-style vision from popular fiction has long shaped expectations, even as real deployments lag behind hype. The term gained particular traction in 2023, then began to appear in earnest in 2024 as companies started to move from concept to code and, increasingly, into production. The Verge’s analysis tracks that arc, noting that 2024 was a year of deployment rather than flawless performance. For more context on how this space has evolved, see the ongoing coverage in The Stepback. The Verge AI

In the real world, AI coding stood out as the most concrete, widely adopted use case for agentic AI for some time. At large tech firms like Microsoft and Google, AI agents were already responsible for writing a meaningful share of code, up to about 30 percent in some contexts. Startups and scaleups also leaned on AI coding tools to generate revenue from enterprise clients. The broader consumer-facing promise of agents that can book travel, generate visuals, manage calendars, and handle tasks end-to-end remains aspirational. Reviews consistently highlighted the gap between the ideal and the practical reality, even as the field pressed forward. The Verge AI

What’s new

Recent milestones mark a progression from prototype to more capable tools, even as many users report imperfect experiences. A quick milestone table summarizes the key steps mentioned in contemporary coverage:

| Milestone | Date | Impact |
|---|---|---|
| Klarna’s AI assistant proves capable of handling substantial workload | Feb 2024 | After one month, AI assistant completed the work of 700 full-time customer service agents and automated two-thirds of customer service chats. |
| Anthropic introduces “Computer Use” for Claude | Oct 2024 | Claude can browse, search, access platforms, and complete complex tasks on a user’s behalf; initial reviews note notable progress but room for improvement. |
| OpenAI releases Operator | Jan 2025 | Tool for filling forms, ordering groceries, booking travel, creating memes; user reports describe buggy and slow performance but a meaningful step forward. |
| OpenAI launches Deep Research | Feb 2025 | Agent capable of compiling long research reports; reception mixed but some praised length and depth. |
| ChatGPT Agent combines Deep Research and Operator | Jul 2025 | A more integrated agent product; still not flawless, but marks a significant evolution over earlier offerings. |

These entries illustrate a trajectory from hype to deployed capabilities, with ongoing iterations refining both coding and consumer-facing experiences. The Verge AI

Why it matters (impact for developers/enterprises)

The mid-2020s saw a broad push by major tech players to embed agentic capabilities into products and business workflows. The industry’s bet is that AI agents can scale human productivity by autonomously managing tasks across tools and platforms. The fact that large companies publicly discussed their investments and roadmaps signals a shift from rumor to a more concrete product strategy. For enterprises, AI coding remained a primary revenue driver for many firms’ agent-focused offerings, underscoring an early but durable alignment between AI agents and software development workflows. Governments and enterprise customers are also exploring dedicated AI agent platforms to address public-sector needs. The Verge AI

Technical details or Implementation (what’s actually being built)

  • The core concept is that a true AI agent can act with a degree of autonomy, assembling subtasks to reach a goal, rather than relying solely on back-and-forth prompts. This represents a shift from chat-based interactions to task execution on the user’s behalf. The industry’s experimentation has focused on both consumer-facing agents and enterprise-grade solutions.
  • The “coding” use case has been a reliable, near-term anchor: AI agents can assist or automate coding tasks, contributing a sizable portion of code in some environments. This has helped fuel growth in enterprise tooling and services around AI-assisted development.
  • In terms of product evolution, early tools emphasized basic automation and task execution, while later offerings attempted to handle more complex activities like form filling, travel booking, and research assembly. OpenAI’s sequence—Operator for forms and bookings, followed by Deep Research for long-form reports, then a combined ChatGPT Agent—reflects a trend toward more integrated capabilities, even as users report performance gaps. The Verge AI
  • Industry moves include strategic hiring and partnerships to push AI agents forward. Google hired Windsurf’s CEO and R&D team members to push its agent projects, highlighting the competitive arms race in this space. Anthropic added browser-enabled capabilities (a Chrome extension for Claude) to broaden where Claude can operate. The Verge AI
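
The plan-then-execute pattern described in the first bullet can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any product named here is built: the `Subtask` type, the `plan` function, the `TOOLS` table, and `run_agent` are all hypothetical names invented for this example, and the planner is a hard-coded stub where a real agent would presumably ask a model to produce the subtask list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Subtask:
    tool: str          # name of the (hypothetical) tool to call
    argument: str      # input passed to that tool
    result: str = ""   # filled in after execution
    done: bool = False


def plan(goal: str) -> List[Subtask]:
    """Stand-in planner (assumption): returns a fixed two-step to-do list."""
    return [
        Subtask(tool="search", argument=goal),
        Subtask(tool="summarize", argument="search results"),
    ]


# Hypothetical tools; real ones might drive a browser or fill in a form.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"results for '{query}'",
    "summarize": lambda text: f"summary of {text}",
}


def run_agent(goal: str) -> List[Subtask]:
    """Plan once, then work through the to-do list without further prompting."""
    todo = plan(goal)
    for task in todo:
        tool = TOOLS.get(task.tool)
        if tool is None:
            # Leave done=False so the caller can see which step failed.
            task.result = f"no tool named {task.tool}"
            continue
        task.result = tool(task.argument)
        task.done = True
    return todo


if __name__ == "__main__":
    for task in run_agent("flights to Lisbon in May"):
        print(task.done, task.tool, "->", task.result)
```

The point of the sketch is the contrast with a chat loop: the user supplies one goal, and the loop carries out each subtask on its own instead of waiting for a new prompt at every step. The per-task `done` flag shows one simple way to surface the partial failures that reviewers of current agents keep reporting.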

Key takeaways

  • AI agents are transitioning from speculative concept to deployment reality, but practical usability remains uneven.
  • The near-term success stories are strongest in AI-assisted coding and enterprise workflows; consumer-facing experiences are improving more gradually.
  • The field is characterized by rapid iteration and ongoing investments, with major players pursuing broader capabilities and government-focused platforms.
  • A core question for builders and buyers remains: should AI agents handle logistics only, or also the more personal, nuanced tasks of everyday life? The current answer leans toward the former, with broader personal applications still developing. The Verge AI

FAQ

  • What exactly is an AI agent, as discussed here?

    It’s a system that can perform multistep, complex tasks on your behalf, forming its own to-do list of subtasks to reach a goal, beyond simple chat prompts.

  • What concrete progress did Klarna’s example illustrate in early 2024?

    The company said its AI assistant, powered by OpenAI’s tech, had completed the work of 700 full-time agents and automated two-thirds of customer service chats after one month.

  • Are AI agents ready for widespread consumer use?

    The consensus is that progress is real but imperfect; tools are described as buggy, slow, or not always efficient in practice, even as iterations continue.

  • What future directions are industry players pursuing?

    Expect continued improvements in AI coding, consumer-facing agent capabilities, and government-focused AI platforms, along with ongoing mergers, acquisitions, and feature rollouts. [The Verge AI](https://www.theverge.com/the-stepback-newsletter/767376/ai-agents-jarvis-what-can-they-do)
