Securely Launch and Scale AI Agents with Amazon Bedrock AgentCore Runtime


Source: https://aws.amazon.com/blogs/machine-learning/securely-launch-and-scale-your-agents-and-tools-on-amazon-bedrock-agentcore-runtime

TL;DR

  • AgentCore Runtime offers a secure, serverless hosting environment designed specifically for AI agents and tools, addressing common production deployment blockers.
  • It is framework- and model-agnostic, supporting existing code bases and multiple LLM providers (Bedrock managed models, Anthropic Claude, OpenAI API, Google Gemini) without architectural migrations.
  • Sessions run on dedicated microVMs, persisting for up to 8 hours to enable complex, stateful agent workflows while maintaining strong isolation between users.
  • Complete session isolation reduces cross-tenant data risk, with data cleared when a session terminates. AgentCore Memory provides durable state beyond a single session when needed.
  • Quick-start tooling and streaming support accelerate development, with SDK-based deployment and a starter toolkit for local environments.

Context and background

Organizations are increasingly excited about AI agents but frequently hit what the blog terms “proof of concept purgatory” when moving prototypes to production. Challenges include balancing standardization with diverse frameworks and models, growing security complexity from long-lived agent state, managing identity for agents acting on behalf of users, handling multi-type inputs and large payloads, and predicting compute and cost without overprovisioning. Traditional hosting systems are not optimized for the evolving, stateful, and securely isolated nature of agent workloads. AgentCore Runtime is described as a purpose-built hosting environment that abstracts away container orchestration, session management, scalability, and security isolation, letting developers focus on agent functionality rather than infrastructure. In this post, we explore how AgentCore Runtime enables a portable deployment pattern across frameworks and models, and how it supports streaming for chat applications and persistent sessions for multi-step workflows. The goal is production-grade reliability without forcing teams to rewrite code or migrate frameworks.

What’s new

  • Framework- and model-agnostic deployment: AgentCore Runtime can run code built on LangGraph, CrewAI, Strands, and other frameworks without architectural changes. You can mix and match LLMs from various providers—Amazon Bedrock, Anthropic Claude, OpenAI, Google Gemini—within a single deployment pattern.
  • Per-session microVMs with persistent state: Each session gets a dedicated microVM with isolated compute, memory, and filesystem resources. Sessions can persist for up to 8 hours, enabling multi-step reasoning and stateful interactions across invocations.
  • Distinct session lifecycle and security isolation: Sessions transition through Active, Idle, and Terminated states with defined rules (e.g., termination after 15 minutes of inactivity or 8 hours total). This isolation helps prevent cross-session data leakage and supports secure agent operations.
  • AgentCore Memory for durable state: For data and state that must survive beyond a single session, AgentCore Memory provides short-term and long-term memory abstractions for user histories, behavioral patterns, and insights across session boundaries.
  • Starter toolkit and streaming support: A Starter toolkit simplifies local development, and AgentCore Runtime streams chat-style responses out of the box for real-time agent interactions. Examples and SDK usage illustrate how minimal code changes can enable deployment.
  • Real-world cautionary note and solution: The post references a May 2025 case where cross-tenant data exposure could occur due to tenant isolation gaps; AgentCore Runtime’s microVM isolation is presented as a robust mitigation.
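The framework- and model-agnostic pattern described above can be illustrated with a short sketch. Note that the class and function names here are hypothetical, not the AgentCore SDK API: the point is the shape of the pattern, where a single entrypoint stays the same while the model provider behind it is swappable.

```python
# Illustrative sketch only. Each "provider" is a callable that turns a prompt
# into a reply; in a real agent these would wrap Amazon Bedrock, Anthropic,
# OpenAI, or Google Gemini clients. The names below are hypothetical.
from typing import Callable, Dict

PROVIDERS: Dict[str, Callable[[str], str]] = {
    "bedrock":   lambda prompt: f"[bedrock] {prompt}",
    "anthropic": lambda prompt: f"[anthropic] {prompt}",
    "openai":    lambda prompt: f"[openai] {prompt}",
    "gemini":    lambda prompt: f"[gemini] {prompt}",
}

def entrypoint(payload: dict) -> dict:
    """Single handler shape: the hosting layer passes a payload in,
    the agent picks a provider and returns a response."""
    provider = PROVIDERS[payload.get("provider", "bedrock")]
    return {"result": provider(payload["prompt"])}

print(entrypoint({"prompt": "hello", "provider": "openai"}))
```

Because the entrypoint signature never changes, swapping providers (or the agent framework behind the handler) is a payload or configuration change rather than an architectural migration, which is the portability claim the post makes.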

Why it matters (impact for developers/enterprises)

For teams building sophisticated agent-powered experiences, AgentCore Runtime delivers:

  • Predictable production readiness: By handling container orchestration, session management, and security isolation, teams reduce operational overhead and avoid overprovisioning.
  • Flexible tooling and future-proofing: The framework- and model-agnostic approach preserves existing investments while staying adaptable to a changing LLM landscape.
  • Strong data isolation and security: Isolated microVMs and explicit session lifecycles minimize risks of data leakage across users or sessions.
  • Stateful workflows without external hacks: The persistent session model enables complex reasoning and context accumulation that previously required external state management workarounds.

Technical details or Implementation

AgentCore Runtime introduces a persistent execution model for AI agents, distinct from traditional stateless serverless compute. Rather than functions that spin up and terminate immediately, AgentCore provisions dedicated microVMs that can persist through a user session for up to eight hours. This enables multi-turn, stateful workflows where each new invocation builds on accumulated context and results from prior steps within the same session. Ephemeral session state is kept only within the session boundary and is purged when the session terminates, while longer-term durable state can be stored in AgentCore Memory when needed. Sessions follow a lifecycle with three states:

  • Active: A session is processing a request or running background tasks.
  • Idle: The session is ready for immediate use but not currently processing, reducing cold-start penalties.
  • Terminated: A session ends after 15 minutes of inactivity, the 8-hour duration cap, or a health-check failure.

This lifecycle is paired with strict session isolation: each session runs in its own microVM with separate compute, memory, and file systems. This design helps prevent data leakage and cross-session contamination that can occur in multi-tenant environments. In parallel, persistent memory abstractions enable durable state across sessions when required by the application.

AgentCore Runtime supports multiple model providers and frameworks through the AgentCore SDK. Developers can modify their code with minimal adjustments and deploy with or without the AgentCore Starter toolkit. The starter toolkit, alongside local development practices (e.g., using uv to manage environments and dependencies), accelerates initial testing and deployment. For chat-style applications, streaming is supported out of the box, with examples illustrating how to adapt existing synchronous code to streaming responses. The post also points to GitHub-hosted samples demonstrating these integration patterns and code changes for two representative frameworks.

On data responsibilities, the service emphasizes that session data, such as conversation context and intermediate results, exists only within the active session and is purged upon termination. For durable memory needs, AgentCore Memory provides mechanisms to retain user histories, learned patterns, and key insights across sessions.
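The synchronous-to-streaming adaptation mentioned above can be sketched in plain Python. This is a hedged illustration, not the AgentCore SDK's actual streaming API: it shows the general shape of wrapping an existing synchronous agent call in a generator so a chat UI can consume chunks as they arrive.

```python
# Hedged sketch of adapting a synchronous agent response to chunked
# streaming. invoke_sync and invoke_streaming are hypothetical names.
from typing import Iterator

def invoke_sync(prompt: str) -> str:
    # Stand-in for an existing synchronous agent call.
    return f"Answer to: {prompt}"

def invoke_streaming(prompt: str, chunk_size: int = 8) -> Iterator[str]:
    """Yield the response in chunks, the way a chat client would consume it."""
    full = invoke_sync(prompt)
    for i in range(0, len(full), chunk_size):
        yield full[i:i + chunk_size]

chunks = list(invoke_streaming("What is AgentCore?"))
print("".join(chunks))
```

In a real agent the generator would yield tokens as the model produces them rather than slicing a finished string, but the caller-facing change is the same: the entrypoint returns an iterator instead of a single value.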

Table: Session lifecycle at a glance

| State | What it means | Visibility/Duration |
|---|---|---|
| Active | Processing requests or background tasks | Within the current session lifecycle |
| Idle | Provisioned but not processing | Ready for immediate use |
| Terminated | Inactivity or max duration reached | Data purged; new session required |

For memory and persistence features, see the AgentCore Memory documentation referenced by the blog post. The content above summarizes the capabilities and implementation approach highlighted by AWS.
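The lifecycle rules in the table can be expressed as a small state machine. This is a minimal sketch using only the limits stated in the post (15-minute idle timeout, 8-hour session cap); the function name and signature are illustrative, not part of any AWS API.

```python
# Minimal sketch of the published session lifecycle rules:
# Active / Idle / Terminated, 15-minute idle timeout, 8-hour hard cap.
IDLE_TIMEOUT_S = 15 * 60        # terminate after 15 minutes of inactivity
MAX_DURATION_S = 8 * 60 * 60    # hard cap: 8 hours per session

def session_state(now: float, started_at: float,
                  last_request_end: float, processing: bool) -> str:
    """Classify a session given wall-clock timestamps in seconds."""
    if now - started_at >= MAX_DURATION_S:
        return "Terminated"        # 8-hour cap reached, even if mid-request
    if processing:
        return "Active"            # handling a request or background task
    if now - last_request_end >= IDLE_TIMEOUT_S:
        return "Terminated"        # 15 minutes of inactivity
    return "Idle"                  # provisioned, ready for immediate use
```

Checking the duration cap before the processing flag reflects the post's framing that the 8-hour limit is absolute; whether the real service drains in-flight work first is not specified in the source.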

Key takeaways

  • AgentCore Runtime provides a secure, serverless hosting environment specifically designed for AI agents and tools, addressing common production hurdles.
  • The platform is framework- and model-agnostic, enabling teams to reuse existing code with minimal changes while mixing providers and models.
  • Persistent per-session microVMs enable sophisticated, stateful agent workflows with strong isolation to prevent data leakage.
  • Durable state can be managed with AgentCore Memory for cross-session continuity when needed.
  • Practical deployment is aided by a Starter toolkit, SDKs, and streaming support for chat-style applications.

FAQ

  • What is AgentCore Runtime?

    It is a secure, serverless hosting environment for AI agents that provides per-session microVMs, framework- and model-agnostic deployment, and session isolation to support production-grade workloads. [AWS blog](https://aws.amazon.com/blogs/machine-learning/securely-launch-and-scale-your-agents-and-tools-on-amazon-bedrock-agentcore-runtime).

  • How does session isolation work?

    Each session gets its own dedicated microVM with isolated compute, memory, and filesystem resources, ensuring agent state and credentials do not leak across sessions.

  • How long do sessions persist?

    Sessions can persist for up to 8 hours, with an inactivity timeout of 15 minutes before termination.

  • How do I start using AgentCore Runtime?

    Developers can start with the AgentCore Starter toolkit and the AgentCore SDK to modify and deploy code with minimal changes, and can deploy with or without the starter toolkit.

  • Where can I learn more about durable state mechanisms?

    The blog introduces AgentCore Memory as a solution for maintaining durable state across session boundaries, with accompanying documentation.
