6 min read
Amazon Bedrock AgentCore reached general availability at the end of April 2026. The CLI-based tooling brings CDK integration, A/B testing for agent versions, and a governance layer for bringing AI agents into production on AWS. For enterprise teams moving agentic AI workloads from the evaluation phase to production-ready deployments, this is the first substantial infrastructure layer on the AWS side.
Key Takeaways
- CDK integration for Infrastructure as Code: AgentCore agents are defined, versioned, and deployed using AWS CDK. This enables the same GitOps workflows that are established for Lambda, ECS, and other AWS services, treating agents as reproducible, auditable artifacts.
- A/B testing between agent versions: Traffic splitting between agent variants allows controlled rollouts. Instead of every deployment immediately affecting all users, traffic can be shifted gradually, with a rollback option if metrics deviate.
- Observability layer integrated: AgentCore writes execution traces, latency, and error metrics directly to CloudWatch. Audit logs relevant to compliance requirements are written automatically.
- Model-agnostic at the AWS level: AgentCore works with all Bedrock models – Anthropic Claude, Amazon Titan, Llama variants. Switching the underlying model does not require any changes to the agent architecture.
What AgentCore Specifically Solves
What is Amazon Bedrock AgentCore? AgentCore is a managed AWS service that standardizes the operation of AI agents in production environments. It combines a runtime infrastructure for agent execution, a CDK-based deployment pipeline, and an observability layer – comparable to what AWS App Runner does for containerized applications, but tailored to the particularities of agentic AI workloads.
The fundamental problem AgentCore addresses: AI agents are not classical microservices. They have non-deterministic execution paths, variable tool invocation contexts, and state requirements that clash with strictly stateless architectures. Standard deployment tooling for containerized applications works for agents only with significant custom code.
AgentCore consolidates the most common workarounds: session state management, retry logic for tool calls, structured logging of reasoning steps, and integration with AWS IAM for tool permissions. What teams have had to implement themselves in the past now comes as a managed layer.
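To make that scope concrete, here is a minimal sketch of the kind of hand-rolled retry logic for tool calls that teams have been building themselves. All names are illustrative; this is not AgentCore's actual implementation, only an example of the plumbing its managed layer replaces:

```typescript
// Minimal sketch of hand-rolled retry logic for agent tool calls --
// the kind of plumbing a managed layer like AgentCore takes over.
// All names here are illustrative, not part of any AgentCore API.
type ToolCall<T> = () => Promise<T>;

async function callToolWithRetry<T>(
  call: ToolCall<T>,
  maxAttempts = 3,
  baseDelayMs = 200
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await call();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        // Exponential backoff: 200 ms, 400 ms, 800 ms, ...
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
      }
    }
  }
  throw lastError;
}
```

Multiply this by session persistence, structured reasoning logs, and IAM wiring, and the build-vs-buy calculation behind a managed layer becomes clear.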
CDK Integration: Agents as Code Artifacts
The CDK integration is the most relevant part for enterprise teams. An AgentCore agent is defined as a CDK construct. This construct describes model selection, tool configuration, session policy, and permissions—all as code, all versioned, and all deployed in the same CI/CD pipelines as other infrastructure as code.
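The exact construct API belongs to the AWS documentation; the following is only a sketch of what such an agent definition might look like as code. The `AgentDefinition` interface and all property names are illustrative assumptions, not the published CDK construct:

```typescript
// Hypothetical shape of an AgentCore agent definition as code.
// Interface and property names are illustrative assumptions, not
// the published AgentCore CDK construct API.
interface AgentDefinition {
  agentName: string;
  modelId: string;            // any Bedrock model ID
  tools: { name: string; lambdaArn: string }[];
  sessionPolicy: { ttlSeconds: number; maxTurns: number };
  iamRoleArn: string;         // scopes tool permissions via IAM
}

const supportAgent: AgentDefinition = {
  agentName: "customer-support-agent",
  modelId: "anthropic.claude-3-5-sonnet-20240620-v1:0",
  tools: [
    {
      name: "lookup-order",
      lambdaArn: "arn:aws:lambda:eu-central-1:123456789012:function:lookup-order",
    },
  ],
  sessionPolicy: { ttlSeconds: 900, maxTurns: 20 },
  iamRoleArn: "arn:aws:iam::123456789012:role/agentcore-tools",
};
```

The point is less the individual fields than the fact that the whole definition is a reviewable, diffable artifact in version control.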
In practice, this means an agent deployment is a pull request with review, testing, and approval—not a manual console click. This is the crucial step from proof of concept to production-ready deployment for enterprise compliance requirements (change management, audit trail).
Deployment Workflow with AgentCore CDK
- Agent definition as a CDK construct (model, tools, session policy)
- Deployment via cdk deploy – identical workflow to Lambda or ECS
- A/B test configuration: traffic split between the old and new agent version
- Monitoring via CloudWatch dashboard (latency, error rate, tool call frequency)
- Rollback via CDK rollback or by setting the new version's traffic share to 0%
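The traffic split and rollback steps can be pictured as weighted routing between two agent versions. AgentCore manages this server-side; the following sketch, with assumed names, only illustrates the concept:

```typescript
// Minimal sketch of weighted traffic routing between two agent versions.
// A rollback is simply setting the candidate's weight to 0.
// Conceptual illustration only; AgentCore handles this server-side.
interface TrafficSplit {
  stable: number;    // percentage of traffic for the stable version
  candidate: number; // percentage for the candidate version
}

function routeRequest(split: TrafficSplit, roll: number): "stable" | "candidate" {
  // roll is a number in [0, 100), e.g. Math.random() * 100
  if (split.stable + split.candidate !== 100) {
    throw new Error("traffic weights must sum to 100");
  }
  return roll < split.stable ? "stable" : "candidate";
}

const rollout: TrafficSplit = { stable: 90, candidate: 10 };
const rollback: TrafficSplit = { stable: 100, candidate: 0 };
```

With `rollback` in place, every request routes to the stable version, which is exactly the "traffic split to 0%" escape hatch from the workflow above.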
A/B Testing for Agents: What It Means in Practice
A/B testing for AI agents is more complex than for traditional web applications. A new agent version might follow different reasoning paths, call tools more or less frequently, and produce different output structures. Whether that is better or worse cannot be determined from latency measurements alone.
AgentCore integrates an evaluation framework that tracks not only technical metrics but also output quality scores. AWS has incorporated LLM-as-Judge mechanisms, where an evaluator model automatically assesses whether the agent responses meet the defined quality criteria. For teams conducting A/B tests between model versions or prompt variants, this provides the missing feedback loop.
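A minimal sketch of such a feedback loop follows; the judge prompt format and score parsing are illustrative assumptions, and the actual evaluator model call is omitted:

```typescript
// Sketch of an LLM-as-Judge feedback loop: a judge model rates an agent
// response against defined quality criteria. Prompt format and score
// parsing are illustrative assumptions; the model call itself is stubbed.
function buildJudgePrompt(question: string, answer: string, criteria: string[]): string {
  return [
    "Rate the answer below on a scale of 1-5 against these criteria:",
    ...criteria.map((c) => `- ${c}`),
    `Question: ${question}`,
    `Answer: ${answer}`,
    'Reply with exactly one line: "SCORE: <1-5>"',
  ].join("\n");
}

function parseJudgeScore(judgeReply: string): number {
  const match = judgeReply.match(/SCORE:\s*([1-5])/);
  if (!match) throw new Error(`unparseable judge reply: ${judgeReply}`);
  return Number(match[1]);
}
```

In an A/B test, scores like these would be aggregated per variant alongside latency and error rate to decide whether the candidate version wins.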
The limitation: LLM-as-Judge is not a perfect evaluator. For domains with highly specific requirements (medical documentation, legal analysis), teams must define and calibrate their own evaluation criteria. AgentCore provides the mechanism, but the quality definition remains with the team.
AgentCore solves
- CDK-based agent deployment as code
- A/B testing with traffic splitting and evaluation
- Session-state management out-of-the-box
- CloudWatch integration for observability
- IAM-based tool permissions
Still manual
- Prompt engineering and agent design
- Domain-specific evaluation criteria
- Multi-cloud agent orchestration
- Cross-provider tool registries
- GDPR (DSGVO)-specific data processing compliance
For DACH enterprise teams, AgentCore is the next step after the evaluation phase. Teams that have been experimenting with Bedrock agents and now plan to run production workloads get, with AgentCore, the deployment infrastructure they would otherwise have to build themselves. This significantly raises the maturity level of agentic AI in AWS enterprise environments.
Sources: AWS documentation for Bedrock AgentCore GA (April 2026), AWS Blog re:Invent 2025 Announcements.
Frequently Asked Questions
Is AgentCore only for Bedrock models or can it also be used with external models?
AgentCore is primarily designed for Bedrock models. Custom models deployed through SageMaker can be integrated via a Bedrock Custom Model Layer. External models outside AWS, such as GPT-4o, Gemini, or self-hosted open-source models, cannot be integrated directly. For multi-provider agent orchestration, frameworks like LangChain or LlamaIndex are more flexible.
What is the cost of AgentCore compared to a self-built agent infrastructure?
AgentCore incurs an additional cost on top of the model costs for the managed runtime. Detailed pricing information is available in the AWS pricing documentation. The comparison with self-built infrastructure depends heavily on the development effort: Teams that develop their own session management, observability, and deployment pipelines typically invest several sprint cycles. AgentCore can quickly become cost-effective for teams without dedicated ML Ops capabilities.
How does AgentCore handle GDPR (DSGVO) requirements for EU data?
AgentCore runs in eu-central-1 (Frankfurt) and eu-west-1 (Ireland). When properly configured with region pinning, data does not leave the EU. Execution traces and session data are stored in CloudWatch within the same region. For industries with specific requirements (financial services, healthcare), a data protection review of specific agent use cases is recommended, as agent interaction data can be classified as personal data.
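A simple guard for such region pinning might look like the following; the helper is hypothetical, and the point is only that runtime, traces, and session data are all kept in one allowed EU region:

```typescript
// Illustrative region-pinning guard for EU data residency.
// The helper and its names are assumptions, not an AgentCore API;
// the idea is to fail deployment for any non-EU region.
const EU_REGIONS = ["eu-central-1", "eu-west-1"] as const;
type EuRegion = (typeof EU_REGIONS)[number];

function pinRegion(region: string): EuRegion {
  if (!(EU_REGIONS as readonly string[]).includes(region)) {
    throw new Error(`${region} is not an allowed EU region`);
  }
  return region as EuRegion;
}

const deploymentRegion = pinRegion("eu-central-1");
```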
Can AgentCore handle multi-agent architectures?
Yes. AgentCore supports supervisor-to-worker agent communication: an orchestrator agent can invoke sub-agents through defined interfaces. Each agent step is logged and traced separately, which considerably simplifies debugging multi-agent workflows. Be aware, though, that coordination complexity grows quickly with the number of agents involved.
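Conceptually, supervisor-to-worker dispatch with per-step logging can be sketched like this; class and method names are illustrative, not an AgentCore API:

```typescript
// Minimal sketch of supervisor-to-worker dispatch with per-step logging.
// Class, method, and agent names are illustrative, not an AgentCore API.
type WorkerAgent = (task: string) => string;

class Supervisor {
  private workers = new Map<string, WorkerAgent>();
  readonly trace: string[] = [];

  register(name: string, worker: WorkerAgent): void {
    this.workers.set(name, worker);
  }

  delegate(workerName: string, task: string): string {
    const worker = this.workers.get(workerName);
    if (!worker) throw new Error(`unknown worker: ${workerName}`);
    this.trace.push(`dispatch ${workerName}: ${task}`); // each step logged separately
    const result = worker(task);
    this.trace.push(`result ${workerName}: ${result}`);
    return result;
  }
}
```

The separately recorded trace entries are what make a failing multi-agent run attributable to one specific delegation step.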
Photo: Pexels
Source Title Image: Wikimedia Commons / Rene Schwietzke from Jena, Germany (CC BY 2.0)