Today, I’m excited to announce Strands Agents, an open source SDK designed to make building and running AI agents simple, with minimal code. Strands scales agent applications from basic scenarios to intricate implementations, covering both local development and production deployment. Several AWS teams are already using Strands in production, including Amazon Q Developer, AWS Glue, and VPC Reachability Analyzer. Now, I’m pleased to share Strands so that you can build your own AI agents with it, too.
Unlike traditional frameworks that require developers to hand-craft complicated workflows, Strands streamlines agent creation by relying on the model itself for planning, chaining thoughts, calling tools, and reflecting. With Strands, you simply define a prompt and a set of tools in your code to build an agent, then test it locally and deploy it to the cloud. Just as the two strands of DNA intertwine, Strands connects the two core pieces of an agent: the model and the tools. Using the model’s reasoning capabilities, Strands plans the agent’s next steps and executes its tools. For more complex use cases, developers can customize their agents’ behavior in Strands, including tool selection, context management, session state storage, and multi-agent setups. Strands can run anywhere and supports any model with reasoning and tool-use capabilities, including models from Amazon Bedrock, Anthropic, Ollama, Meta, and other providers via LiteLLM.
The Strands Agents community is expanding, with several companies contributing support and enhancements, including Accenture, Anthropic, Langfuse, mem0.ai, Meta, PwC, Ragas.io, and Tavily. For example, Anthropic has provided support for their models through the Anthropic API, while Meta has contributed compatibility for Llama models through the Llama API. To get started with Strands Agents, be sure to check out our GitHub page.
Our Journey in Developing Agents
I primarily work on Amazon Q Developer, an AI-driven assistant for software development. My team and I began building AI agents in early 2023, around the release of the original ReAct (Reasoning and Acting) paper. That groundbreaking paper demonstrated the ability of large language models (LLMs) to reason, plan, and act within their environments. For instance, an LLM could deduce that an API call was needed to fulfill a task and then generate the necessary inputs for that call. We quickly realized that LLMs could serve as agents to accomplish a range of tasks, from complex software development to operational troubleshooting.
During that period, LLMs were not typically trained to function as agents; their primary training revolved around natural language processing. Effectively utilizing an LLM for reasoning and action involved intricate prompt designs, response parsers, and orchestration logic. Achieving syntactically correct JSON outputs was already a significant challenge! To prototype and deploy agents, my team relied on various complex agent framework libraries that provided necessary scaffolding and orchestration for the agents to succeed. Despite these frameworks, we often spent months fine-tuning and adjusting our agents before they were production-ready.
Since then, we’ve witnessed a remarkable evolution in the reasoning and tool-utilization abilities of large language models. We discovered that the complexity of orchestration was no longer needed, as modern models possess native capabilities for reasoning and tool use. Some of the frameworks we previously used began to hinder our ability to fully exploit the advancements in LLMs. Improvements in LLMs didn’t translate to faster development cycles with those frameworks; it still took us months to prepare an agent for production.
In response, we developed Strands Agents to eliminate this complexity for our Q Developer teams. By leveraging the inherent capabilities of cutting-edge models, we significantly reduced our time to market and enhanced the user experience. What used to take months to transition from prototype to production can now be achieved in just days or weeks using Strands.
Core Concepts of Strands Agents
An agent can be defined simply as the combination of three key elements: 1) a model, 2) tools, and 3) a prompt. These components work together to accomplish tasks autonomously, whether that’s answering queries, generating code, planning vacations, or optimizing financial portfolios. In a model-driven approach, the agent uses the model to dynamically plan its own steps and invoke tools to achieve its goals.
To create an agent using the Strands Agents SDK, you define these three components in your code (a minimal example follows the list):
- Model: Strands supports a variety of models. You can leverage any model in Amazon Bedrock that allows for tool use and streaming, models from Anthropic’s Claude family via the Anthropic API, Llama models through the Llama API, and many others like OpenAI through LiteLLM. You can also create your own custom model provider within Strands.
- Tools: You can select from thousands of published Model Context Protocol (MCP) servers to serve as tools for your agent. Strands includes over 20 pre-built example tools, such as those for file manipulation, API requests, and AWS API interactions. Additionally, any Python function can be easily utilized as a tool using the Strands @tool decorator.
- Prompt: You provide a natural language prompt that outlines the task for your agent, such as responding to a user query. You may also include a system prompt that gives general instructions and expected behaviors for the agent.
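Putting the three components together, here is a minimal sketch of an agent built with the Strands Agents SDK. It assumes the `strands-agents` and `strands-agents-tools` packages are installed; the `word_count` function is a hypothetical custom tool added for illustration:

```python
from strands import Agent, tool
from strands_tools import http_request  # one of the pre-built example tools

# Any Python function can be turned into a tool with the @tool decorator.
# word_count is a hypothetical example, not part of the SDK.
@tool
def word_count(text: str) -> int:
    """Count the number of words in a piece of text."""
    return len(text.split())

# The system prompt sets general behavior; the task prompt comes at call time.
agent = Agent(
    system_prompt="You are a concise assistant for text analysis.",
    tools=[http_request, word_count],
)

# Run the agent: it loops with the model and tools until the task is done.
agent("Fetch https://example.com and count the words in its first paragraph.")
```

If no model is specified, the agent uses the SDK’s default model; to target a specific provider, you can instead pass a model object (for example, a Bedrock, Anthropic, or LiteLLM model) to the Agent constructor.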
The agent engages in a continuous loop with its model and tools until it accomplishes the task specified by the prompt. This agentic loop is the essence of Strands’ functionality. During each iteration, Strands invokes the LLM with the prompt and context, along with descriptions of the tools available to the agent. The LLM can respond in natural language for the end user, outline a series of steps, reflect on previous actions, and/or choose one or more tools to utilize. When a tool is selected, Strands executes it and returns the result to the LLM. Once the LLM completes its task, Strands delivers the final outcome of the agent.
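Conceptually, the loop looks something like the sketch below. This is an illustration of the pattern only, not the SDK’s internals; names like `invoke`, `tool_calls`, and `execute` are hypothetical stand-ins:

```python
# Illustrative only: a simplified agentic loop, not actual Strands code.
def agentic_loop(model, tools, messages):
    while True:
        # Send the conversation so far plus tool descriptions to the model.
        response = model.invoke(messages, tool_specs=[t.spec for t in tools.values()])
        messages.append(response.message)

        # If the model requested no tools, it has produced its final answer.
        if not response.tool_calls:
            return response.text

        # Otherwise, run each requested tool and feed the result back.
        for call in response.tool_calls:
            result = tools[call.name].execute(call.arguments)
            messages.append({"role": "tool", "name": call.name, "content": result})
```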
In Strands’ model-driven framework, tools play a crucial role in customizing agent behavior. For instance, tools can retrieve pertinent documents from a knowledge base, make API calls, execute Python logic, or simply return a static string with additional model instructions. Tools also enable complex use cases in a model-driven approach, as demonstrated by Strands Agents’ pre-built tools.
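As a concrete sketch, the two tools below show opposite ends of that spectrum: one simply returns a static string of extra model instructions, and one retrieves documents. Both are hypothetical examples; the in-memory `DOCS` list stands in for a real knowledge base or vector store:

```python
from strands import tool

# A tool that just returns a static string of additional model instructions.
@tool
def style_guide() -> str:
    """Return the style rules the agent should follow when answering."""
    return "Answer formally, cite your sources, and keep replies under 200 words."

# A hypothetical stand-in for a real knowledge base.
DOCS = [
    "Strands agents combine a model, tools, and a prompt.",
    "Tools can run Python logic or call external APIs.",
]

@tool
def retrieve_docs(query: str) -> list[str]:
    """Return documents that mention any word from the query."""
    words = query.lower().split()
    return [doc for doc in DOCS if any(w in doc.lower() for w in words)]
```

Passing these to `Agent(tools=[style_guide, retrieve_docs])` is all it takes; the model then decides when to call each one.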