Introduction
This article introduces the ReAct pattern for improved capabilities and demonstrates how to create AI agents from scratch. It covers testing, debugging, and optimizing AI agents in addition to tools, libraries, environment setup, and implementation. This tutorial gives users the skills they need to create effective AI agents, regardless of whether they are developers or enthusiasts.
Learning Objectives
- Grasp the fundamental concepts of AI agents and their significance in various applications.
- Learn how to implement the Reason + Act (ReAct) pattern in AI agents to enhance their capabilities.
- Set up the necessary tools and libraries required to build AI agents from scratch.
- Develop an AI agent using Python, integrate various actions, and implement a reasoning loop.
- Effectively test and debug the AI agent to ensure it functions as expected.
- Improve the robustness and security of the AI agent and add more capabilities.
- Identify practical applications of AI agents and understand their future prospects.
This article was published as a part of the Data Science Blogathon.
Understanding AI Agents
AI agents are autonomous systems that observe their environment through sensors or other inputs, process that information, and act to accomplish predefined goals. They range from basic bots to sophisticated systems that adapt and learn over time. Typical examples include recommendation engines such as those used by Netflix and Amazon, virtual assistants such as Siri and Alexa, and self-driving cars from Tesla and Waymo.
These agents are also essential in a number of sectors. UiPath and Blue Prism are robotic process automation (RPA) platforms that automate repetitive processes. DeepMind and IBM Watson Health have built healthcare diagnostics systems that help diagnose diseases and recommend treatments. In their domains, AI agents can greatly improve productivity, precision, and customization.
Why Are AI Agents Important?
These agents play a critical role in improving our daily lives and accomplishing particular objectives.
AI agents are significant because they can:
- Lower the amount of human labor required for routine operations, increasing productivity and efficiency.
- Analyze enormous volumes of data to offer conclusions and suggestions that support decision-making.
- Provide individualized interactions and assistance through chatbots and virtual assistants.
- Enable complex applications in industries such as banking, transportation, and healthcare.
In essence, AI agents are pivotal in driving the next wave of technological advancements, making systems smarter and more responsive to user needs.
Applications and Use Cases of AI Agents
AI agents have a wide range of applications across various industries. Here are some notable use cases:
- Customer Service: AI agents in the form of chatbots and virtual assistants handle customer inquiries, resolve issues, and provide personalized support. They can operate 24/7, offering consistent and efficient service.
- Finance: Financial forecasting, algorithmic trading, and fraud detection are applications of AI agents. They perform trades based on market trends, examine transaction data, and spot questionable patterns.
- Healthcare: AI agents assist in diagnosing diseases, recommending treatments, and monitoring patient health. They analyze medical data, provide insights, and support clinical decision-making.
- Marketing: AI agents personalize marketing campaigns, segment audiences, and optimize ad spend. They analyze customer data, predict behavior, and tailor content to individual preferences.
- Supply Chain Management: AI systems estimate demand, improve inventory levels, and simplify logistics. They examine information from manufacturers, suppliers, and retailers to guarantee smooth operations.
Brief Introduction of ReAct Pattern
ReAct (Reason + Act) is a design pattern that combines reasoning with action-taking to extend what an AI agent can do. LLMs such as GPT-3 or GPT-4 benefit from this technique because it lets them call external tools and APIs to carry out tasks that text generation alone cannot, such as looking up current information or performing exact calculations.
In this pattern the agent reasons about the input, acts on it by calling an external resource, and then feeds the result back into its reasoning. This lets the agent give more accurate and contextually relevant responses.
The ReAct pattern operates in a cyclic loop consisting of the following steps:
- Thought: The AI agent processes the input and reasons about what needs to be done. This involves understanding the question or command and determining the appropriate action to take.
- Action: Based on the reasoning, the agent performs a predefined action. This could involve searching for information, performing calculations, or interacting with external APIs.
- Pause: The agent stops generating and waits while the surrounding code runs the requested action. In the implementation below, the model signals this by emitting PAUSE after its Action line.
- Observation: The agent observes the results of the action. It analyzes the output received from the action to understand the information or results obtained.
- Answer: The agent uses the observed results to generate a response. This response is then provided to the user, completing the loop.
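The steps above can be sketched without any API access. The following is a minimal illustration, not the article's implementation: `scripted_model`, `multiply`, and `run_loop` are hypothetical names, and the "model" is a hard-coded stand-in for an LLM so the control flow can be followed offline.

```python
import re

# Matches lines such as "Action: multiply: 6 7"
ACTION_RE = re.compile(r"^Action: (\w+): (.*)$")

def scripted_model(history):
    """Hypothetical stand-in for an LLM; the reply depends only on the last message."""
    last = history[-1]
    if last.startswith("Observation:"):
        value = last.split(":", 1)[1].strip()
        return f"Answer: The result is {value}"
    return "Thought: I should multiply the numbers\nAction: multiply: 6 7\nPAUSE"

def multiply(args):
    """Toy action: multiply two space-separated integers."""
    a, b = (int(x) for x in args.split())
    return a * b

ACTIONS = {"multiply": multiply}

def run_loop(question, max_turns=3):
    history = [question]
    for _ in range(max_turns):
        reply = scripted_model(history)            # Thought, then Action + PAUSE or Answer
        history.append(reply)
        matches = [ACTION_RE.match(line) for line in reply.split("\n")]
        matches = [m for m in matches if m]
        if not matches:
            return reply                           # no Action requested, so this is the Answer
        name, arg = matches[0].groups()
        observation = ACTIONS[name](arg)           # host code acts while the model is paused
        history.append(f"Observation: {observation}")
    return None                                    # gave up after max_turns

print(run_loop("What is 6 times 7?"))
```

A real agent replaces `scripted_model` with a call to a language model; the surrounding loop keeps the same shape.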
Importance and Benefits of Using ReAct
The ReAct pattern is important for several reasons:
- Enhanced Capabilities: By integrating external actions, the AI agent can perform tasks that require specific information or computations, thus enhancing its overall capabilities.
- Improved Accuracy: The pattern allows the AI agent to fetch real-time information and perform accurate calculations, leading to more precise and relevant responses.
- Flexibility: The ReAct pattern makes AI agents more flexible and adaptable to various tasks. They can interact with different APIs and tools to perform a wide range of actions.
- Scalability: This pattern allows for the addition of new actions and capabilities over time, making the AI agent scalable and future-proof.
- Real-World Applications: The ReAct pattern enables AI agents to be deployed in real-world scenarios where they can interact with dynamic environments and provide valuable insights and assistance.
Python is a versatile and powerful programming language that is widely used in AI and machine learning due to its simplicity and extensive library support. For building AI agents, several Python libraries are essential:
- OpenAI API: This library allows you to interact with OpenAI’s language models, such as GPT-3 and GPT-4. It provides the necessary functions to generate text, answer questions, and perform various language-related tasks.
- httpx: This is a powerful HTTP client for Python that supports asynchronous requests. It is used to interact with external APIs, fetch data, and perform web searches.
- re (Regular Expressions): This module provides support for regular expressions in Python. It is used to parse and match patterns in strings, which is useful for processing the AI agent’s responses.
Introduction to OpenAI API and httpx Library
The OpenAI API is a robust platform that provides access to advanced language models developed by OpenAI. These models can understand and generate human-like text, making them ideal for building AI agents. With the OpenAI API, you can:
- Generate text based on prompts
- Answer questions
- Perform language translations
- Summarize text
- And much more
The httpx library is an HTTP client for Python that supports both synchronous and asynchronous requests. It is designed to be easy to use while providing powerful features for making web requests. With httpx, you can:
- Send GET and POST requests
- Handle JSON responses
- Manage sessions and cookies
- Perform asynchronous requests for better performance
Together, the OpenAI API and httpx library provide the foundational tools needed to build and enhance AI agents, enabling them to interact with external resources and perform a wide range of actions.
Setting Up the Environment
Let us set up the environment in two steps:
Step 1: Installation of Required Libraries
To get started with building your AI agent, you need to install the necessary libraries. Here are the steps to set up your environment:
- Install Python: Ensure you have Python installed on your system. You can download it from the official Python website.
- Set Up a Virtual Environment: It’s good practice to create a virtual environment for your project to manage dependencies. Run the following commands to set up a virtual environment:
python -m venv ai_agent_env
source ai_agent_env/bin/activate  # On Windows, use `ai_agent_env\Scripts\activate`
- Install OpenAI API and httpx: Use pip to install the required libraries:
pip install openai httpx
- Install Additional Libraries: You may also need other libraries like re for regular expressions, which is included in the Python Standard Library, so no separate installation is required.
Step 2: Setting Up API Keys and Environment Variables
To use the OpenAI API, you need an API key. Follow these steps to set up your API key:
- Obtain an API Key: Sign up for an account on the OpenAI website and obtain your API key from the API section.
- Set Up Environment Variables: Store your API key in an environment variable to keep it secure. Add the following line to your .bashrc or .zshrc file (or use the appropriate method for your operating system):
export OPENAI_API_KEY='your_openai_api_key_here'
- Access the API Key in Your Code: In your Python code, you can access the API key using the os module:
import os
import openai

# Read the key from the environment rather than hard-coding it
openai.api_key = os.getenv('OPENAI_API_KEY')
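Because `os.getenv` returns `None` when the variable is missing, a forgotten export only surfaces later as an authentication error. A small guard can fail earlier with a clearer message. This is a sketch; `require_api_key` is a hypothetical helper name, not part of the OpenAI library.

```python
import os

def require_api_key(var_name="OPENAI_API_KEY"):
    """Return the key stored in var_name, or raise a clear error if it is missing."""
    key = os.getenv(var_name)
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before running the agent")
    return key
```

It could then be used as `openai.api_key = require_api_key()`.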
With the environment set up, you are now ready to start building your AI agent.
Building the AI Agent
Let us now build the AI agent.
Creating the Basic Structure of the AI Agent
To build the AI agent, we will create a class that handles interactions with the OpenAI API and manages the reasoning and actions. Here’s a basic structure to get started:
import openai
import re
import httpx

class ChatBot:
    def __init__(self, system=""):
        self.system = system
        self.messages = []  # running conversation history
        if self.system:
            self.messages.append({"role": "system", "content": system})

    def __call__(self, message):
        # Record the user turn, get a reply, and record that too
        self.messages.append({"role": "user", "content": message})
        result = self.execute()
        self.messages.append({"role": "assistant", "content": result})
        return result

    def execute(self):
        # openai>=1.0 interface; openai.ChatCompletion.create was removed in 1.0
        completion = openai.chat.completions.create(model="gpt-3.5-turbo", messages=self.messages)
        return completion.choices[0].message.content
This class initializes the AI agent with an optional system message and handles user interactions. The __call__ method takes user messages and generates responses using the OpenAI API.
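To see how the conversation history accumulates without calling the API, the same bookkeeping can be exercised with a stubbed `execute` method. `OfflineChatBot` is a hypothetical class introduced only for this illustration; its canned reply stands in for the model's response.

```python
class OfflineChatBot:
    """Same bookkeeping as the ChatBot class above, with execute() stubbed out."""

    def __init__(self, system=""):
        self.system = system
        self.messages = []
        if self.system:
            self.messages.append({"role": "system", "content": system})

    def __call__(self, message):
        self.messages.append({"role": "user", "content": message})
        result = self.execute()
        self.messages.append({"role": "assistant", "content": result})
        return result

    def execute(self):
        # Placeholder reply; the real class calls the OpenAI API here.
        user_turns = [m for m in self.messages if m["role"] == "user"]
        return f"(stub reply to turn {len(user_turns)})"

bot = OfflineChatBot("You are a helpful assistant.")
bot("Hello")
bot("What did I just say?")
roles = [m["role"] for m in bot.messages]
print(roles)
```

Every call sends the full history, which is what lets the model see earlier Thoughts and Observations on later turns.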
Implementing the ReAct Pattern
To implement the ReAct pattern, we need to define the loop of Thought, Action, Pause, Observation, and Answer. Here’s how we can incorporate this into our AI agent:
Define the Prompt
prompt = """
You run in a loop of Thought, Action, PAUSE, Observation.
At the end of the loop you output an Answer.
Use Thought to describe your thoughts about the question you have been asked.
Use Action to run one of the actions available to you - then return PAUSE.
Observation will be the result of running those actions.
Your available actions are:
calculate:
e.g. calculate: 4 * 7 / 3
Runs a calculation and returns the number - uses Python so be sure to use floating point
syntax if necessary
wikipedia:
e.g. wikipedia: Django
Returns a summary from searching Wikipedia
simon_blog_search:
e.g. simon_blog_search: Django
Search Simon's blog for that term
Example session:
Question: What is the capital of France?
Thought: I should look up France on Wikipedia
Action: wikipedia: France
PAUSE
You will be called again with this:
Observation: France is a country. The capital is Paris.
You then output:
Answer: The capital of France is Paris
""".strip()
Define the query Function
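action_re = re.compile(r'^Action: (\w+): (.*)$')
This regular expression picks out the action name and its input from a line such as Action: wikipedia: France. The query function, defined in full under Integrating Actions with the AI Agent, runs the ReAct loop by sending the question to the AI agent, parsing the actions, executing them, and feeding the observations back into the loop.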
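As an illustration of the parsing step, the snippet below applies an action-matching pattern to a sample model reply. `first_action` and `sample_reply` are hypothetical names used only here.

```python
import re

# Group 1 is the action name, group 2 is everything after the second colon
action_re = re.compile(r"^Action: (\w+): (.*)$")

sample_reply = (
    "Thought: I should look up France on Wikipedia\n"
    "Action: wikipedia: France\n"
    "PAUSE"
)

def first_action(reply):
    """Return (action_name, action_input) for the first Action line, or None."""
    for line in reply.split("\n"):
        match = action_re.match(line)
        if match:
            return match.groups()
    return None

print(first_action(sample_reply))
print(first_action("Answer: The capital of France is Paris"))
```

Because the second group is `(.*)`, the action input may itself contain colons or spaces, as in `Action: calculate: 4 * 7 / 3`.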
Implementing Actions
Let us now implement the actions.
Action: Wikipedia Search
The Wikipedia search action allows the AI agent to search for information on Wikipedia. Here’s how to implement it:
def wikipedia(q):
    response = httpx.get("https://en.wikipedia.org/w/api.php", params={
        "action": "query",
        "list": "search",
        "srsearch": q,
        "format": "json"
    })
    return response.json()["query"]["search"][0]["snippet"]
Action: Blog Search
The blog search action allows the AI agent to search for information on a specific blog. Here’s how to implement it:
def simon_blog_search(q):
    response = httpx.get("https://datasette.simonwillison.net/simonwillisonblog.json", params={
        "sql": """
        select
            blog_entry.title || ': ' || substr(html_strip_tags(blog_entry.body), 0, 1000) as text,
            blog_entry.created
        from
            blog_entry join blog_entry_fts on blog_entry.rowid = blog_entry_fts.rowid
        where
            blog_entry_fts match escape_fts(:q)
        order by
            blog_entry_fts.rank
        limit
            1
        """.strip(),
        "_shape": "array",
        "q": q,
    })
    return response.json()[0]["text"]
Action: Calculation
The calculation action allows the AI agent to perform mathematical calculations. Here’s how to implement it:
def calculate(what):
    # eval runs arbitrary Python, so this is only safe with trusted input
    return eval(what)
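Because `eval` executes whatever expression the model emits, a more restrictive evaluator may be preferable outside a demo. The following is one possible sketch using the standard `ast` module, not the article's implementation; `safe_calculate` is a hypothetical name, and the choice to allow only `+`, `-`, `*`, and `/` on numeric literals is an assumption.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def safe_calculate(expression):
    """Evaluate +, -, *, / on numeric literals and reject everything else."""
    def _eval(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {expression!r}")

    tree = ast.parse(expression, mode="eval")
    return _eval(tree.body)

print(safe_calculate("15 * 25"))
```

Function calls, attribute access, and names all fall through to the `ValueError` branch, so an input like `__import__('os')` is refused rather than executed.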
Adding Actions to the AI Agent
Next, we need to register these actions in a dictionary so the AI agent can use them:
known_actions = {
    "wikipedia": wikipedia,
    "calculate": calculate,
    "simon_blog_search": simon_blog_search
}
Integrating Actions with the AI Agent
To integrate the actions with the AI agent, we need to ensure that the query function can handle the different actions and feed the observations back into the reasoning loop. Here’s how to complete the integration:
def query(question, max_turns=5):
    i = 0
    bot = ChatBot(prompt)
    next_prompt = question
    while i < max_turns:
        i += 1
        result = bot(next_prompt)
        print(result)
        # Collect every line of the reply that requests an action
        actions = [action_re.match(a) for a in result.split('\n') if action_re.match(a)]
        if actions:
            action, action_input = actions[0].groups()
            if action not in known_actions:
                raise Exception(f"Unknown action: {action}: {action_input}")
            print(" -- running {} {}".format(action, action_input))
            observation = known_actions[action](action_input)
            print("Observation:", observation)
            next_prompt = f"Observation: {observation}"
        else:
            # No action requested, so the reply is the final answer
            return result
With this setup, the AI agent can reason about the input, perform actions, observe the results, and generate responses.
Testing and Debugging
Let us now follow the steps for testing and debugging.
Running Sample Queries
To test the AI agent, you can run sample queries and observe the results. Here are a few examples:
print(query("What does England share borders with?"))
print(query("Has Simon been to Madagascar?"))
print(query("Fifteen * twenty five"))
Debugging Common Issues
While testing, you might encounter some common issues. Here are a few tips to debug them:
- API Errors: Ensure your API keys are correctly set and have the necessary permissions.
- Network Issues: Check your internet connection and ensure the endpoints you are calling are reachable.
- Incorrect Outputs: Verify the logic in your action functions and ensure they return the correct results.
- Unhandled Actions: Make sure all possible actions are defined in the known_actions dictionary.
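For the last item, a quick consistency check can compare the action names documented in the prompt with the keys of the action dictionary. This is a heuristic sketch: `check_action_registry` is a hypothetical helper, and it assumes each action is documented on a line of its own in the form `name:`, as in the prompt shown earlier.

```python
import re

def check_action_registry(prompt_text, known_actions):
    """Compare action names documented in the prompt with the registered callables."""
    # Heuristic: an action is documented as a bare "name:" line
    documented = set(re.findall(r"^[ \t]*(\w+):[ \t]*$", prompt_text, flags=re.MULTILINE))
    registered = set(known_actions)
    return {
        "missing_from_registry": sorted(documented - registered),
        "missing_from_prompt": sorted(registered - documented),
    }

demo_prompt = (
    "Your available actions are:\n"
    "calculate:\n"
    "e.g. calculate: 4 * 7 / 3\n"
    "wikipedia:\n"
    "e.g. wikipedia: Django\n"
)
report = check_action_registry(demo_prompt, {"calculate": None, "weather": None})
print(report)
```

An action that appears under `missing_from_registry` would trigger the unknown-action exception in the loop if the model requested it.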
Improving the AI Agent
Let us now improve the AI agent.
Enhancing Robustness and Security
To make the AI agent more robust and secure:
- Validate Inputs: Ensure all inputs are properly validated to prevent injection attacks, especially in the calculate function.
- Error Handling: Implement error handling in the action functions to gracefully manage exceptions.
- Logging: Add logging to track the agent’s actions and observations for easier debugging.
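The error-handling and logging points can be combined in a thin wrapper around action dispatch. The sketch below is one way to do it, under the assumption that a failed action should be reported back to the model as an observation rather than crash the loop; `run_action` and the logger name `react_agent` are hypothetical.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("react_agent")

def run_action(known_actions, name, action_input):
    """Run a registered action and always return a string observation."""
    if name not in known_actions:
        logger.warning("Unknown action requested: %s", name)
        return f"Error: unknown action '{name}'"
    try:
        observation = known_actions[name](action_input)
    except Exception as exc:  # report the failure instead of crashing the loop
        logger.exception("Action %s failed on input %r", name, action_input)
        return f"Error: {type(exc).__name__}: {exc}"
    logger.info("Action %s(%r) -> %r", name, action_input, observation)
    return str(observation)
```

Returning the error text as the observation gives the model a chance to retry with a different input or action on the next turn.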
Adding More Actions and Capabilities
To enhance the AI agent’s capabilities, you can add more actions such as:
- Weather Information: Integrate with a weather API to fetch real-time weather data.
- News Search: Implement a news search action to fetch the latest news articles.
- Translation: Add a translation action using a translation API to support multilingual queries.
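Whatever the action does, adding it involves the same two steps: register a callable under a name, and describe that name in the prompt so the model knows it exists. The sketch below uses a deliberately simple, hypothetical `current_time` action built on the standard library instead of a real weather, news, or translation API, since those would require choosing a specific provider.

```python
from datetime import datetime, timezone

def current_time(_unused=""):
    """Hypothetical action: return the current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat(timespec="seconds")

# Stand-in registry; in the agent this would be the existing known_actions dictionary
known_actions = {}
known_actions["current_time"] = current_time

# Text to add under "Your available actions are:" in the prompt
prompt_addition = """
current_time:
e.g. current_time: now
Returns the current UTC date and time
""".strip()

print(known_actions["current_time"]("now"))
```

The action takes a single string argument, even though it ignores it, because the loop always calls actions with the text that follows the action name.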
Real-World Applications
- Customer Support: AI agents can handle customer inquiries, resolve issues, and provide personalized recommendations.
- Healthcare: AI agents assist in diagnosing diseases, recommending treatments, and monitoring patient health.
- Finance: AI agents detect fraud, execute trades, and provide financial advice.
- Marketing: AI agents personalize marketing campaigns, segment audiences, and optimize ad spend.
Future Prospects and Advancements
The future of AI agents is promising, with advancements in machine learning, natural language processing, and AI ethics. Emerging trends include:
- Autonomous Systems: More sophisticated autonomous systems capable of handling complex tasks.
- Human-AI Collaboration: Enhanced collaboration between humans and AI agents for improved decision-making.
- Ethical AI: Focus on developing ethical AI agents that prioritize privacy, fairness, and transparency.
Conclusion
In this comprehensive guide, we explored the concept of AI agents, their significance, and the ReAct pattern that enhances their capabilities. We covered the necessary tools and libraries, set up the environment, and walked through building an AI agent from scratch. We also discussed implementing actions, integrating them with the AI agent, and testing and debugging the system. Finally, we looked at real-world applications and future prospects of AI agents.
By following this guide, you now have the knowledge to build your own AI agents from scratch. Experiment with different actions, enhance the agent’s capabilities, and explore new possibilities in the exciting field of artificial intelligence.
Key Takeaways
- Understanding the core concepts and significance of AI agents.
- Implementation of the ReAct pattern to allow AI agents to perform actions and reason about their observations.
- Knowledge of the essential tools and libraries like OpenAI API, httpx, and Python regular expressions.
- A detailed guide on building an AI agent from scratch, including defining actions and integrating them.
- Techniques for effectively testing and debugging AI agents.
- Strategies to enhance the AI agent’s capabilities and ensure its robustness and security.
- Practical examples of how AI agents are used in various industries and their future advancements.
Frequently Asked Questions
A. The ReAct pattern (Reason + Act) involves implementing additional actions that an AI agent can take, like searching Wikipedia or running calculations, and teaching the agent to request these actions and process their results.
A. Essential tools and libraries include Python, OpenAI API, httpx for HTTP requests, and Python’s regular expressions (re) library.
A. Validate inputs thoroughly to prevent injection attacks, use sandboxing techniques where possible, implement error handling, and log actions for monitoring and debugging.
A. Yes, you can add various actions such as fetching weather information, searching for news articles, or translating text using appropriate APIs and integrating them into the AI agent’s reasoning loop.
By Analytics Vidhya, July 10, 2024.