In this post, we will be exploring Langchain Agents and the tools they can use to complete a task. An example of such a task could be answering a nested, multi-step question like:
Find the ratio of women to men CEOs among the Fortune 500 companies.
The above is a complex task because it requires a chain of operations to be performed to come up with the desired answer. The chain of operations could be as follows:
Finding the list of Fortune 500 companies.
Finding out the CEOs of those 500 companies and their gender.
Counting the male and female CEOs, and then dividing the two counts to find the ratio.
What is an Agent?
An Agent helps in making a chain of calls to an LLM. An agent has access to a suite of tools that can be used according to the task that needs to be accomplished. An Agent takes an action, makes an observation about the result, and continues in this manner until it can complete its task.
Initializing an Agent
In Langchain, we use the initialize_agent function to initialize an agent. It needs 2 mandatory fields:
Tools
LLM Model
Apart from the above 2 mandatory fields, we can also pass optional fields to an Agent, like:
Agent type
BaseCallbackManager
Agent path
Tags
Agent kwargs (a dictionary of additional arguments for the agent)
Following is the Python code for initializing an agent; a variant that also passes some of the optional fields is shown right after it.
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
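If we also want to set some of those optional fields, a minimal sketch could look like the following. It assumes the agent_kwargs and tags keyword names from the initialize_agent signature in the langchain 0.0.x releases, so check the signature of your installed version before relying on it.
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,  # Agent type
    tags=["blog-demo"],  # Tags attached to the agent's runs
    agent_kwargs={"prefix": "Answer as concisely as possible."},  # Dictionary of extra agent arguments
    verbose=True,
)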
What Agent Tools are available?
Tools are interfaces that an agent can use to interact with the world. Langchain provides a variety of tools to integrate with:
Apify: Apify is a cloud platform for web scraping and data extraction.
Brave Search: For performing a search in the Brave Search engine.
Twilio: Messaging channel that can be used to send messages through WhatsApp, Facebook Messenger, etc.
Wikipedia: To search for any information in Wikipedia.
Tens of tools like that are ready for integration in Langchain and new tools get added to the list frequently.
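Several of these are registered as named tools, so they can be loaded in a single call. As a small sketch, assuming the "wikipedia" and "ddg-search" tool names and that the wikipedia and duckduckgo-search Python packages are installed:
tools = load_tools(["wikipedia", "ddg-search"], llm=llm)  # load both tools by name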
Agents In Action
In this blog, we will be playing around with 3 tools: DuckDuckGo, SERP API, and the llm-math tool. DuckDuckGo and SERP API will be used to search for information on the web. The llm-math tool allows the Agent to perform mathematical calculations using the LLM (Language Model).
To have an agent perform a task, we call the agent.run function and pass the task as its argument.
agent.run("Why did the USA housing market crash in 2008?")
DuckDuckGo Search
In this example I want the agent to tell me about the weather in France. So I pass this instruction to the agent: "What is the weather in France?"
import os
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import OpenAI
os.environ["OPENAI_API_KEY"] = "<openai-api-key>"  # OpenAI API key used by the LLM
llm = OpenAI(temperature=0.1)  # low temperature keeps answers more deterministic
tools = load_tools(["ddg-search"], llm=llm)  # DuckDuckGo search tool
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent.run("What is the weather in France?")
Let's have a look at the prompt the agent is sending to OpenAI:
Answer the following questions as best you can. You have access to the following tools:
duckduckgo_search: A wrapper around DuckDuckGo Search. Useful for when you need to answer questions about current events. Input should be a search query.
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [duckduckgo_search]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
Question: {input}
Thought:{agent_scratchpad}
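If you want to print this prompt yourself, one option (assuming the agent built above and the attribute layout of the langchain 0.0.x releases; newer versions may differ) is to read the template of the agent's underlying LLM chain:
print(agent.agent.llm_chain.prompt.template)  # prints the ReAct prompt shown above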
As per the prompt, we first define the tool duckduckgo_search so that the LLM understands what the tool should be used for. Then a series of directions is provided to the LLM, which includes the following instructions:
The LLM needs to answer a question.
The LLM should figure out how to reach the answer.
The LLM should select and use the best tool at its disposal to find the answer.
The LLM should decide what input should be passed to the tool.
The LLM should observe the output of the tool and see if it got its answer or not.
If the LLM still doesn't have the answer then it should repeat the steps from 2 to 5 until it gets the answer to the question.
Once the LLM gets the answer to the question then it should return the answer and exit.
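Conceptually, the loop these instructions drive looks something like the sketch below. This is only an illustration of the Thought/Action/Observation cycle, not langchain's actual implementation, and the parse helper is hypothetical:
# Illustrative sketch of the ReAct loop the prompt drives; not langchain's actual code.
scratchpad = ""
while True:
    output = llm(prompt.format(input=question, agent_scratchpad=scratchpad))  # Thought / Action / Action Input
    if "Final Answer:" in output:
        break  # the LLM has found the answer, stop the chain
    action, action_input = parse(output)  # hypothetical helper that reads the Action and Action Input lines
    observation = tools[action].run(action_input)  # run the chosen tool
    scratchpad += output + f"\nObservation: {observation}\nThought:"  # feed the result back to the LLM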
When we run our agent, it returns the answer by performing the steps defined above. Below is a screenshot of the run.
Following are the steps the agent takes:
The agent first understands the task and uses the tool to perform the task. In this case, we have provided only one tool so it uses the DuckDuckGo Search to accomplish this task.
Next, it searches for "weather in France" in DuckDuckGo Search and fetches the response.
The fetched response is sent to the LLM to answer the question. The LLM was able to find the desired answer in the first go so the Agent stopped the chain and returned the output.
One thing to notice here is that since the search result was not very good, the LLM couldn't give us the best answer. From the result, it was only able to figure out that the average annual temperature in France is 50 degrees Fahrenheit.
SERP API
Now we will perform a similar search exercise, but this time with the SERP API tool. The code remains almost the same; the only changes are the tool name and an additional SERP API key.
import os
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import OpenAI
os.environ["OPENAI_API_KEY"] = "<openai-api-key>"  # OpenAI API key used by the LLM
os.environ["SERPAPI_API_KEY"] = "<serpapi-api-key>"  # SERP API key for the search tool
llm = OpenAI(temperature=0.1)
tools = load_tools(["serpapi"], llm=llm)  # SERP API search tool
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent.run("What is the weather in France?")
Let's have a look at the prompt the agent is sending to OpenAI:
Answer the following questions as best you can. You have access to the following tools:
Search: A search engine. Useful for when you need to answer questions about current events. Input should be a search query.
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [Search]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
Question: {input}
Thought:{agent_scratchpad}
The prompt we are sending to the LLM is almost the same as the previous one; the only difference is the tool definition.
The Agent again performs a similar operation; here is a screenshot of the agent logs.
In the above example, the Agent is able to get the answer in the first shot. Let's look at another example where multiple searches were needed to answer the question.
agent.run("What are the different types of chocolate available?")
Here the Agent first got only the count of the different types of chocolate from the search engine, so it searched again to fetch the names of the chocolates.
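If you want to cap how many of these search-and-observe rounds the agent can make, a minimal sketch is to pass max_iterations when initializing the agent. In the langchain 0.0.x releases this keyword is forwarded to the underlying AgentExecutor; treat that as an assumption and check your version.
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    max_iterations=3,  # stop the Thought/Action/Observation loop after 3 rounds
)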
Performing complex operations using Agents with multiple tools
In this example, we will be performing a chained operation that first requires the Agent to search for some information and then perform mathematical operations on that data. The task for the Agent is to calculate the ratio of the distance between the Earth and the Sun to the distance between Saturn and the Sun.
import os
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import OpenAI
os.environ["OPENAI_API_KEY"] = "<openai-api-key>"  # OpenAI API key used by the LLM
os.environ["SERPAPI_API_KEY"] = "<serpapi-api-key>"  # SERP API key for the search tool
llm = OpenAI(temperature=0.1)
tools = load_tools(["serpapi", "llm-math"], llm=llm)  # search tool plus calculator tool
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent.run("Try to calculate the ratio of distance of earth and sun to the distance of saturn and sun?")
For the search part we have used the SERP API tool and for the arithmetic part we have used the llm-math tool.
The Agent calculates the ratio by performing the following operations:
The agent first understands the task and then selects the tool needed to perform it. For the search operation, it uses the SERP API.
Then it searches for the distance between the Earth and the Sun.
Then it searches for the distance between Saturn and the Sun.
Finally, it uses the llm-math tool to perform the arithmetic operation.
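As a sanity check of the arithmetic the llm-math tool performs, here is the same calculation done by hand with approximate average distances. The figures below are rounded assumptions of mine, not the values the agent fetched:
earth_sun_km = 149.6e6  # average Earth-Sun distance, roughly 149.6 million km (assumed)
saturn_sun_km = 1.43e9  # average Saturn-Sun distance, roughly 1.43 billion km (assumed)
print(earth_sun_km / saturn_sun_km)  # roughly 0.105, i.e. about a 1:9.5 ratio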
In this post, I have shared a few simple examples of how to use Agents in Langchain to perform a task. However, the potential applications of Agents go far beyond what I've covered here. With the active and thriving community of Langchain and more and more new tools at your disposal, the future holds even more exciting prospects for the capabilities of Agents. So, start exploring the world of Agents in Langchain today and unlock a new realm of possibilities for your organization or project.
If you're in the Langchain space or LLM domain, let's connect on Linkedin! I'd love to stay connected and continue the conversation. Reach me at: linkedin.com/in/ritobrotoseth