Conversations as Directed Graphs with LangChain
Building a chatbot designed to understand key information about new prospective customers.
In this post we’ll use LangChain to do lead qualification in a real-estate context. We imagine a scenario where new potential customers contact a real-estate agent for the first time. We’ll design a system which communicates with a new prospective lead to extract key information before the real-estate agent takes over.
Who is this useful for? Anyone interested in applying natural language processing (NLP) in a practical context.
How advanced is this post? This example is conceptually straightforward, but you might struggle to follow along if you don’t have a firm grasp of Python and a general understanding of language models.
Prerequisites: Fundamental programming knowledge in Python, and a high-level understanding of language models.
A Description of the Problem
This use case is directly inspired by a work request I received while operating as a contractor. The prospective client owned a real-estate company, and found that a significant amount of their agents’ time was spent performing the same repetitive task at the beginning of each conversation: lead qualification.
Lead qualification is the real-estate term for the first pass at a lead: getting their contact information, their budget, and so on. It’s a pretty broad term, and the details can fluctuate from organization to organization. For this post, we’ll consider extracting the following information as “qualifying” a lead:
Name: the name of the lead.
Contact Info: The email or phone number of the lead.
Financing: Their monthly budget for rent.
Readiness: How quickly they can meet with an agent.
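To make the target concrete, here is a minimal sketch of a container for these four fields. The class name `QualifiedLead`, its field names, and the `is_qualified` helper are assumptions made for illustration; they are not part of the code used later in this post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QualifiedLead:
    """Hypothetical container for the four pieces of qualifying information."""
    name: Optional[str] = None              # the name of the lead
    contact_info: Optional[str] = None      # an email address or phone number
    monthly_budget: Optional[float] = None  # monthly budget for rent
    availability: Optional[str] = None      # when they can meet with an agent

    def is_qualified(self) -> bool:
        """A lead counts as qualified once every field has been filled in."""
        return all(
            value is not None
            for value in (self.name, self.contact_info,
                          self.monthly_budget, self.availability)
        )


lead = QualifiedLead(name="Daniel Warfield", contact_info="413-123-1234")
print(lead.is_qualified())  # False: budget and availability are still missing
```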
The Approach
The naive approach
While large language models are incredibly powerful, they need proper contextualization of the use case to be consistently successful. You could, for instance, give a language model a prompt saying something like:
"You are a real-estate agent trying to qualify a new client.
Extract the following information:
- email
- phone
....
Once all information has been extracted from the client, politely
thank them and tell them you will be re-directing them to an agent"
Then, you could put your new client in a chat room with a model initialized with that prompt. This would be a great way to start experimenting with an LLM in a particular business context, but is also a great way to begin realizing how fragile LLMs can be when faced with certain types of input. The conversation could quickly derail if a user asked a benign but irrelevant question like “Did you catch the game last night?” or “Yeah, I was walking down the road and I saw your complex on second.” This may or may not be a serious issue depending on the use case, but imposing a rigid structure around the conversation can help keep things on track.
Conversations as Directed Graphs
We can frame a conversation as a directed graph, where each node represents a certain conversational state, and each edge represents an impetus to change the conversational state, like a completed introduction or acquired piece of information.
This is just about the most fundamental directed graph we could compose for this problem. It’s worth noting that this approach can easily grow, shrink, or otherwise change based on the needs of the system.
For instance, if your clients consistently ask the chatbot about sports, which was unanticipated in the initial design phase, then you can add the relevant logic to check for this type of question and respond appropriately.
When creating a new system which interacts with humans in an organic way it’s vital for it to be easily iterated on as new and unexpected issues arise. We’ll keep it simple for the purposes of this example, but extensibility is one of the core abilities of this approach.
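As a rough illustration of the framing above, a conversation graph can be sketched with nothing more than a dictionary. The state names below are assumptions chosen for this example, and the sketch leaves out all of the LLM logic that the real implementation later in the post relies on.

```python
# Each key is a conversational state (node); each value lists the states it
# can transition to (edges). State names here are illustrative assumptions.
conversation_graph = {
    "ask_name": ["ask_contact"],
    "ask_contact": ["ask_budget"],
    "ask_budget": ["ask_availability"],
    "ask_availability": [],  # terminal state: hand off to a human agent
}


def walk(graph, start):
    """Follow the first outgoing edge from each state until a terminal state."""
    path = [start]
    while graph[path[-1]]:
        path.append(graph[path[-1]][0])
    return path


print(walk(conversation_graph, "ask_name"))

# Extending the graph: handle an off-topic sports question, then return
# to the contact question.
conversation_graph["ask_contact"].append("sports_smalltalk")
conversation_graph["sports_smalltalk"] = ["ask_contact"]
```

Adding the sports detour only touches the one state it branches from, which is the kind of cheap iteration this approach is meant to allow.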
Key Technologies
We’ll be using LangChain to do most of the heavy lifting. Specifically, we’ll be using:
an LLM: We’ll be using OpenAI’s Text Davinci 3 model.
Output Parsing: We’ll be using LangChain’s Pydantic parser to parse results into easy-to-consume formats.
We’ll also be implementing a directed graph from scratch, with the validation, parsing, and retry logic we need baked into its nodes and edges.
The Model
In this example we’re using OpenAI’s Text Davinci 3 model. While you could use almost any modern large language model, I chose to use this particular model because it’s widely used in LangChain examples and documentation.
LangChain does its best to be a robust and resilient framework, but working with large language models is fiddly work. Different models can behave drastically differently to a given prompt. I found that Text Davinci 3 behaved consistently with prompts from LangChain.
LangChain allows you to use self-hosted models, models hosted for free on Hugging Face, or models from numerous other sources. Feel free to experiment with your choice of model; it’s pretty easy to swap between them (though, in my experience, you will probably have to adjust your prompts to the particular model you’re using).
Text Davinci 3 is a transformer model; feel free to read the following article for more information:
LangChain Parsing
LangChain has a variety of parsers designed to be used with large language models. We’ll be using the PydanticOutputParser.
LangChain parsers not only extract key information from LLM responses, but also modify prompts to entice more parsable responses from the LLM. With the Pydantic parser you first define a class representing the format of the results you want from the LLM. Let’s say you want to get a joke, complete with setup and punchline, from an LLM:
""" Define the data structure we want to be parsed out from the LLM response
notice that the class contains a setup (a string) and a punchline (a string).
The descriptions are used to construct the prompt to the llm. This particular
example also has a validator which checks if the setup ends with a question mark.
from: https://python.langchain.com/docs/modules/model_io/output_parsers/pydantic
"""
#pydantic v1-style imports, matching the validator decorator used below
from pydantic import BaseModel, Field, validator

class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")

    @validator("setup")
    def question_ends_with_question_mark(cls, field):
        if field[-1] != "?":
            raise ValueError("Badly formed question!")
        return field
You can then define the actual query you want to send to the model.
"""Defining the query from the user
"""
joke_query = "Tell me a joke about parrots"
This query then gets modified by the parser, combining the user’s query and information about the final parsing format to construct the prompt to the LLM.
"""Defining the prompt to the llm
from: https://python.langchain.com/docs/modules/model_io/output_parsers/pydantic
"""
from langchain.prompts import PromptTemplate
from langchain.output_parsers import PydanticOutputParser

parser = PydanticOutputParser(pydantic_object=Joke)
prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)
#note: the name `input` shadows Python's built-in input() for the rest of the session
input = prompt.format_prompt(query=joke_query)
print(input.text)
The prompt for this particular example is the following:
Answer the user query.
The output should be formatted as a JSON instance that conforms to the JSON schema below.
As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.
Here is the output schema:
```
{"properties": {"setup": {"title": "Setup", "description": "question to set up a joke", "type": "string"}, "punchline": {"title": "Punchline", "description": "answer to resolve the joke", "type": "string"}}, "required": ["setup", "punchline"]}
```
Tell me a joke about parrots
Notice how the query from the user “Tell me a joke about parrots” is combined with information about the desired end format.
This formatted query can then be passed to the model, and the parser can be used to extract the result:
"""Declaring a model and querying it with the parser defined input
"""
from langchain.llms import OpenAI

model_name = "text-davinci-003"
temperature = 0.0
model = OpenAI(model_name=model_name, temperature=temperature)
output = model(input.to_string())
parser.parse(output)
Here’s the result from this particular example:
"""The final output, a Joke object with a setup and punchline attribute
"""
Joke(setup="Why don't parrots make good detectives?",
punchline="Because they're always repeating themselves!")
The PydanticOutputParser is both powerful and flexible, which makes it a popular choice among LangChain’s parsers. We’ll be exploring this parser more throughout this post. The OutputFixingParser and RetryOutputParser are two other very useful output parsers which will not be explored in this post, but could certainly be applied to this use case.
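To give a feel for what the parsing step does with a raw completion, here is a standard-library-only sketch. It is not LangChain’s actual implementation: the `JokeRecord` dataclass, the `parse_joke` function, and the made-up completion are all assumptions for illustration. The idea is simply to find the JSON object in the model’s text, load it, and apply the same question-mark rule as the validator in the Joke class.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class JokeRecord:
    """Plain-Python stand-in for the Pydantic Joke model (hypothetical name)."""
    setup: str
    punchline: str


def parse_joke(raw_completion: str) -> JokeRecord:
    """Pull the JSON object out of the completion and validate it."""
    match = re.search(r"\{.*\}", raw_completion, flags=re.DOTALL)
    if match is None:
        raise ValueError("No JSON object found in the completion")
    data = json.loads(match.group(0))
    joke = JokeRecord(setup=data["setup"], punchline=data["punchline"])
    # Same rule as the validator in the Joke class
    if not joke.setup.endswith("?"):
        raise ValueError("Badly formed question!")
    return joke


# A made-up completion, standing in for real model output
fake_completion = (
    'Sure! {"setup": "Why did the parrot cross the road?", '
    '"punchline": "To repeat itself."}'
)
print(parse_joke(fake_completion))
```

When the completion contains no usable JSON, the function raises a `ValueError`, which is the kind of failure a retry or output-fixing parser would be expected to handle.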
Conversations as a Directed Graph
We’ll be abstracting a conversation into a directed graph.
Each node and edge will need to be customized, but will follow the same general structure:
It’s worth noting that LangChain has a similar structure, called a Chain. We won't be discussing Chains in this post, but they are useful for direct and sequential LLM tasks.
Defining Nodes and Edges
This is where we start coding up an LLM supported directed graph with the core aforementioned structure. We’ll be using Pydantic parsers for both the input validation step as well as the actual content parsing.
I’m including the code for reference, but don’t be daunted by the length. You can skim through the code, or not refer to the code at all if you don’t want to. The final notebook can be found here:
Google Colaboratory (colab.research.google.com)
General Utilities
For demonstrative purposes, all of this will exist within a single Jupyter notebook, and the final back and forth between the user and the model will be executed in the final cell. In order to improve readability, we’ll define three functions: one for model output to the user, one for user input to the model, and another for printing key information for demonstration, like the results of parsing.
"""Defining utility functions for constructing a readable exchange
"""

def system_output(output):
    """Function for printing out to the user
    """
    print('======= Bot =======')
    print(output)

def user_input():
    """Function for getting user input
    """
    print('======= Human Input =======')
    return input()

def parsing_info(output):
    """Function for printing out key info
    """
    print(f'*Info* {output}')
Defining the Edge
As the code suggests, an edge takes some input, checks it against a condition, and then parses the input if the condition was met. The edge contains the relevant logic for recording the number of times it’s been attempted and failed, and is responsible for telling higher level units whether we should progress through the directed graph along the edge or not.
from typing import List

from pydantic import BaseModel, Field
from langchain.output_parsers import PydanticOutputParser

class Edge:
    """Edge
    at its highest level, an edge checks if an input is good, then parses
    data out of that input if it is good
    """

    def __init__(self, condition, parse_prompt, parse_class, llm, max_retrys=3, out_node=None):
        """
        condition (str): a True/False question about the input
        parse_prompt (str): what the parser should be extracting
        parse_class (Pydantic BaseModel): the structure of the parse
        llm (LangChain LLM): the large language model being used
        max_retrys (int): how many failed attempts are acceptable
        out_node (Node): the node the edge directs towards
        """
        self.condition = condition
        self.parse_prompt = parse_prompt
        self.parse_class = parse_class
        self.llm = llm

        #how many times the edge has failed, for any reason, for deciding to skip
        #when successful this resets to 0 for posterity.
        self.num_fails = 0

        #how many retrys are acceptable
        self.max_retrys = max_retrys

        #the node the edge directs towards
        self.out_node = out_node

    def check(self, input):
        """ask the llm if the input satisfies the condition
        """
        validation_query = f'following the output schema, does the input satisfy the condition?\ninput:{input}\ncondition:{self.condition}'

        class Validation(BaseModel):
            is_valid: bool = Field(description="if the condition is satisfied")

        parser = PydanticOutputParser(pydantic_object=Validation)
        prompt = f"Answer the user query.\n{parser.get_format_instructions()}\n{validation_query}\n"
        return parser.parse(self.llm(prompt)).is_valid

    def parse(self, input):
        """ask the llm to parse the parse_class, based on the parse_prompt, from the input
        """
        parse_query = f'{self.parse_prompt}:\n\n"{input}"'
        parser = PydanticOutputParser(pydantic_object=self.parse_class)
        prompt = f"Answer the user query.\n{parser.get_format_instructions()}\n{parse_query}\n"
        return parser.parse(self.llm(prompt))

    def execute(self, input):
        """Executes the entire edge
        returns a dictionary:
        {
            continue: bool, whether or not should continue to next
            result: parse_class, the parsed result, if applicable
            num_fails: int the number of failed attempts
            continue_to: Node, the node the edge directs towards
        }
        """
        #input didn't make it past the input condition for the edge
        if not self.check(input):
            self.num_fails += 1
            if self.num_fails >= self.max_retrys:
                return {'continue': True, 'result': None, 'num_fails': self.num_fails, 'continue_to': self.out_node}
            return {'continue': False, 'result': None, 'num_fails': self.num_fails, 'continue_to': self.out_node}

        try:
            #attempting to parse
            result = self.parse(input)
            #parse succeeded, resetting the failure count
            self.num_fails = 0
            return {'continue': True, 'result': result, 'num_fails': self.num_fails, 'continue_to': self.out_node}
        except Exception:
            #there was some error in parsing.
            #note, using the retry or correction parser here might be a good idea
            self.num_fails += 1
            if self.num_fails >= self.max_retrys:
                return {'continue': True, 'result': None, 'num_fails': self.num_fails, 'continue_to': self.out_node}
            return {'continue': False, 'result': None, 'num_fails': self.num_fails, 'continue_to': self.out_node}
I created a few unit tests in the code here which illustrate how the edge functions.
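The notebook’s own tests are not reproduced here, but a test of the retry bookkeeping can be sketched without calling an LLM at all. The `RetryEdge` class below is a hypothetical stand-in written for this example: it mirrors the fail-counting logic of the Edge, with plain Python functions (`has_email`, `first_email`, also made up) taking the place of the LLM-backed check and parse steps.

```python
class RetryEdge:
    """Hypothetical, LLM-free stand-in that mirrors the Edge retry bookkeeping."""

    def __init__(self, check_fn, parse_fn, max_retrys=3):
        self.check_fn = check_fn    # replaces the LLM condition check
        self.parse_fn = parse_fn    # replaces the LLM parse
        self.max_retrys = max_retrys
        self.num_fails = 0

    def execute(self, text):
        if not self.check_fn(text):
            self.num_fails += 1
            # give up (and let the conversation continue) at the retry limit
            give_up = self.num_fails >= self.max_retrys
            return {"continue": give_up, "result": None, "num_fails": self.num_fails}
        self.num_fails = 0
        return {"continue": True, "result": self.parse_fn(text), "num_fails": 0}


def has_email(text):
    return "@" in text


def first_email(text):
    return next(word for word in text.split() if "@" in word)


# A valid input passes on the first attempt
edge = RetryEdge(has_email, first_email)
first_try = edge.execute("my email is hire@danielwarfield.dev")
assert first_try["result"] == "hire@danielwarfield.dev"

# Invalid inputs fail twice, then the edge gives up on the third attempt
edge = RetryEdge(has_email, first_email)
outcomes = [edge.execute("no")["continue"] for _ in range(3)]
assert outcomes == [False, False, True]
```

Swapping the model for deterministic functions keeps the test fast and free, at the cost of not exercising the prompts themselves.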
Defining the Node
Now that we have an Edge, which handles input validation and parsing, we can define a Node, which handles conversational state. The Node asks the user for input, and passes that input to the directed edges coming from that Node. If none of the edges execute successfully, the Node asks the user for the input again.
class Node:
    """Node
    at its highest level, a node asks a user for some input, and tries
    that input on all edges. It also manages and executes all
    the edges it contains
    """

    def __init__(self, prompt, retry_prompt):
        """
        prompt (str): what to ask the user
        retry_prompt (str): what to ask the user if all edges fail
        """
        self.prompt = prompt
        self.retry_prompt = retry_prompt
        self.edges = []

    def run_to_continue(self, _input):
        """Run all edges until one continues
        returns the result of the continuing edge, or None
        """
        for edge in self.edges:
            res = edge.execute(_input)
            if res['continue']: return res
        return None

    def execute(self):
        """Handles the current conversational state
        prompts the user, tries again, runs edges, etc.
        returns the result from an edge
        """
        #initial prompt for the conversational state
        system_output(self.prompt)

        while True:
            #getting users input
            _input = user_input()

            #running through edges
            res = self.run_to_continue(_input)

            if res is not None:
                #parse successful
                parsing_info(f'parse results: {res}')
                return res

            #unsuccessful, prompting retry
            system_output(self.retry_prompt)
With this implemented, we can begin seeing conversations take place. We’ll implement a Node which requests contact information, and two edges: one which attempts to parse out a valid email, and one that attempts to parse out a valid phone number.
"""Defining an example
this example asks for contact information, and parses out either an email
or a phone number.
"""

#defining the model used in this test
model_name = "text-davinci-003"
temperature = 0.0
model = OpenAI(model_name=model_name, temperature=temperature)

#Defining 2 edges from the node
class sampleOutputTemplate(BaseModel):
    output: str = Field(description="contact information")

condition1 = "Does the input contain a full and valid email?"
parse_prompt1 = "extract the email from the following text."
edge1 = Edge(condition1, parse_prompt1, sampleOutputTemplate, model)

condition2 = "Does the input contain a full and valid phone number (xxx-xxx-xxxx or xxxxxxxxxx)?"
parse_prompt2 = "extract the phone number from the following text."
edge2 = Edge(condition2, parse_prompt2, sampleOutputTemplate, model)

#Defining A Node
test_node = Node(prompt = "Please input your full email address or phone number",
                 retry_prompt = "I'm sorry, I didn't understand your response.\nPlease provide a full email address or phone number(in the format xxx-xxx-xxxx)")

#Defining Connections
test_node.edges = [edge1, edge2]

#running node. This handles all i/o and the logic to re-ask on failure.
res = test_node.execute()
Here are a few examples of conversations with this single node:
Example 1)
======= Bot =======
Please input your full email address or phone number
======= Human Input =======
input: Hey, yeah I'm so excited to rent from you guys. My email is hire@danielwarfield.dev
*Info* parse results: {'continue': True, 'result': sampleOutputTemplate(output='hire@danielwarfield.dev'), 'num_fails': 0, 'continue_to': None}
Example 2)
======= Bot =======
Please input your full email address or phone number
======= Human Input =======
input: do you want mine or my wifes?
======= Bot =======
I'm sorry, I didn't understand your response.
Please provide a full email address or phone number(in the format xxx-xxx-xxxx)
======= Human Input =======
input: ok, I guess you want mine. 413-123-1234
*Info* parse results: {'continue': True, 'result': sampleOutputTemplate(output='413-123-1234'), 'num_fails': 0, 'continue_to': None}
Example 3)
======= Bot =======
Please input your full email address or phone number
======= Human Input =======
input: No
======= Bot =======
I'm sorry, I didn't understand your response.
Please provide a full email address or phone number(in the format xxx-xxx-xxxx)
======= Human Input =======
input: nope
======= Bot =======
I'm sorry, I didn't understand your response.
Please provide a full email address or phone number(in the format xxx-xxx-xxxx)
======= Human Input =======
input: I said no
*Info* parse results: {'continue': True, 'result': None, 'num_fails': 3, 'continue_to': None}
In example 1 the user includes some irrelevant information, but has a valid email in the response. In example 2 the user does not have a valid email or phone number in the first response, but does have one in the second. In example 3 the user gives no valid responses; after the third failed attempt one of the edges reaches its retry limit, gives up, and allows the conversation to progress.
It’s worth noting that, from a user-experience perspective, this approach feels a bit robotic. While not explored in this post, it’s easy to imagine how the user’s input could be used to construct the system’s output to the user, either through string formatting or by asking an LLM to format a response.
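The string-formatting option could look something like the sketch below. The `acknowledge` function and its parameters are assumptions made for illustration; nothing like it exists in the notebook.

```python
def acknowledge(field_name, parsed_value, next_prompt):
    """Build a reply that echoes the parsed value before asking the next question."""
    if parsed_value is None:
        # The edge gave up, so there is nothing to echo back
        return next_prompt
    return f"Thanks, I've noted your {field_name} as {parsed_value}. {next_prompt}"


print(acknowledge("email", "mike@gmail.com", "What is your monthly budget for rent?"))
```

Echoing the parsed value also gives the user a chance to notice and correct a bad parse.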
Defining the Conversation
Now that we have Nodes and Edges, and have defined their functionality, we can put it all together to create the final conversation. We covered a general blueprint previously, but let’s brush it up to be more reflective of what the graph will actually be doing. Recall the following:
Nodes have an initial prompt and a retry prompt.
Edges have a condition, a parsing prompt, and a parsing structure. The condition is a boolean question asked about the user’s input. If the condition is satisfied, the parsing structure is parsed based on the parsing prompt and the user’s input. This is done by asking the large language model to reformat the user’s input into a parsable representation using the Pydantic parser.
Let’s construct a conversational graph based on these definitions:
As can be seen in the diagram above, some prompt engineering has been done to accommodate certain edge cases. For instance, the parsing prompt for Budget allows the parser to parse user responses like “my budget is around 1.5k”.
Because of the flexibility of LLMs, it’s really up to the engineer exactly how a graph like this might be implemented. If price parsing proves to be an issue in the future, one might have a few edges, each with different conditions and parsing prompts. For instance, one could imagine an edge that checks if a budget is over a certain value, thus implying that the user is providing a yearly budget instead of a monthly budget. The strength of this system is that such modifications can be added or removed seamlessly.
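The yearly-versus-monthly check described above could also be handled after parsing, with plain Python instead of an extra LLM call. This is a sketch under stated assumptions: the function name `normalize_budget` and the cutoff of 15,000 are invented for this example and would need tuning for a real market.

```python
# Hypothetical cutoff: parsed amounts above this are treated as yearly figures.
YEARLY_THRESHOLD = 15_000.0


def normalize_budget(amount: float, yearly_threshold: float = YEARLY_THRESHOLD) -> float:
    """Return a monthly budget, converting amounts that look like yearly figures."""
    if amount > yearly_threshold:
        return round(amount / 12, 2)
    return amount


print(normalize_budget(1500.0))   # 1500.0
print(normalize_budget(36000.0))  # 3000.0
```

A dedicated edge has the advantage of being able to ask the user to confirm, whereas this kind of post-processing silently guesses.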
Implementing the Conversational Graph
We’ve already done all the heavy lifting; now we just need to code it up and see how it works. Here’s the implementation:
"""Implementing the conversation as a directed graph
"""
# Defining Nodes
name_node = Node("Hello! My name's Dana and I'll be getting you started on your renting journey. I'll be asking you a few questions, and then forwarding you to one of our excellent agents to help you find a place you'd love to call home.\n\nFirst, can you please provide your name?", "I'm sorry, I don't understand, can you provide just your name?")
contact_node = Node("do you have a phone number or email we can use to contact you?", "I'm sorry, I didn't understand that. Can you please provide a valid email or phone number?")
budget_node = Node("What is your monthly budget for rent?", "I'm sorry, I don't understand the rent you provided. Try providing your rent in a format like '$1,300'")
avail_node = Node("Great, When is your soonest availability?", "I'm sorry, one more time, can you please provide a date you're willing to meet?")
#Defining Data Structures for Parsing
class nameTemplate(BaseModel): output: str = Field(description="a persons name")
class phoneTemplate(BaseModel): output: str = Field(description="phone number")
class emailTemplate(BaseModel): output: str = Field(description="email address")
class budgetTemplate(BaseModel): output: float = Field(description="budget")
class dateTemplate(BaseModel): output: str = Field(description="date")
#defining the model
model_name = "text-davinci-003"
temperature = 0.0
model = OpenAI(model_name=model_name, temperature=temperature)
#Defining Edges
name_edge = Edge("Does the input contain a persons name?", " Extract the persons name from the following text.", nameTemplate, model)
contact_phone_edge = Edge("does the input contain a valid phone number?", "extract the phone number in the format xxx-xxx-xxxx", phoneTemplate, model)
contact_email_edge = Edge("does the input contain a valid email?", "extract the email from the following text", emailTemplate, model)
budget_edge = Edge("Does the input contain a number in the thousands?", "Extract the number from the following text. Remove any symbols and multiply a number followed by the letter 'k' to thousands.", budgetTemplate, model)
avail_edge = Edge("does the input contain a date or day? dates or relative terms like 'tomorrow' or 'in 2 days'.", "extract the day discussed in the following text as a date in mm/dd/yyyy format. Today is September 23rd 2023.", dateTemplate, model)
#Defining Node Connections
name_node.edges = [name_edge]
contact_node.edges = [contact_phone_edge, contact_email_edge]
budget_node.edges = [budget_edge]
avail_node.edges = [avail_edge]
#defining edge connections
name_edge.out_node = contact_node
contact_phone_edge.out_node = budget_node
contact_email_edge.out_node = budget_node
budget_edge.out_node = avail_node
#running the graph
current_node = name_node
while current_node is not None:
    res = current_node.execute()
    if res['continue']:
        current_node = res['continue_to']
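The loop above only prints each parsed result through parsing_info; it does not keep them. If the results need to be handed to an agent afterwards, they could be collected while walking the graph. The sketch below uses a hypothetical `ScriptedNode` class that returns canned results instead of prompting a user or calling an LLM, and a `field` attribute that the real Node class does not have.

```python
class ScriptedNode:
    """Hypothetical stand-in for Node that returns a canned result instead of
    prompting a user and calling an LLM."""

    def __init__(self, field, value):
        self.field = field      # label for the collected value (an assumption)
        self.value = value
        self.next_node = None

    def execute(self):
        return {"continue": True, "result": self.value, "continue_to": self.next_node}


name = ScriptedNode("name", "Daniel Warfield")
contact = ScriptedNode("contact", "413-123-1234")
budget = ScriptedNode("budget", 1500.0)
name.next_node, contact.next_node = contact, budget

collected = {}
current_node = name
while current_node is not None:
    res = current_node.execute()
    collected[current_node.field] = res["result"]  # keep what was parsed
    current_node = res["continue_to"]

print(collected)
```

In the real graph, a dictionary like `collected` would fill up as the user answers each question.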
And here are a few example conversations:
Example 1)
======= Bot =======
Hello! My name's Dana and I'll be getting you started on your renting journey. I'll be asking you a few questions, and then forwarding you to one of our excellent agents to help you find a place you'd love to call home.
First, can you please provide your name?
======= Human Input =======
input: daniel warfield
*Info* parse results: {'continue': True, 'result': nameTemplate(output='daniel warfield'), 'num_fails': 0, 'continue_to': <__main__.Node object at 0x7b196801dc60>}
======= Bot =======
do you have a phone number or email we can use to contact you?
======= Human Input =======
input: 4131231234
======= Bot =======
I'm sorry, I didn't understand that. Can you please provide a valid email or phone number?
======= Human Input =======
input: my phone number is 4131231234
*Info* parse results: {'continue': True, 'result': phoneTemplate(output='413-123-1234'), 'num_fails': 0, 'continue_to': <__main__.Node object at 0x7b196801c610>}
======= Bot =======
What is your monthly budget for rent?
======= Human Input =======
input: 1.5k
*Info* parse results: {'continue': True, 'result': budgetTemplate(output=1500.0), 'num_fails': 0, 'continue_to': <__main__.Node object at 0x7b196801c7c0>}
======= Bot =======
Great, When is your soonest availability?
======= Human Input =======
input: 2 days
*Info* parse results: {'continue': True, 'result': dateTemplate(output='09/25/2023'), 'num_fails': 0, 'continue_to': None}
Example 2)
======= Bot =======
Hello! My name's Dana and I'll be getting you started on your renting journey. I'll be asking you a few questions, and then forwarding you to one of our excellent agents to help you find a place you'd love to call home.
First, can you please provide your name?
======= Human Input =======
input: Hi Dana, my name's mike (michael mcfoil), it's a pleasure to meet you!
*Info* parse results: {'continue': True, 'result': nameTemplate(output='Michael Mcfoil'), 'num_fails': 0, 'continue_to': <__main__.Node object at 0x7b19681087c0>}
======= Bot =======
do you have a phone number or email we can use to contact you?
======= Human Input =======
input: yeah, you can reach me at mike at gmail
======= Bot =======
I'm sorry, I didn't understand that. Can you please provide a valid email or phone number?
======= Human Input =======
input: oh, sorry ok it's mike@gmail.com
*Info* parse results: {'continue': True, 'result': emailTemplate(output='mike@gmail.com'), 'num_fails': 0, 'continue_to': <__main__.Node object at 0x7b1968109960>}
======= Bot =======
What is your monthly budget for rent?
======= Human Input =======
input: I can do anywhere from 2 thousand to 5 thousand, depending on the property
*Info* parse results: {'continue': True, 'result': budgetTemplate(output=5000.0), 'num_fails': 0, 'continue_to': <__main__.Node object at 0x7b196810a260>}
======= Bot =======
Great, When is your soonest availability?
======= Human Input =======
input: does october 2nd work for you?
======= Bot =======
I'm sorry, one more time, can you please provide a date you're willing to meet?
======= Human Input =======
input: october 2nd
*Info* parse results: {'continue': True, 'result': dateTemplate(output='10/02/2023'), 'num_fails': 0, 'continue_to': None}
Example 3)
======= Bot =======
Hello! My name's Dana and I'll be getting you started on your renting journey. I'll be asking you a few questions, and then forwarding you to one of our excellent agents to help you find a place you'd love to call home.
First, can you please provide your name?
======= Human Input =======
input: je m'appelle daniel warfield
*Info* parse results: {'continue': True, 'result': nameTemplate(output='Daniel Warfield'), 'num_fails': 0, 'continue_to': <__main__.Node object at 0x7b196801c7c0>}
======= Bot =======
do you have a phone number or email we can use to contact you?
======= Human Input =======
input: mi número de teléfono es 410-123-1234
*Info* parse results: {'continue': True, 'result': phoneTemplate(output='410-123-1234'), 'num_fails': 0, 'continue_to': <__main__.Node object at 0x7b196801ec20>}
======= Bot =======
What is your monthly budget for rent?
======= Human Input =======
input: Mein monatliches Budget beträgt 3.000
*Info* parse results: {'continue': True, 'result': budgetTemplate(output=3000.0), 'num_fails': 0, 'continue_to': <__main__.Node object at 0x7b196801d390>}
======= Bot =======
Great, When is your soonest availability?
======= Human Input =======
input: אני יכול להיפגש מחר
======= Bot =======
I'm sorry, one more time, can you please provide a date you're willing to meet?
======= Human Input =======
input: Yes karogh yem handipel vaghy
======= Bot =======
I'm sorry, one more time, can you please provide a date you're willing to meet?
======= Human Input =======
input: I can meet tomorrow
*Info* parse results: {'continue': True, 'result': dateTemplate(output='09/24/2023'), 'num_fails': 0, 'continue_to': None}
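The availability edge hardcodes “Today is September 23rd 2023” in its parsing prompt, which is why “tomorrow” resolved to 09/24/2023 in these conversations. Outside of a demo, the date would need to be filled in at run time. Here is a minimal sketch; the function name `availability_parse_prompt` is an assumption, and the date is written numerically so the result does not depend on the machine’s locale.

```python
from datetime import date


def availability_parse_prompt(today=None):
    """Build the date-parsing prompt with the current date filled in."""
    today = today or date.today()
    return (
        "extract the day discussed in the following text as a date in "
        f"mm/dd/yyyy format. Today is {today.strftime('%m/%d/%Y')}."
    )


print(availability_parse_prompt(date(2023, 9, 23)))
```

The returned string could be passed as the parsing prompt when the availability edge is constructed.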
Conclusion
In this article we formatted a lead qualification use case as a directed graph, implemented the necessary parsing functionality and data structures, and made an example graph which extracts key information from users. As can be seen in the example conversations, this system is by no means perfect, but because of the nature of directed graphs we can easily add new nodes to alleviate the impact of certain edge cases.
While not discussed in this article, there are many ways to improve upon this system:
We could use different LangChain parsers to attempt to re-try or correct queries.
We could use an LLM Cache to try to cache certain common responses, thus saving on budget.
We could connect this system with a vector database to allow question answering against a knowledge base.
We could use the LLM to construct the prompts to the user, along with context about the conversation, to encourage more organic responses.
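As one example from the list above, the caching idea can be sketched with the standard library. This is not LangChain’s own caching feature; `fake_llm` and `cached_llm` are hypothetical names, and the fake model simply counts how many times it is called.

```python
from functools import lru_cache

call_count = 0


def fake_llm(prompt: str) -> str:
    """Stand-in for a paid LLM call; counts how often it is actually invoked."""
    global call_count
    call_count += 1
    return f"response to: {prompt}"


@lru_cache(maxsize=1024)
def cached_llm(prompt: str) -> str:
    """Identical prompts are answered from memory instead of calling the model."""
    return fake_llm(prompt)


cached_llm("does the input contain a valid email?\ninput: mike@gmail.com")
cached_llm("does the input contain a valid email?\ninput: mike@gmail.com")
print(call_count)  # 1: the second call was served from the cache
```

An exact-match cache like this only helps when the full prompt repeats, so it is most useful for short, common replies such as “no”.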
While my contracting gig didn’t pan out, I think this approach highlights a flexible and robust framework which is extensible and applicable to a variety of applications.