The Challenge of Integrating LLMs with Structured Systems
When developing applications, one of the major advantages of large language models (LLMs) over traditional API services is their ability to generate natural-language output that aligns with human communication patterns. However, this poses a challenge for system integration, since interactions between systems are usually structured.
To address this, we need LLMs to output content in a structured format such as JSON so that it can be processed downstream. The typical approach is to state formatting requirements (ideally with examples) in the prompt to constrain the output, but this is not 100% reliable, and malformed output can undermine application stability.
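To see why prompt-only formatting is fragile, consider a minimal sketch (the helper function and sample replies here are hypothetical, not part of any library): a reply parses cleanly only when the model follows the format exactly, and a single stray sentence breaks the pipeline.

```python
import json

def parse_reply(reply: str):
    """Try to parse an LLM reply as JSON; return None when it is malformed."""
    try:
        return json.loads(reply)
    except json.JSONDecodeError:
        return None

# A compliant reply parses cleanly...
print(parse_reply('{"sentiment": "positive"}'))
# {'sentiment': 'positive'}

# ...but a reply wrapped in prose, which prompt instructions alone cannot
# rule out, fails to parse and must be handled as an error case.
print(parse_reply('Sure! Here is the JSON: {"sentiment": "positive"}'))
# None
```

Constrained generation, as provided by the tools below, removes this failure mode by making non-conforming output impossible rather than merely discouraged.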
Solutions for Handling Edge Cases
To handle potential edge cases, Microsoft has introduced technologies like Guidance and TypeChat. These tools help bridge the gap between the flexible outputs of LLMs and the structured inputs expected by traditional systems. For more details, read the article “The Arrival of a New Interaction Paradigm: LUI (Language User Interface), and Microsoft’s Bridge to It”.
Outlines: A Python Library for Structured LLM Output
Outlines is a Python library that provides functionality for managing and optimizing LLM prompts and outputs. It offers several methods for structuring LLM-generated content to make it more compatible with traditional systems:
Choice Generation
Outlines allows you to specify a set of valid choices for the LLM output:
import outlines
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
prompt = """You are a sentiment-labelling assistant.
Is the following review positive or negative?
Review: This restaurant is just awesome!
"""
generator = outlines.generate.choice(model, ["Positive", "Negative"])
answer = generator(prompt)
Type Constraints
You can constrain the LLM output to a specific Python type like int:
from outlines import models, generate
model = models.transformers("mistralai/Mistral-7B-v0.1")
generator = generate.format(model, int)
answer = generator("When I was 6 my sister was half my age. Now I'm 70 how old is my sister?")
print(answer)
# 67
Regular Expressions
Outlines enables you to define a regular expression pattern that the LLM output must match:
import outlines
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
prompt = "What is the IP address of the Google DNS servers? "
generator = outlines.generate.regex(
    model,
    r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)",
)
structured = generator(prompt, max_tokens=30)
print(structured)
# 2.2.6.1
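The guarantee here comes entirely from the regular expression, so it can be checked without running a model. The sketch below verifies with Python's standard `re` module that the IPv4 pattern used above accepts well-formed addresses and rejects out-of-range octets:

```python
import re

# The same IPv4 pattern passed to outlines.generate.regex above.
IPV4 = r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"

# re.fullmatch requires the whole string to match, mirroring how Outlines
# constrains the entire generated output to the pattern.
print(bool(re.fullmatch(IPV4, "8.8.8.8")))    # True
print(bool(re.fullmatch(IPV4, "999.1.1.1")))  # False: 999 exceeds 255
```

Note that the pattern constrains the syntax, not the facts: the model can still produce a well-formed but incorrect address, as the `2.2.6.1` output above illustrates.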
JSON Models
You can specify a Pydantic model to define the expected JSON structure of the LLM output:
from pydantic import BaseModel
from outlines import models, generate
class User(BaseModel):
    name: str
    last_name: str
    id: int
model = models.transformers("mistralai/Mistral-7B-v0.1")
generator = generate.json(model, User)
result = generator(
    "Create a user profile with the fields name, last_name and id"
)
print(result)
# User(name="John", last_name="Doe", id=11)
Prompt Templates
Outlines also provides prompt templating functionality to manage and optimize prompts in a way that better aligns with Python programming conventions:
import outlines
examples = [
    ("The food was disgusting", "Negative"),
    ("We had a fantastic night", "Positive"),
    ("Recommended", "Positive"),
    ("The waiter was rude", "Negative"),
]
@outlines.prompt
def labelling(to_label, examples):
    """You are a sentiment-labelling assistant.
    {% for example in examples %}
    {{ example[0] }} // {{ example[1] }}
    {% endfor %}
    {{ to_label }} //
    """
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
prompt = labelling("Just awesome", examples)
answer = outlines.generate.text(model)(prompt, max_tokens=100)
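The decorator renders the docstring as a Jinja2-style template, substituting the function's arguments. Assuming that rendering behavior, the expansion can be sketched in plain Python (this helper is illustrative, not part of Outlines):

```python
def labelling_plain(to_label: str, examples: list[tuple[str, str]]) -> str:
    """Plain-Python sketch of what the templated docstring renders to:
    a few-shot prompt with one 'text // label' line per example."""
    lines = ["You are a sentiment-labelling assistant."]
    for text, label in examples:
        lines.append(f"{text} // {label}")
    lines.append(f"{to_label} //")  # trailing '//' cues the model to emit a label
    return "\n".join(lines)

prompt = labelling_plain("Just awesome", [("The food was disgusting", "Negative")])
print(prompt)
```

Keeping the template next to the code that uses it, rather than as a scattered string concatenation, is what makes this approach align with ordinary Python conventions.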
Conclusion
As demonstrated, Outlines is straightforward to use and well suited for Python developers looking to format LLM outputs for better integration with structured systems. Notably, the structured generation methods in Outlines add almost no overhead at inference time: constraining the output takes no longer than generating unstructured text, and because tokens that are fully determined by the format need not be sampled one by one, structured generation can in some cases even be substantially faster than regular open-ended generation.
By providing accessible ways to constrain LLM outputs to specific formats like JSON, multiple choice, data types, and regular expression patterns, Outlines enables developers to more seamlessly incorporate the power of LLMs into applications and workflows that rely on structured data interchange. This opens up exciting possibilities for building more robust and reliable LLM-powered systems that can intelligently interface with traditional software architectures.