Updated 8/5/2024
This document collects several patterns for working effectively with Large Language Models (LLMs) in more complex applications. It is relevant for developers and engineers building advanced LLM applications, especially those dealing with complex tasks, large datasets, or real-time interactions.
- Pattern: Prompt Orchestration
- Pattern: Hierarchical Context Compression
- Pattern: JSON Schema Instructions
- Pattern: Auto Completion Of Streaming JSON
See AWS Lambda Proxy For AI Services for AWS Lambda code that proxies calls to OpenAI, AWS Bedrock, and AWS Polly, exposing OpenAI-style models, completions, transcriptions, and speech endpoints.
Pattern: Prompt Orchestration
%%{init: {'theme':'dark'}}%%
graph LR
subgraph Before
A[Complex One-Shot Task] --> B[LLM]
B -->|Failure| C((❌))
end
subgraph After
D[Complex Task] --> E{Decomposition}
E --> F[Simple Task 1]
E --> G[Simple Task 2]
E --> H[Simple Task n]
F & G & H --> I
subgraph I[Orchestrator]
J[Task 1 + LLM] --> K[Task 2 + LLM]
K --> L[Task n + LLM]
end
I -->|Success| M((✓))
end
Before ~~~ After
Problem
- LLMs are quite limited: they can reliably perform small, simple reasoning tasks.
- But they struggle with large, complex tasks.
Solution
- Break big, complex tasks down into an orchestrated set of small tasks.
- Identify the limitations of the LLM by writing a naive, one-shot instruction for the complex task.
- Allow the failure cases to inform the breakdown of the complex task into a series of simple tasks.
- Arrange the simple tasks in sequence with traditional code, managing the flow of information between them (see the sketch below).
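A minimal sketch of such an orchestrator in TypeScript, assuming a hypothetical `CompleteText` wrapper around whatever LLM API is in use; the task prompts are purely illustrative:

```typescript
// Hypothetical wrapper around whatever LLM completion API is in use.
type CompleteText = (instruction: string) => Promise<string>;

// Each simple task builds one small, focused instruction for the LLM.
type SimpleTask = (complete: CompleteText, input: string) => Promise<string>;

// The orchestrator is traditional code: it runs the simple tasks in sequence
// and manages the flow of information between them.
async function orchestrate(
  complete: CompleteText,
  tasks: SimpleTask[],
  initialInput: string
): Promise<string> {
  let current = initialInput;
  for (const task of tasks) {
    current = await task(complete, current);
  }
  return current;
}

// Two simple tasks that replace what would otherwise be one complex, one-shot prompt.
const extractKeyFacts: SimpleTask = (complete, text) =>
  complete(`List the key facts in the following text, one per line:\n${text}`);

const writeSummary: SimpleTask = (complete, facts) =>
  complete(`Write a two-sentence summary based only on these facts:\n${facts}`);

// Usage: await orchestrate(complete, [extractKeyFacts, writeSummary], documentText);
```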
Limitations
- With enough elbow grease applied to the decomposition of complex tasks into a series of simple tasks, LLMs can do anything.
- But just because they can doesn't mean they should.
- Breaking down big, complex tasks in a way that respects LLM limitations is a lot of work.
- And the final result may be too costly and/or slow to be worth the effort.
Example: Complex Game Logic
Liminai uses this pattern extensively in the context of handling complex, branching game logic.
%%{init: {'theme':'dark'}}%%
flowchart LR
UserAction
PossibilityChecker["Task: Determine Possibility"]
isPossible{"Is Possible?"}
failurePoemWriter["Task: Generate Failure Poem"]
onDeath["Task: Handle Death"]
CommentaryPoemGenerator["Task: Generate Commentary Poem"]
AnnounceWinner["Announce Winner"]
MovementChecker["Task: Determine Movement"]
isMovement{"Is Movement?"}
WinChecker["Task: Check if Win Conditions Met"]
isDead{"Is Dead?"}
hasWon{"Has Won?"}
subgraph resultsHandler["Tasks For Handling Results"]
direction LR
ActionResultGenerator["Task: Generate Action Result"]
HealthAndInventoryUpdater["Task: Update Health and Inventory"]
DeathChecker["Task: Check if Dead"]
end
subgraph locationHandler["Tasks For Handling Locations"]
direction LR
LocationGenerator["Task: Generate Location"]
ImageTagGenerator["Task: Generate Location Image Tags"]
ImageDiffusionService["Task: Generate Image"]
end
UserAction-->PossibilityChecker-->isPossible
isPossible--"No"-->failurePoemWriter-->end1((End))
isPossible--"Yes"-->MovementChecker
MovementChecker-->isMovement
isMovement--"Yes"-->locationHandler-->end2((End))
isMovement--"No"-->resultsHandler-->isDead
isDead--"Yes"-->onDeath-->end3((End))
isDead--"No"-->WinChecker
WinChecker-->hasWon
hasWon--"Yes"-->AnnounceWinner-->end4((End))
hasWon--"No"-->CommentaryPoemGenerator-->end5((End))
This example is deliberately oversimplified for brevity and to illustrate the pattern.
- To simulate real-world constraints, a task checks whether or not the User Action is possible given current conditions.
- To enable player exploration, a set of tasks checks whether or not the User Action involves movement to a new location and generates that location and an accompanying image.
- To enable interactivity between the player and environment, a set of tasks generates the result of the action, updates health and inventory, and checks whether or not the player has died.
- To enable a dynamic win condition, a task checks whether or not the result of the action fulfills a loosely defined win condition (the overall flow is sketched below).
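A rough sketch of how this branching flow might be orchestrated in code, assuming each task is a small LLM-backed function behind an interface; the names and result shapes here are illustrative, not Liminai's actual implementation:

```typescript
// Illustrative interface: each method wraps a small, single-purpose LLM task.
interface GameTasks {
  checkPossibility(action: string): Promise<{ isPossible: boolean; reason: string }>;
  checkMovement(action: string): Promise<{ isMovement: boolean; destination?: string }>;
  handleLocation(destination: string): Promise<string>;
  handleResults(action: string): Promise<{ narration: string; isDead: boolean; hasWon: boolean }>;
  handleDeath(narration: string): Promise<string>;
  announceWinner(narration: string): Promise<string>;
  writeFailurePoem(reason: string): Promise<string>;
  writeCommentaryPoem(narration: string): Promise<string>;
}

// The branching game logic lives in ordinary code, so each LLM call stays small.
async function handleUserAction(tasks: GameTasks, action: string): Promise<string> {
  const possibility = await tasks.checkPossibility(action);
  if (!possibility.isPossible) return tasks.writeFailurePoem(possibility.reason);

  const movement = await tasks.checkMovement(action);
  if (movement.isMovement && movement.destination) {
    return tasks.handleLocation(movement.destination);
  }

  const result = await tasks.handleResults(action);
  if (result.isDead) return tasks.handleDeath(result.narration);
  if (result.hasWon) return tasks.announceWinner(result.narration);
  return tasks.writeCommentaryPoem(result.narration);
}
```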
Pattern: Hierarchical Context Compression
%%{init: {'theme':'dark'}}%%
graph LR
CS[Context Selector]
TI[Task Instruction]
Compressor -->|"Used In"| HCB -->|"Yields"| HC
TI-->|"Sent To"| LLM
User["fa:fa-user User Action Input"] -->|"Sent To"| TI
HC -->|"Filtered By"| CS -->|"Sends Context To"| TI
subgraph Compressor
direction TB
I(Info Input)
--> |"Embedded In"| CI(Compression Instruction)
-->|"Given To"| CLLM[Compression LLM]
-->|"Responds With"| CO(Compressed Output)
end
subgraph HCB[Hierarchical Context Builder]
D1(Detailed Info 1)
D2(Detailed Info 2)
SC[[Summary Compressor]]
S1("Summary Of Info 1")
S2("Summary Of Info 2")
MSC[[Meta Summary Compressor]]
MS("Summary Of Summaries 1 & 2")
D1 & D2 -.->|"Input To"| SC -.->|"Outputs"| S1 & S2
S1 & S2 -.->|"Input To"| MSC -.->|"Outputs"| MS
end
subgraph HC[Hierarchical Context]
direction LR
D(Detailed Info)
S(Summary Info)
MSS(Meta Summary Info)
end
Background
- LLMs have a context window that represents the amount of information they can see and reason about at once.
- For modern LLMs, this context window can be large: large enough to hold the full text of a long book.
Problem
- A large context window does not equate to an ability to effectively reason about that context.
- A large context window filled with information irrelevant to the current task results in worse outcomes.
- Filling a large context window increases LLM response latency and dramatically increases cost.
Solution
- Send only the content that is strictly necessary to perform a task.
- Store information at multiple levels of detail, from comprehensive to highly condensed.
- Progressively compress older or accumulated information into more concise forms.
- Provide the LLM with the most appropriate level of detail for each task, favoring compressed versions when possible (see the sketch below).
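A minimal sketch of the compressor and the hierarchical context builder, again assuming a hypothetical `CompleteText` wrapper for the compression LLM:

```typescript
type CompleteText = (instruction: string) => Promise<string>;

// Compressor: embed the info in a compression instruction and return the compressed output.
async function compress(complete: CompleteText, info: string, targetLength: string): Promise<string> {
  return complete(
    `Compress the following information into ${targetLength}, preserving names, numbers, and key decisions:\n${info}`
  );
}

// Hierarchical Context: the same information stored at three levels of detail.
interface HierarchicalContext {
  detailed: string[];
  summaries: string[];
  metaSummary: string;
}

// Hierarchical Context Builder: keep the detailed entries, summarize each one,
// then compress the summaries into a single meta-summary.
async function buildHierarchicalContext(
  complete: CompleteText,
  detailed: string[]
): Promise<HierarchicalContext> {
  const summaries: string[] = [];
  for (const info of detailed) {
    summaries.push(await compress(complete, info, "two sentences"));
  }
  const metaSummary = await compress(complete, summaries.join("\n"), "one short paragraph");
  return { detailed, summaries, metaSummary };
}
```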
Example: Text Adventure Game Turns
JournAI uses this pattern in the context of a text adventure game, allowing the LLM to maintain context across the entire game with a limited input window.
%%{init: {'theme':'dark'}}%%
graph LR
subgraph GH["Game History"]
DR[Detailed Results]
SR[Summarized Results]
CS[Chapter Summaries]
NPCI[NPC Information]
NPCS[NPC Summaries]
end
subgraph CSFL["Context Selection"]
direction LR
VRDR[Very Recent Detailed Results]
RSR[Recent Summarized Results]
ACS[All Chapter Summaries]
RDNPCI[Recently Discovered NPC Information]
ANPCS[All NPC Summaries]
end
PA[Player Action] --> |Triggers| CSFL
CSFL-->|"Sent To"| LLM[LLM for Game Logic]
CSFL-->|"Fetches From"|GH
LLM --> |Generates| NGO[New Game Output]
NGO --> |Updates| DR
NGO --> |Updates| SR
NGO --> |May Update| NPCI
SR --> |Periodically Summarized| CS
NPCI --> |Periodically Summarized| NPCS
The Game History is composed of:
- Detailed Results (accumulated each turn)
- Summarized Results (accumulated each turn)
- Chapter Summaries (periodically generated)
- NPC Information (accumulated each turn)
- NPC Summaries (periodically generated)
When a player executes an action, a subset of the Game History is included in the LLM instruction (the selection is sketched after this list):
- Very Recent Detailed Results (e.g. Last 5)
- Recent Summarized Results (e.g. Last 20)
- All Chapter Summaries
- Recently Discovered NPC Information (e.g. Last 20)
- All NPC Summaries
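A sketch of that selection step, using illustrative field names and the example cutoffs from the list above (not JournAI's actual schema):

```typescript
// Illustrative shape for the accumulated Game History.
interface GameHistory {
  detailedResults: string[];   // accumulated each turn
  summarizedResults: string[]; // accumulated each turn
  chapterSummaries: string[];  // periodically generated
  npcInformation: string[];    // accumulated each turn
  npcSummaries: string[];      // periodically generated
}

// Select only the slices of history the LLM needs for the current turn.
function selectContext(history: GameHistory) {
  return {
    veryRecentDetailedResults: history.detailedResults.slice(-5),
    recentSummarizedResults: history.summarizedResults.slice(-20),
    allChapterSummaries: history.chapterSummaries,
    recentNpcInformation: history.npcInformation.slice(-20),
    allNpcSummaries: history.npcSummaries,
  };
}
```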
Example: Coding Assistant
AutoCoder uses this pattern in the context of a Coding Assistant, allowing the LLM to maintain context across an entire codebase with a limited input window.
%%{init: {'theme':'dark'}}%%
graph LR
UC[User Command] --> RDP
IP[Ingestion Process] -->|"Generates"| PC
subgraph PC[Project Context]
direction TB
FC[File Contents]
FS[File Summaries]
FB[File Blurbs]
DS[Directory Summaries]
FO[Feature Overviews]
FC-->|"Compressed Into"|FS-->|"Compressed Into"|FB
FB-->|"Compressed Into"|DS & FO
end
PC --> RDP
subgraph RDP[Relevancy Determination Process]
direction TB
subgraph Task1[Task 1: Determine Relevant File Blurbs]
direction LR
AFB[All File Blurbs]
AFO[All Feature Overviews]
ADS[All Directory Summaries]
UC1[User Command]
end
subgraph Task2[Task 2: Determine Relevant File Summaries]
direction LR
RFB[Relevant File Blurbs]
UC2[...]
end
subgraph Task3[Task 3: Determine Relevant File Contents]
direction LR
RFS[Relevant File Summaries]
UC3[...]
end
Task1-->|"Send Blurbs To"|Task2
Task2-->|"Send Summaries To"|Task3
end
RDP -->|"Outputs"| SC[Selected Context]
subgraph SC[Selected Context]
direction LR
AFO1[All Feature Overviews]
RF2[Relevant File Blurbs]
RMSI[Relevant File Summaries]
RMFC[Relevant File Contents]
end
SC --> LLM[LLM for Coding Assistant]
LLM -->|"Generates"| CO[Code Change Output]
The Project Context consists of:
- Full, Uncompressed File Contents
- File Summaries (generated from File Contents during ingestion)
- File Blurbs (generated from File Summaries during ingestion)
- Directory Summaries (generated from File Blurbs during ingestion)
- Feature Overviews (generated from File Blurbs during ingestion)
When users execute a command, a relevant subset of the Project Context is included in the LLM instruction, with relevancy (in relation to the User Command) determined by the LLM:
- All Feature Overviews
- Relevant File Blurbs (describing the high-level purpose of all related files)
- Relevant File Summaries (describing the interfaces for more involved files)
- Relevant File Contents (the full contents for key files)
Relevancy is determined by the following process prior to command execution (sketched after this list):
- Provide LLM with the User Command & All File Blurbs, instructing it to greedily determine which files are relevant.
- Provide LLM with the User Command & Relevant File Blurbs, asking it which files it would need to see summaries for.
- Provide LLM with the User Command & Relevant File Summaries, asking it which files it would need to see the full contents for.
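A rough sketch of that three-step narrowing process, assuming a hypothetical LLM-backed selection task that takes the command plus a set of file descriptions and returns the relevant file paths:

```typescript
// Illustrative shape for the ingested Project Context (path -> text at each level of detail).
interface ProjectContext {
  blurbs: Map<string, string>;    // high-level purpose of each file
  summaries: Map<string, string>; // interface-level summary of each file
  contents: Map<string, string>;  // full file contents
  featureOverviews: string[];
}

// Hypothetical LLM-backed task: given the command and some file descriptions,
// return the paths the LLM considers relevant.
type SelectRelevantFiles = (
  command: string,
  descriptions: Map<string, string>
) => Promise<string[]>;

function pick(source: Map<string, string>, paths: string[]): Map<string, string> {
  return new Map(paths.map((p) => [p, source.get(p) ?? ""]));
}

async function determineRelevantContext(
  selectRelevant: SelectRelevantFiles,
  project: ProjectContext,
  command: string
) {
  // Step 1: greedily pick relevant files from all the blurbs.
  const relevantBlurbs = pick(project.blurbs, await selectRelevant(command, project.blurbs));
  // Step 2: of those, pick the files whose summaries are needed.
  const relevantSummaries = pick(project.summaries, await selectRelevant(command, relevantBlurbs));
  // Step 3: of those, pick the files whose full contents are needed.
  const relevantContents = pick(project.contents, await selectRelevant(command, relevantSummaries));

  return {
    featureOverviews: project.featureOverviews,
    relevantBlurbs,
    relevantSummaries,
    relevantContents,
  };
}
```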
Pattern: JSON Schema Instructions
%%{init: {'theme':'dark'}}%%
graph LR
subgraph TI[LLM Task Instruction]
direction LR
TIJ[Task Input JSON]
TOJS[Task Output JSON Schema]
INS[Transformation Instruction]
end
Raw[Task Output JSON String]
DM[Defensive Marshaler]
TO[Task Output JSON]
TI -->|"Outputs"| Raw -->|"Sent To"| DM-->|"Retry On Failure"|TI
DM -->|"Outputs"|TO
Background
LLM stands for Large Language Model: by default, LLMs specialize in outputting plain text rather than structured data.
Problem
In a programmatic environment, especially when orchestrating the flow of information between multiple prompts, plain text is inadequate and must be marshalled into structured data.
Solution
- Provide the LLM with a JSON Schema for the desired, structured output, and instruct it to respond with JSON that fulfills the schema.
- Order the JSON properties in the schema so that they represent an ordered, logical process, with later properties dependent on the values of earlier properties.
- Defensively marshal the LLM output into JSON, accounting for type variance and incorporating retry logic on parsing failure (see the sketch below).
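A minimal sketch of a defensive marshaler in TypeScript, assuming a hypothetical `CompleteText` wrapper and a caller-supplied validator; a real implementation might delegate validation to a JSON Schema library instead:

```typescript
type CompleteText = (instruction: string) => Promise<string>;

// Defensively marshal the LLM's text output into JSON, retrying the full task on failure.
async function completeJson<T>(
  complete: CompleteText,
  instruction: string,
  validate: (value: unknown) => T, // throws if the parsed value doesn't satisfy the schema
  maxAttempts = 3
): Promise<T> {
  let lastError: unknown = new Error("No attempts made");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await complete(instruction);
    try {
      // LLMs sometimes wrap JSON in prose or code fences; extract the outermost object.
      const start = raw.indexOf("{");
      const end = raw.lastIndexOf("}");
      if (start === -1 || end === -1) throw new Error("No JSON object found in LLM output");
      const parsed: unknown = JSON.parse(raw.slice(start, end + 1));
      return validate(parsed); // Handle type variance inside the validator.
    } catch (err) {
      lastError = err; // Retry on parsing or validation failure.
    }
  }
  throw lastError;
}
```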
Example: Possibility Checker
// Task Input JSON
{
"action": "Refurbish a bathroom",
"actor": {
"name": "John Doe",
"age": 30,
"skills": ["JavaScript", "React", "Node.js"]
},
"initialConditions": ["Bathroom is delapidated and non-functional", "Plumbing shot", "Wiring degraded"]
}
// Task Output JSON Schema
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/actionPossibility.schema.json",
"title": "Action Possibility",
"description": "Whether or not an action is possible, based on the actor and initial conditions.",
"type": "object",
"properties": {
"actorFactors": {
"description": "Given the actor, what factors are at play that influence the possibility of the action?",
"type": "array",
"items": {
"type": "string"
}
},
"initialConditionFactors": {
"description": "Given the initial conditions, what factors are at play that influence the possibility of the action?",
"type": "array",
"items": {
"type": "string"
}
},
"isPossible": {
"description": "Whether or not the action is possible.",
"type": "boolean"
}
}
}
# Input
{Task Input JSON}
# Output Format JSON Schema
{Task Output JSON Schema}
# Instruction
Given the Input, output a JSON object based on the Output Format JSON Schema.
Output only valid JSON, saying nothing else.
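A small sketch of assembling that instruction programmatically from the Task Input JSON and the Task Output JSON Schema; the function name is illustrative:

```typescript
// Assemble the full instruction from the Task Input JSON and Task Output JSON Schema,
// mirroring the template shown above.
function buildSchemaInstruction(taskInput: unknown, outputSchema: object): string {
  return [
    "# Input",
    JSON.stringify(taskInput, null, 2),
    "",
    "# Output Format JSON Schema",
    JSON.stringify(outputSchema, null, 2),
    "",
    "# Instruction",
    "Given the Input, output a JSON object based on the Output Format JSON Schema.",
    "Output only valid JSON, saying nothing else.",
  ].join("\n");
}
```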
Pattern: Auto Completion Of Streaming JSON
%%{init: {'theme':'dark'}}%%
graph LR
PJS[Partial JSON String]
subgraph FJO[Fixed JSON Object]
direction LR
FP1[Completed / Fixed Property #1]
FP2[Completed / Fixed Property #2]
end
subgraph UI
UI1[First UI Element]
UI2[Second UI Element]
end
subgraph EP1[Property #1]
direction TB
V1[Value 1]
D1[Default 1]
V1-->|"If Not Found Defaults To"|D1
end
subgraph EP2[Property #2]
direction TB
V2[Value 2]
D2[Default 2]
V2-->|"If Not Found Defaults To"|D2
end
subgraph EJO[Example JSON Object]
EP1
EP2
end
subgraph PJF[Partial JSON Fixer]
EJO
end
PJS -->|"Sent To"| PJF -->|"Outputs"| FJO
FP1 -->|"Updates"| UI1
FP2 -->|"Updates"| UI2
Background
- LLMs can stream their output token by token, allowing for a more responsive user experience.
Problem
- However, streaming structured data like JSON presents challenges, as the JSON structure may not be complete until the entire response is generated.
- When streaming LLM outputs, we need a way to handle incomplete JSON structures while still providing real-time updates to the user interface.
Solution
- Implement a JSON fixing mechanism that can handle incomplete JSON structures.
- Define an example object that represents the expected JSON structure and provides default values for all properties.
- Ensure that the expected structure and example object have their properties ordered in alignment with the UI components they stream to (see the sketch below).
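A simplified sketch of a partial JSON fixer: it closes unterminated strings, arrays, and objects in the streamed fragment, then overlays whatever parses onto an example object that supplies defaults. A real implementation handles more edge cases (truncated keys, numbers, and nested defaults):

```typescript
// Close any unterminated strings, arrays, and objects in a streaming JSON fragment.
function closePartialJson(partial: string): string {
  const closers: string[] = [];
  let inString = false;
  let escaped = false;
  for (const ch of partial) {
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
    } else if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if (ch === "}" || ch === "]") closers.pop();
  }
  let fixed = partial;
  if (inString) fixed += '"';
  // Drop a dangling comma, colon, or whitespace before appending the closers.
  fixed = fixed.replace(/[,:\s]+$/, "");
  while (closers.length > 0) fixed += closers.pop();
  return fixed;
}

// Parse the fragment and overlay it onto an example object that supplies defaults
// for every property, so the UI always receives a complete object.
function fixPartialJson<T extends object>(partial: string, example: T): T {
  try {
    const parsed: unknown = JSON.parse(closePartialJson(partial));
    if (typeof parsed !== "object" || parsed === null) return example;
    return { ...example, ...(parsed as Partial<T>) };
  } catch {
    return example; // Fall back to defaults until more tokens arrive.
  }
}
```

As each chunk arrives, the accumulated partial string is run through the fixer and the completed or defaulted properties are pushed to their corresponding UI elements; because the properties are ordered to match the UI, earlier elements fill in first.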
Example
- JournAI uses this pattern to stream action results, stat changes, and character information to different sections of a UI via a Streaming JSON Fixer class.