LLM Patterns
Updated 8/5/2024
A collection of patterns for working effectively with Large Language Models (LLMs) in more complex applications. These patterns are relevant to developers and engineers building advanced LLM applications, especially those dealing with complex tasks, large datasets, or real-time interactions.
- Pattern: Prompt Orchestration
- Pattern: Hierarchical Context Compression
- Pattern: JSON Schema Instructions
- Pattern: Auto Completion Of Streaming JSON
See AWS Lambda Proxy For AI Services for AWS lambda code that proxies calls to OpenAI, AWS Bedrock, and AWS Polly, exposing OpenAI-style models, completions, transcriptions, and speech endpoints.
Pattern: Prompt Orchestration
```mermaid
%%{init: {'theme':'dark'}}%%
graph LR
    subgraph Before
        A[Complex One-Shot Task] --> B[LLM]
        B -->|Failure| C((❌))
    end
    subgraph After
        D[Complex Task] --> E{Decomposition}
        E --> F[Simple Task 1]
        E --> G[Simple Task 2]
        E --> H[Simple Task n]
        F & G & H --> I
        subgraph I[Orchestrator]
            J[Task 1 + LLM] --> K[Task 2 + LLM]
            K --> L[Task n + LLM]
        end
        I -->|Success| M((✓))
    end
    Before ~~~ After
```
Problem
- LLMs are very limited: they can reliably perform small, simple reasoning tasks.
- But they struggle with large, complex tasks.
Solution
- Break big, complex tasks down into an orchestrated set of small tasks.
- Identify limitations of the LLM by writing a naive, one-shot instruction for a complex task.
- Allow failure cases to inform the breakdown of the complex task into a series of simple tasks.
- Arrange the simple tasks in sequence with traditional code, managing the flow of information between them.
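The steps above can be sketched in a few lines of Python. This is a minimal illustration of the pattern, not a real implementation: `call_llm` is a hypothetical stand-in for an actual model client, and the steps are arbitrary examples.

```python
# Minimal sketch of prompt orchestration: each step wraps one small LLM task,
# and plain Python code routes the output of one step into the next prompt.

def call_llm(prompt: str) -> str:
    # Placeholder: a real system would call OpenAI, Bedrock, etc. here.
    return f"[LLM output for: {prompt[:30]}...]"

def run_pipeline(user_input: str, steps: list) -> str:
    """Run simple tasks in sequence, threading each result into the next prompt."""
    result = user_input
    for make_prompt in steps:
        result = call_llm(make_prompt(result))
    return result

# Three simple tasks that together replace one complex one-shot instruction.
steps = [
    lambda text: f"Extract the key facts from: {text}",
    lambda facts: f"Draft a summary using only these facts: {facts}",
    lambda draft: f"Polish this draft for clarity: {draft}",
]

final = run_pipeline("raw source text", steps)
```

The traditional code between tasks is where the orchestration value lives: it can validate, branch, retry, or transform between any two steps.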
Limitations
- With enough elbow grease applied to decomposing complex tasks into series of simple tasks, LLMs can do anything.
- But just because they can doesn't mean they should.
- Breaking down big, complex tasks in a way that respects LLM limitations is a lot of work.
- And the final result may be too costly and/or slow to be worth the effort.
Example: Complex Game Logic
Liminai uses this pattern extensively in the context of handling complex, branching game logic.
```mermaid
%%{init: {'theme':'dark'}}%%
flowchart LR
    UserAction
    PossibilityChecker["Task: Determine Possibility"]
    isPossible{"Is Possible?"}
    failurePoemWriter["Task: Generate Failure Poem"]
    onDeath["Task: Handle Death"]
    CommentaryPoemGenerator["Task: Generate Commentary Poem"]
    AnnounceWinner["Announce Winner"]
    MovementChecker["Task: Determine Movement"]
    isMovement{"Is Movement?"}
    WinChecker["Task: Check if Win Conditions Met"]
    isDead{"Is Dead?"}
    hasWon{"Has Won?"}
    subgraph resultsHandler["Tasks For Handling Results"]
        direction LR
        ActionResultGenerator["Task: Generate Action Result"]
        HealthAndInventoryUpdater["Task: Update Health and Inventory"]
        DeathChecker["Task: Check if Dead"]
    end
    subgraph locationHandler["Tasks For Handling Locations"]
        direction LR
        LocationGenerator["Task: Generate Location"]
        ImageTagGenerator["Task: Generate Location Image Tags"]
        ImageDiffusionService["Task: Generate Image"]
    end
    UserAction-->PossibilityChecker-->isPossible
    isPossible--"No"-->failurePoemWriter-->end1((End))
    isPossible--"Yes"-->MovementChecker
    MovementChecker-->isMovement
    isMovement--"Yes"-->locationHandler-->end2((End))
    isMovement--"No"-->resultsHandler-->isDead
    isDead--"Yes"-->onDeath-->end3((End))
    isDead--"No"-->WinChecker
    WinChecker-->hasWon
    hasWon--"Yes"-->AnnounceWinner-->end4((End))
    hasWon--"No"-->CommentaryPoemGenerator-->end5((End))
```
This example is oversimplified for brevity and for the sake of illustrating the pattern.
- To simulate real-world constraints, a task checks whether or not the User Action is possible given current conditions.
- To enable player exploration, a set of tasks checks whether or not the User Action involves movement to a new location and generates that location and an accompanying image.
- To enable interactivity between the player and environment, a set of tasks generates the result of the action, updates health and inventory, and checks whether or not the player has died.
- To enable a dynamic win condition, a task checks whether or not the result of the action fulfills a loosely defined win condition.
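The branching structure in the flowchart above reduces to ordinary control flow, with each decision point backed by a small LLM task. A rough sketch, with a stubbed LLM in place of real calls (all prompts and helper names are illustrative):

```python
# Sketch of branching game-turn orchestration: each yes/no check is one small
# LLM task, and plain Python handles the branches between them.

def orchestrate_turn(action, llm):
    if llm(f"Is this action possible? {action}") != "yes":
        return llm(f"Write a failure poem for: {action}")
    if llm(f"Does this action move the player? {action}") == "yes":
        return llm(f"Generate a new location for: {action}")
    result = llm(f"Generate the result of: {action}")
    if llm(f"Did the player die? {result}") == "yes":
        return "death"
    if llm(f"Are win conditions met? {result}") == "yes":
        return "winner"
    return llm(f"Write a commentary poem on: {result}")

# Stub LLM factory: canned answers for yes/no checks, echoes everything else.
def make_stub(answers):
    def llm(prompt):
        for key, ans in answers.items():
            if prompt.startswith(key):
                return ans
        return f"[generated: {prompt}]"
    return llm

turn = orchestrate_turn("open the door", make_stub({
    "Is this action possible": "yes",
    "Does this action move": "no",
    "Did the player die": "no",
    "Are win conditions": "no",
}))
```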
Pattern: Hierarchical Context Compression
```mermaid
%%{init: {'theme':'dark'}}%%
graph LR
    CS[Context Selector]
    TI[Task Instruction]
    Compressor -->|"Used In"| HCB -->|"Yields"| HC
    TI-->|"Sent To"| LLM
    User["fa:fa-user User Action Input"] -->|"Sent To"| TI
    HC -->|"Filtered By"| CS -->|"Sends Context To"| TI
    subgraph Compressor
        direction TB
        I(Info Input) --> |"Embedded In"| CI(Compression Instruction) -->|"Given To"| CLLM[Compression LLM] -->|"Responds With"| CO(Compressed Output)
    end
    subgraph HCB[Hierarchical Context Builder]
        D1(Detailed Info 1)
        D2(Detailed Info 2)
        SC[[Summary Compressor]]
        S1("Summary Of Info 1")
        S2("Summary Of Info 2")
        MSC[[Meta Summary Compressor]]
        MS("Summary Of Summaries 1 & 2")
        D1 & D2 -.->|"Input To"| SC -.->|"Outputs"| S1 & S2
        S1 & S2 -.->|"Input To"| MSC -.->|"Outputs"| MS
    end
    subgraph HC[Hierarchical Context]
        direction LR
        D(Detailed Info)
        S(Summary Info)
        MSS(Meta Summary Info)
    end
```
Background
- LLMs have a context window that represents the amount of information they can see and reason about at once.
- For modern LLMs, this context window can be big: big enough to store the full text of a large book.
Problem
- A large context window does not equate to an ability to effectively reason about that context.
- A large context window filled with information irrelevant to the current task results in worse outcomes.
- Filling a large context window increases LLM response latency and dramatically increases cost.
Solution
- Send only content that is strictly necessary to perform a task.
- Store information at multiple levels of detail, from comprehensive to highly condensed.
- Progressively compress older or accumulated information into more concise forms.
- Provide the LLM with the most appropriate level of detail for each task, favoring compressed versions when possible.
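The multi-level storage described above can be sketched as follows. This is a minimal illustration: `compress` merely truncates, standing in for a real summarization call to a compression LLM, and the structure names are assumptions.

```python
# Sketch of a hierarchical context builder. Each level is produced by
# compressing the level below; a real system would prompt an LLM to
# summarize, rather than truncate.

def compress(text: str, limit: int) -> str:
    # Stand-in for an LLM summarization call that condenses `text`
    # to roughly `limit` characters.
    return text[:limit]

def build_hierarchy(detailed_items: list) -> dict:
    summaries = [compress(item, 40) for item in detailed_items]
    meta_summary = compress(" ".join(summaries), 60)
    return {
        "detailed": detailed_items,  # full fidelity
        "summaries": summaries,      # one compressed entry per item
        "meta": meta_summary,        # summary of summaries
    }

ctx = build_hierarchy([
    "A long detailed record of turn one...",
    "A long detailed record of turn two...",
])
```

A context selector can then pick the cheapest level that still supports the task at hand.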
Example: Text Adventure Game Turns
JournAI uses this pattern in the context of a text adventure game, allowing the LLM to maintain context across the entire game with a limited input window.
```mermaid
%%{init: {'theme':'dark'}}%%
graph LR
    subgraph GH["Game History"]
        DR[Detailed Results]
        SR[Summarized Results]
        CS[Chapter Summaries]
        NPCI[NPC Information]
        NPCS[NPC Summaries]
    end
    subgraph CSFL["Context Selection"]
        direction LR
        VRDR[Very Recent Detailed Results]
        RSR[Recent Summarized Results]
        ACS[All Chapter Summaries]
        RDNPCI[Recently Discovered NPC Information]
        ANPCS[All NPC Summaries]
    end
    PA[Player Action] --> |Triggers| CSFL
    CSFL-->|"Sent To"| LLM[LLM for Game Logic]
    CSFL-->|"Fetches From"|GH
    LLM --> |Generates| NGO[New Game Output]
    NGO --> |Updates| DR
    NGO --> |Updates| SR
    NGO --> |May Update| NPCI
    SR --> |Periodically Summarized| CS
    NPCI --> |Periodically Summarized| NPCS
```
The Game History is composed of:
- Detailed Results (accumulated each turn)
- Summarized Results (accumulated each turn)
- Chapter Summaries (periodically generated)
- NPC Information (accumulated each turn)
- NPC Summaries (periodically generated)
When players execute an action, a subset of the Game History is included in the LLM instruction.
- Very Recent Detailed Results (e.g. Last 5)
- Recent Summarized Results (e.g. Last 20)
- All Chapter Summaries
- Recently Discovered NPC Information (e.g. Last 20)
- All NPC Summaries
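The context selection above amounts to slicing the history: recent slices of the detailed layers, everything from the heavily compressed ones. A sketch (field names mirror the lists above but are otherwise illustrative):

```python
# Sketch of context selection for a game turn: keep only recent slices of
# the detailed history, but all of the compressed layers.

def select_context(history: dict) -> dict:
    return {
        "detailed_results": history["detailed_results"][-5:],      # last 5
        "summarized_results": history["summarized_results"][-20:],  # last 20
        "chapter_summaries": history["chapter_summaries"],          # all
        "npc_information": history["npc_information"][-20:],        # last 20
        "npc_summaries": history["npc_summaries"],                  # all
    }

history = {
    "detailed_results": [f"turn {i}" for i in range(100)],
    "summarized_results": [f"summary {i}" for i in range(100)],
    "chapter_summaries": ["chapter 1", "chapter 2"],
    "npc_information": [f"npc fact {i}" for i in range(30)],
    "npc_summaries": ["npc A", "npc B"],
}
ctx = select_context(history)
```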
Example: Coding Assistant
AutoCoder uses this pattern in the context of a Coding Assistant, allowing the LLM to maintain context across an entire codebase with a limited input window.
```mermaid
%%{init: {'theme':'dark'}}%%
graph LR
    UC[User Command] --> RDP
    IP[Ingestion Process] -->|"Generates"| PC
    subgraph PC[Project Context]
        direction TB
        FC[File Contents]
        FS[File Summaries]
        FB[File Blurbs]
        DS[Directory Summaries]
        FO[Feature Overviews]
        FC-->|"Compressed Into"|FS-->|"Compressed Into"|FB
        FB-->|"Compressed Into"|DS & FO
    end
    PC --> RDP
    subgraph RDP[Relevancy Determination Process]
        direction TB
        subgraph Task1[Task 1: Determine Relevant File Blurbs]
            direction LR
            AFB[All File Blurbs]
            AFO[All Feature Overviews]
            ADS[All Directory Summaries]
            UC1[User Command]
        end
        subgraph Task2[Task 2: Determine Relevant File Summaries]
            direction LR
            RFB[Relevant File Blurbs]
            UC2[...]
        end
        subgraph Task3[Task 3: Determine Relevant File Contents]
            direction LR
            RFS[Relevant File Summaries]
            UC3[...]
        end
        Task1-->|"Send Blurbs To"|Task2
        Task2-->|"Send Summaries To"|Task3
    end
    RDP -->|"Outputs"| SC[Selected Context]
    subgraph SC[Selected Context]
        direction LR
        AFO1[All Feature Overviews]
        RF2[Relevant File Blurbs]
        RMSI[Relevant File Summaries]
        RMFC[Relevant File Contents]
    end
    SC --> LLM[LLM for Coding Assistant]
    LLM -->|"Generates"| CO[Code Change Output]
```
The Project Context consists of:
- Full, Uncompressed File Contents
- File Summaries (generated from File Contents during ingestion)
- File Blurbs (generated from File Summaries during ingestion)
- Directory Summaries (generated from File Blurbs during ingestion)
- Feature Overviews (generated from File Blurbs during ingestion)
When users execute a command, a relevant subset of the Project Context is included in the LLM instruction, with relevancy (in relation to the User Command) determined by the LLM.
- All Feature Overviews
- Relevant File Blurbs (describing the high-level purpose of all related files)
- Relevant File Summaries (describing the interfaces for more involved files)
- Relevant File Contents (the full contents for key files)
Relevancy is determined by the following process prior to command execution:
- Provide LLM with the User Command & All File Blurbs, instructing it to greedily determine which files are relevant.
- Provide LLM with the User Command & Relevant File Blurbs, asking it which files it would need to see summaries for.
- Provide LLM with the User Command & Relevant File Summaries, asking it which files it would need to see the full contents for.
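The three-stage funnel above can be sketched as successive filtering passes. `ask_llm_for_relevant` is a hypothetical call that would prompt the model with the command plus candidate descriptions and parse back a file list; here it is stubbed with keyword matching, and all file names are invented for illustration.

```python
# Sketch of the three-stage relevancy funnel: blurbs -> summaries -> contents,
# each pass narrowing the candidate set before the next, more expensive pass.

def ask_llm_for_relevant(command: str, candidates: dict) -> list:
    # Stub: a real implementation would send `command` and `candidates`
    # to the LLM and parse the returned list of file paths.
    return [path for path, desc in candidates.items()
            if any(word in desc for word in command.split())]

def determine_relevancy(command, blurbs, summaries, contents):
    relevant_blurbs = ask_llm_for_relevant(command, blurbs)          # step 1
    relevant_summaries = ask_llm_for_relevant(
        command, {p: summaries[p] for p in relevant_blurbs})         # step 2
    relevant_files = ask_llm_for_relevant(
        command, {p: contents[p] for p in relevant_summaries})       # step 3
    return relevant_blurbs, relevant_summaries, relevant_files

blurbs = {"auth.py": "handles login", "db.py": "database access"}
summaries = {"auth.py": "login and session functions", "db.py": "query helpers"}
contents = {"auth.py": "def login(user): ...", "db.py": "def query(sql): ..."}
b, s, f = determine_relevancy("fix login bug", blurbs, summaries, contents)
```

Each pass trades one cheap LLM call for a smaller, more focused context in the next.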
Pattern: JSON Schema Instructions
```mermaid
%%{init: {'theme':'dark'}}%%
graph LR
    subgraph TI[LLM Task Instruction]
        direction LR
        TIJ[Task Input JSON]
        TOJS[Task Output JSON Schema]
        INS[Transformation Instruction]
    end
    Raw[Task Output JSON String]
    DM[Defensive Marshaler]
    TO[Task Output JSON]
    TI -->|"Outputs"| Raw -->|"Sent To"| DM-->|"Retry On Failure"|TI
    DM -->|"Outputs"|TO
```
Background
LLM stands for Large Language Model; as language models, LLMs specialize in outputting plain text.
Problem
In a programmatic environment, especially when orchestrating the flow of information between multiple prompts, plain text is inadequate and must be marshalled into structured data.
Solution
- Provide the LLM with JSON Schema for the desired, structured output, and instruct it to respond with JSON that fulfills the schema.
- Order JSON properties in the Schema such that they represent an ordered, logical process, with later properties dependent on the value of earlier properties.
- Defensively marshal the LLM output into JSON, accounting for type variance and incorporating retry logic on parsing failure.
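A minimal sketch of the defensive marshaler with retry. `call_llm` is a hypothetical model call; the markdown-fence stripping handles one common failure mode, and a real implementation would add more (type coercion, schema validation, etc.).

```python
import json

# Sketch of defensive marshaling: strip common wrappers from the raw LLM
# output, parse it, and re-issue the instruction when parsing fails.

def defensively_marshal(call_llm, instruction: str, max_retries: int = 3) -> dict:
    last_error = None
    for _ in range(max_retries):
        raw = call_llm(instruction)
        cleaned = raw.strip()
        if cleaned.startswith("```"):
            # Drop a ```json ... ``` wrapper the model sometimes adds.
            cleaned = cleaned.strip("`").removeprefix("json").strip()
        try:
            return json.loads(cleaned)
        except json.JSONDecodeError as err:
            last_error = err  # fall through and retry
    raise ValueError(f"LLM never produced valid JSON: {last_error}")

# Stub that fails once, then returns fenced JSON, to exercise the retry path.
responses = iter(['not json at all', '```json\n{"isPossible": false}\n```'])
result = defensively_marshal(lambda _: next(responses), "instruction...")
```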
Example: Possibility Checker
Task Input JSON:
```json
{
  "action": "Refurbish a bathroom",
  "actor": {
    "name": "John Doe",
    "age": 30,
    "skills": ["JavaScript", "React", "Node.js"]
  },
  "initialConditions": [
    "Bathroom is dilapidated and non-functional",
    "Plumbing shot",
    "Wiring degraded"
  ]
}
```
Task Output JSON Schema:
```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/actionPossibility.schema.json",
  "title": "Action Possibility",
  "description": "Whether or not an action is possible, based on the actor and initial conditions.",
  "type": "object",
  "properties": {
    "actorFactors": {
      "description": "Given the actor, what factors are at play that influence the possibility of the action?",
      "type": "array",
      "items": { "type": "string" }
    },
    "initialConditionFactors": {
      "description": "Given the initial conditions, what factors are at play that influence the possibility of the action?",
      "type": "array",
      "items": { "type": "string" }
    },
    "isPossible": {
      "description": "Whether or not the action is possible.",
      "type": "boolean"
    }
  }
}
```
Task Instruction:
```
# Input
{Task Input JSON}

# Output Format JSON Schema
{Task Output JSON Schema}

# Instruction
Given the Input, output a JSON object based on the Output Format JSON Schema.
Output only valid JSON, saying nothing else.
```
Pattern: Auto Completion Of Streaming JSON
```mermaid
%%{init: {'theme':'dark'}}%%
graph LR
    PJS[Partial JSON String]
    subgraph FJO[Fixed JSON Object]
        direction LR
        FP1[Completed / Fixed Property #1]
        FP2[Completed / Fixed Property #2]
    end
    subgraph UI
        UI1[First UI Element]
        UI2[Second UI Element]
    end
    subgraph EP1[Property #1]
        direction TB
        V1[Value 1]
        D1[Default 1]
        V1-->|"If Not Found Defaults To"|D1
    end
    subgraph EP2[Property #2]
        direction TB
        V2[Value 2]
        D2[Default 2]
        V2-->|"If Not Found Defaults To"|D2
    end
    subgraph EJO[Example JSON Object]
        EP1
        EP2
    end
    subgraph PJF[Partial JSON Fixer]
        EJO
    end
    PJS -->|"Sent To"| PJF -->|"Outputs"| FJO
    FP1 -->|"Updates"| UI1
    FP2 -->|"Updates"| UI2
```
Background
- LLMs can stream their output token by token, allowing for a more responsive user experience.
Problem
- Streaming structured data like JSON presents challenges: the JSON structure may not be complete (and therefore not parseable) until the entire response is generated.
- When streaming LLM outputs, we need a way to handle incomplete JSON structures while still providing real-time updates to the user interface.
Solution
- Implement a JSON fixing mechanism that can handle incomplete JSON structures.
- Define an example object that represents the expected JSON structure and provides default values for all properties.
- Ensure that the expected structure and example object have their properties ordered in alignment with the UI components they stream to.
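A minimal sketch of such a fixer, under the assumption that closing any unterminated strings, brackets, and braces (truncating the fragment until it parses) is acceptable for display purposes. The example object and property names are illustrative.

```python
import json

def close_open_tokens(fragment: str) -> str:
    """Append the quote, brackets, and braces needed to close `fragment`."""
    closers = []
    in_string = escaped = False
    for ch in fragment:
        if escaped:
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch == '"':
            in_string = not in_string
        elif not in_string:
            if ch == "{":
                closers.append("}")
            elif ch == "[":
                closers.append("]")
            elif ch in "}]" and closers:
                closers.pop()
    return fragment + ('"' if in_string else "") + "".join(reversed(closers))

def fix_partial_json(fragment: str, example: dict) -> dict:
    # Chop characters off the end until the auto-closed remainder parses,
    # then fill missing properties with defaults from the example object.
    parsed = {}
    for end in range(len(fragment), 0, -1):
        try:
            parsed = json.loads(close_open_tokens(fragment[:end]))
            break
        except json.JSONDecodeError:
            continue
    if not isinstance(parsed, dict):
        parsed = {}
    return {key: parsed.get(key, default) for key, default in example.items()}

# Example object: expected structure plus a default for every property.
example = {"result": "", "statChanges": [], "characterInfo": ""}
partial = '{"result": "You swing the sword", "statChanges": [{"hp": -'
fixed = fix_partial_json(partial, example)
```

Because the fixed object always contains every property, each UI element can re-render on every streamed chunk without special-casing missing fields.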
Example
- JournAI uses this pattern to stream action results, stat changes, and character information to different sections of a UI via a Streaming JSON Fixer class.