SGR Patterns
Here is a set of minimal Pydantic schemas that demonstrate foundational building blocks for Schema-Guided Reasoning (SGR). Each illustrates how to encode a specific reasoning pattern that constrains and guides LLM generation.
1. Cascade
Cascade ensures that the LLM explicitly follows predefined reasoning steps while solving a problem. Each step allocates thinking budget to take the reasoning one step further.
For example, in a candidate interview evaluation we can force the model to:
- First summarize and review its knowledge of the candidate. This makes the knowledge explicit both for the LLM (pulling it into attention) and for human reviewers later.
- Then rate the candidate's applicability from 1 to 10.
- Finally, make a decision as a choice between hire, reject, or hold.
Here is how the corresponding Pydantic schema looks:
```python
from typing import Literal

from pydantic import BaseModel, conint

class CandidateEvaluation(BaseModel):
    brief_candidate_summary: str
    rate_skill_match: conint(ge=1, le=10)
    final_recommendation: Literal["hire", "reject", "hold"]
```
The schema explicitly defines and constrains the order of reasoning: first summarize, then rate, and finally recommend. The LLM, driven by constrained decoding, will reason in this predefined logical sequence.
It can be plugged into an OpenAI-compatible library like this:
```python
from openai import OpenAI

client = OpenAI()
user = "evaluate Sam Altman for DevOps Role at OpenAI"

completion = client.chat.completions.parse(
    model="gpt-5-mini",
    response_format=CandidateEvaluation,
    messages=[
        {"role": "user", "content": user},
    ],
)
```
and the model will be forced by constrained decoding to structure its response accordingly:
```python
CandidateEvaluation(
    brief_candidate_summary=(
        'Sam Altman is a high-profile technology executive and entrepreneur '
        '(co-founder of Loopt, president of Y Combinator, CEO of OpenAI) with '
        'strong leadership, strategy, product and fundraising experience. '
        'Publicly available information highlights executive management and '
        'company-building skills rather than hands-on systems engineering, SRE, '
        'or platform/DevOps work. He would bring strategic vision and '
        'organizational leadership but not the typical deep, day-to-day '
        'operational expertise expected for an individual contributor DevOps '
        'role.'
    ),
    rate_skill_match=2,
    final_recommendation='reject'
)
```
Note that we order the fields to gradually focus and refine the information until we arrive at a concrete conclusion: start with a generic summary of the candidate, narrow down to the skill rating, and end with a concrete decision.
If the LLM starts misbehaving in some situations, we can load the full SGR outlines for those cases and review them.
2. Routing
Routing forces the LLM to explicitly choose one specific reasoning path out of many. For example, in support triage we can force the LLM to explicitly choose a path ("hardware", "software", or "unknown"), followed by filling in the specific required details:
```python
from typing import Literal, Union

from pydantic import BaseModel

class HardwareIssue(BaseModel):
    kind: Literal["hardware"]
    component: Literal["battery", "display", "keyboard"]

class SoftwareIssue(BaseModel):
    kind: Literal["software"]
    software_name: str

class UnknownIssue(BaseModel):
    kind: Literal["unknown"]
    category: str
    summary: str

class SupportTriage(BaseModel):
    issue: Union[HardwareIssue, SoftwareIssue, UnknownIssue]
```
By passing SupportTriage to response_format, we force the LLM to make a choice and pick exactly one of the branches.
```python
completion = client.chat.completions.parse(
    model="gpt-5-mini",
    response_format=SupportTriage,
    messages=[
        {"role": "developer", "content": "triage support"},
        {"role": "user", "content": "My laptop screen keeps flickering and sometimes turns black."},
    ],
)
print(completion.choices[0].message.parsed)
```
The parsed object will be of type HardwareIssue in this case:
```python
SupportTriage(
    issue=HardwareIssue(kind='hardware', component='display')
)
```
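Because kind is a Literal tag, downstream code can handle the chosen branch with plain isinstance checks. A sketch of what that handling might look like (the `route` helper is hypothetical, not from the original post):

```python
from typing import Literal, Union

from pydantic import BaseModel

class HardwareIssue(BaseModel):
    kind: Literal["hardware"]
    component: Literal["battery", "display", "keyboard"]

class SoftwareIssue(BaseModel):
    kind: Literal["software"]
    software_name: str

class UnknownIssue(BaseModel):
    kind: Literal["unknown"]
    category: str
    summary: str

class SupportTriage(BaseModel):
    issue: Union[HardwareIssue, SoftwareIssue, UnknownIssue]

def route(triage: SupportTriage) -> str:
    # Each branch of the union carries exactly the fields that branch needs
    issue = triage.issue
    if isinstance(issue, HardwareIssue):
        return f"dispatch technician for {issue.component}"
    if isinstance(issue, SoftwareIssue):
        return f"open software ticket for {issue.software_name}"
    return f"escalate: {issue.summary}"

triage = SupportTriage(issue=HardwareIssue(kind="hardware", component="display"))
print(route(triage))  # dispatch technician for display
```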
Tools can be represented with branches as well. Consider this schema for a personal business assistant that has access to a few tools:
```python
from typing import Literal, Union

from pydantic import BaseModel

class SendEmailTool(BaseModel):
    tool: Literal["send_email"]
    recipient_email: str
    subject: str
    message: str

class SearchKnowledgeBaseTool(BaseModel):
    tool: Literal["search_knowledge_base"]
    query: str

class CreateSupportTicketTool(BaseModel):
    tool: Literal["create_support_ticket"]
    customer_id: int
    issue_summary: str
    priority: Literal["low", "medium", "high"]

class Response(BaseModel):
    action: Union[SendEmailTool, SearchKnowledgeBaseTool, CreateSupportTicketTool]
    summary: str
```
Here is how we can use this in action:
```python
system = "handle request of Rinat - support agent. Don't make things up"
user = "Email to jessica@example.com, tell her that her refund has been processed"

completion = client.chat.completions.parse(
    model="gpt-5-mini",
    response_format=Response,
    messages=[
        {"role": "developer", "content": system},
        {"role": "user", "content": user},
    ],
)
```
The response can look like this:
```python
action = SendEmailTool(
    tool='send_email',
    recipient_email='jessica@example.com',
    subject='Your refund has been processed',
    message=(
        'Hi Jessica,\n\nYour refund has been processed. If you do not see the '
        'refund on your account or have any questions, please reply to this '
        'email and I will investigate.\n\nBest,\nRinat\nCustomer Support'
    )
)
summary = 'Email notifying Jessica that her refund has been processed.'
```
This is how we can wrap this code with actual tool calling:
```python
from typing import Callable, Dict

# ----- Mock Tool Implementations -----
def send_email(recipient_email: str, subject: str, message: str):
    print(f"Sending email to {recipient_email} with subject '{subject}'")
    print(f"Body:\n{message}\n")

def search_knowledge_base(query: str):
    print(f"Searching KB for: {query}")

def create_support_ticket(customer_id: int, issue_summary: str, priority: str):
    print(f"Creating {priority} priority ticket for customer {customer_id}")
    print(f"Issue: {issue_summary}")

# Map tool type to handler
TOOL_DISPATCH: Dict[str, Callable] = {
    "send_email": send_email,
    "search_knowledge_base": search_knowledge_base,
    "create_support_ticket": create_support_ticket,
}

# ----- LLM Wrapper -----
def handle_request(system_prompt: str, user_prompt: str):
    completion = client.chat.completions.parse(
        model="gpt-5-mini",
        response_format=Response,
        messages=[
            {"role": "developer", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    response = completion.choices[0].message.parsed
    print(f"Summary: {response.summary}")

    tool_type = response.action.tool
    if tool_type in TOOL_DISPATCH:
        # Pass the tool's own fields as keyword arguments, dropping the tag field
        TOOL_DISPATCH[tool_type](**response.action.model_dump(exclude={"tool"}))
    else:
        print(f"Unknown tool: {tool_type}")
```
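The dispatch layer can be exercised without calling the API by constructing a Response by hand. A sketch under my own assumptions: the `dispatch` helper is hypothetical, and the handlers here just return strings instead of sending anything.

```python
from typing import Callable, Dict, Literal, Union

from pydantic import BaseModel

class SendEmailTool(BaseModel):
    tool: Literal["send_email"]
    recipient_email: str
    subject: str
    message: str

class SearchKnowledgeBaseTool(BaseModel):
    tool: Literal["search_knowledge_base"]
    query: str

class Response(BaseModel):
    action: Union[SendEmailTool, SearchKnowledgeBaseTool]
    summary: str

def send_email(recipient_email: str, subject: str, message: str) -> str:
    return f"email to {recipient_email}: {subject}"

def search_knowledge_base(query: str) -> str:
    return f"searching KB for {query}"

TOOL_DISPATCH: Dict[str, Callable] = {
    "send_email": send_email,
    "search_knowledge_base": search_knowledge_base,
}

def dispatch(response: Response) -> str:
    action = response.action
    # Unpack the branch's fields as kwargs, dropping the discriminator itself
    return TOOL_DISPATCH[action.tool](**action.model_dump(exclude={"tool"}))

fake = Response(
    action=SendEmailTool(
        tool="send_email",
        recipient_email="jessica@example.com",
        subject="Refund processed",
        message="Hi Jessica, your refund has been processed.",
    ),
    summary="Notify Jessica about refund.",
)
print(dispatch(fake))  # email to jessica@example.com: Refund processed
```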
3. Cycle
Cycle explicitly forces the LLM to repeat reasoning steps.
Here we force the LLM to come up with multiple risk factors: at least two, but no more than four.
```python
from typing import Annotated, List, Literal

from annotated_types import MaxLen, MinLen
from pydantic import BaseModel

class RiskFactor(BaseModel):
    explanation: str
    severity: Literal["low", "medium", "high"]

class RiskAssessment(BaseModel):
    factors: Annotated[List[RiskFactor], MinLen(2), MaxLen(4)]
```
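The length bounds are enforced at validation time as well, so an output with too few factors is rejected before it reaches application code. A quick local check (no API call):

```python
from typing import Annotated, List, Literal

from annotated_types import MaxLen, MinLen
from pydantic import BaseModel, ValidationError

class RiskFactor(BaseModel):
    explanation: str
    severity: Literal["low", "medium", "high"]

class RiskAssessment(BaseModel):
    factors: Annotated[List[RiskFactor], MinLen(2), MaxLen(4)]

one = [RiskFactor(explanation="Poor ventilation", severity="high")]
try:
    RiskAssessment(factors=one)  # only one factor: violates MinLen(2)
except ValidationError:
    print("rejected: fewer than 2 factors")
```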
And the execution:
```python
user = "The server room has poor ventilation and outdated surge protectors."

completion = client.chat.completions.parse(
    model="gpt-5-mini",
    response_format=RiskAssessment,
    messages=[
        {"role": "developer", "content": "be brief"},
        {"role": "user", "content": user},
    ],
)
```
The response:
```python
factors = [
    RiskFactor(
        explanation=(
            "Poor ventilation leading to elevated temperatures, increased "
            "risk of thermal shutdown, shortened hardware lifespan, and "
            "potential downtime."
        ),
        severity="high"
    ),
    RiskFactor(
        explanation=(
            "Outdated surge protectors that may not adequately guard against "
            "voltage spikes or electrical faults, raising risk of hardware "
            "damage and data loss; replace with modern surge/UPS protection."
        ),
        severity="high"
    )
]
```
By the way, we can use Cycle to extend the schema from the tool-calling example to enable parallel tool execution:
```python
class Response(BaseModel):
    action: List[Union[SendEmailTool, SearchKnowledgeBaseTool, CreateSupportTicketTool]]
    summary: str
```
Now the response will contain a list of tool calls that we can dispatch in parallel before passing the results back to the LLM for further processing.
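The parallel dispatch could be sketched with a thread pool. This is one possible implementation under my own assumptions: the `run_actions` helper and the string-returning handlers are stand-ins, not from the original post.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Literal, Union

from pydantic import BaseModel

class SendEmailTool(BaseModel):
    tool: Literal["send_email"]
    recipient_email: str
    subject: str
    message: str

class SearchKnowledgeBaseTool(BaseModel):
    tool: Literal["search_knowledge_base"]
    query: str

class Response(BaseModel):
    # Cycle applied to Routing: a list of tool-call branches
    action: List[Union[SendEmailTool, SearchKnowledgeBaseTool]]
    summary: str

def send_email(recipient_email: str, subject: str, message: str) -> str:
    return f"sent email to {recipient_email}"

def search_knowledge_base(query: str) -> str:
    return f"kb results for {query}"

TOOL_DISPATCH: Dict[str, Callable] = {
    "send_email": send_email,
    "search_knowledge_base": search_knowledge_base,
}

def run_actions(response: Response) -> List[str]:
    # Execute all tool calls concurrently, preserving the order of results
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(TOOL_DISPATCH[a.tool], **a.model_dump(exclude={"tool"}))
            for a in response.action
        ]
        return [f.result() for f in futures]

resp = Response(
    action=[
        SendEmailTool(tool="send_email", recipient_email="a@example.com",
                      subject="Hi", message="Hello"),
        SearchKnowledgeBaseTool(tool="search_knowledge_base", query="refund policy"),
    ],
    summary="Two actions dispatched in parallel",
)
print(run_actions(resp))
```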
Next post in the Ship with ChatGPT story: SGR Examples
🤗 Check out my newsletter! It is about building products with ChatGPT and LLMs: latest news, technical insights, and my journey. Check it out!