Constrained Decoding in LiteRT-LM
LiteRT-LM supports constrained decoding, allowing you to enforce specific structures on the model's output. This is particularly useful for tasks like:
- Function Calling: Ensuring the model outputs a valid function call matching a specific schema.
- Structured Data Extraction: Forcing the model to adhere to a specific format (e.g., a regex pattern or a JSON schema).
- Grammar Enforcement: Using context-free grammars (via Lark) to guide generation.
This document explains how to enable, configure, and use constrained decoding in your application.
Enabling Constrained Decoding
To use constrained decoding, you must enable it in the ConversationConfig when
creating your Conversation instance.
#include "runtime/conversation/conversation.h"
// ...
ConversationConfig::Builder builder;
builder.SetEnableConstrainedDecoding(true);
// Set a ConstraintProviderConfig in the ConversationConfig::Builder.
// This line sets the constraint provider to LLGuidance with default settings.
builder.SetConstraintProviderConfig(LlGuidanceConfig());
auto config = builder.Build(*engine);
Constraint Providers
LiteRT-LM supports different backends for constrained decoding, configured via
ConstraintProviderConfig:
- LLGuidance (LlGuidanceConfig): Uses the LLGuidance library. Supports Regex, JSON Schema, and Lark grammars.
- External (ExternalConstraintConfig): Allows passing a pre-constructed Constraint object per request. Useful for custom C++ constraint implementations.
Using Constraints in SendMessage
Once enabled, you can apply constraints to individual messages using the
decoding_constraint field in the OptionalArgs struct passed to SendMessage
or SendMessageAsync. This field is of type std::optional<ConstraintArg>.
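To make the shape of this field concrete, here is a self-contained sketch that models an optional variant argument with stand-in types. SketchLlgArg, SketchExternalArg, SketchOptionalArgs, and DescribeConstraint are hypothetical names invented for this illustration and are not part of LiteRT-LM. The sketch also assumes that leaving the optional empty means the message is sent unconstrained.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <variant>

// Hypothetical stand-ins for illustration; these are NOT the LiteRT-LM types.
struct SketchLlgArg {
  std::string constraint_string;
};
struct SketchExternalArg {
  int constraint_id = 0;
};
using SketchConstraintArg = std::variant<SketchLlgArg, SketchExternalArg>;

struct SketchOptionalArgs {
  // Assumption: an empty optional means "no constraint for this message".
  std::optional<SketchConstraintArg> decoding_constraint;
};

// Reports which kind of constraint, if any, a request carries.
std::string DescribeConstraint(const SketchOptionalArgs& args) {
  if (!args.decoding_constraint.has_value()) return "unconstrained";
  if (std::holds_alternative<SketchLlgArg>(*args.decoding_constraint)) {
    return "llguidance";
  }
  return "external";
}

// Small usage example.
void DescribeConstraintExample() {
  SketchOptionalArgs plain;
  assert(DescribeConstraint(plain) == "unconstrained");

  SketchOptionalArgs constrained;
  constrained.decoding_constraint = SketchLlgArg{"a+b+"};
  assert(DescribeConstraint(constrained) == "llguidance");
}
```

Because the field is optional and set per call, constrained and unconstrained messages can be mixed within the same conversation.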
1. LLGuidance Constraints
LLGuidance constraints can be specified as Regex, JSON Schema, or Lark grammars.
Regex Constraint
Constrain the output to match a regular expression.
#include "runtime/components/constrained_decoding/llg_constraint_config.h"
// ...
LlGuidanceConstraintArg constraint_arg;
constraint_arg.constraint_type = LlgConstraintType::kRegex;
// Example: Force output to be one or more 'a's followed by one or more 'b's
constraint_arg.constraint_string = "a+b+";
auto response = conversation->SendMessage(
user_message,
{.decoding_constraint = constraint_arg}
);
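When testing a regex constraint, it can help to check offline which strings the pattern accepts. The sketch below uses std::regex from the C++ standard library for that. It assumes the constraint applies to the whole response, and MatchesWholeOutput is a hypothetical helper name. std::regex uses ECMAScript syntax, which may differ from the LLGuidance regex dialect for more advanced patterns, so treat this as a sanity check only.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if `text` matches `pattern` in full (ECMAScript syntax).
// Hypothetical helper for offline checks; not part of LiteRT-LM.
bool MatchesWholeOutput(const std::string& text, const std::string& pattern) {
  return std::regex_match(text, std::regex(pattern));
}

// Small usage example with the pattern from the snippet above.
void RegexExample() {
  assert(MatchesWholeOutput("aaabb", "a+b+"));  // a's then b's
  assert(!MatchesWholeOutput("ba", "a+b+"));    // wrong order
  assert(!MatchesWholeOutput("aaa", "a+b+"));   // needs at least one 'b'
}
```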
JSON Schema Constraint
Constrain the output to be a valid JSON object matching a schema.
LlGuidanceConstraintArg constraint_arg;
constraint_arg.constraint_type = LlgConstraintType::kJsonSchema;
// Example: Simple JSON object with a "name" field
constraint_arg.constraint_string = R"({
"type": "object",
"properties": {
"name": {"type": "string"}
},
"required": ["name"]
})";
auto response = conversation->SendMessage(
user_message,
{.decoding_constraint = constraint_arg}
);
Lark Grammar Constraint
Constrain the output to follow a Lark grammar.
LlGuidanceConstraintArg constraint_arg;
constraint_arg.constraint_type = LlgConstraintType::kLark;
// Example: A simple calculator grammar
constraint_arg.constraint_string = R"(
start: expr
expr: atom
| expr "+" atom
| expr "-" atom
| expr "*" atom
| expr "/" atom
| "(" expr ")"
atom: /[0-9]+/
WS: /[ \t\n\f]+/
%ignore WS
)";
auto response = conversation->SendMessage(
user_message,
{.decoding_constraint = constraint_arg}
);
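To see which strings this grammar admits, the following sketch is a small hand-written recognizer for the same rules. It is independent of LiteRT-LM and LLGuidance, and the function names (SkipWs, ParseAtom, ParseExpr, AcceptsCalc) are hypothetical. The left-recursive expr rule is rewritten as a leading atom or parenthesized expr followed by zero or more operator-atom pairs, which describes the same language.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Skips characters matched by the grammar's WS terminal: space, \t, \n, \f.
void SkipWs(const std::string& s, std::size_t& i) {
  while (i < s.size() &&
         (s[i] == ' ' || s[i] == '\t' || s[i] == '\n' || s[i] == '\f')) {
    ++i;
  }
}

// atom: /[0-9]+/
bool ParseAtom(const std::string& s, std::size_t& i) {
  SkipWs(s, i);
  const std::size_t start = i;
  while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) ++i;
  return i > start;
}

// expr: (atom | "(" expr ")") followed by zero or more (operator atom).
bool ParseExpr(const std::string& s, std::size_t& i) {
  SkipWs(s, i);
  if (i < s.size() && s[i] == '(') {
    ++i;
    if (!ParseExpr(s, i)) return false;
    SkipWs(s, i);
    if (i >= s.size() || s[i] != ')') return false;
    ++i;
  } else if (!ParseAtom(s, i)) {
    return false;
  }
  for (;;) {
    SkipWs(s, i);
    if (i >= s.size()) return true;
    const char c = s[i];
    if (c != '+' && c != '-' && c != '*' && c != '/') return true;
    ++i;
    if (!ParseAtom(s, i)) return false;  // right operand must be an atom
  }
}

// True if the whole string is derivable from `start: expr`.
bool AcceptsCalc(const std::string& s) {
  std::size_t i = 0;
  if (!ParseExpr(s, i)) return false;
  SkipWs(s, i);
  return i == s.size();
}

// Small usage example.
void CalcExample() {
  assert(AcceptsCalc("1 + 2 * 3"));
  assert(AcceptsCalc("(1 + 2) * 3"));
  assert(!AcceptsCalc("1 +"));
}
```

One consequence of the grammar as written is that the right operand of every operator must be an atom, so a parenthesized group can appear only at the start of an expression: "(1 + 2) * 3" is accepted but "3 * (1 + 2)" is not. If both forms should be allowed, the parenthesized alternative would need to move into atom.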
2. External Constraints
If you have a custom implementation of the Constraint interface (e.g., a
highly specialized C++ state machine), you can use ExternalConstraintArg.
Prerequisite: You must have initialized Conversation with
ExternalConstraintConfig.
#include <memory>   // std::make_unique
#include <utility>  // std::move
// 1. Initialize with ExternalConstraintConfig
auto config = ConversationConfig::Builder()
.SetEnableConstrainedDecoding(true)
.SetConstraintProviderConfig(ExternalConstraintConfig())
.Build(*engine);
auto conversation = Conversation::Create(*engine, config);
// 2. Create your custom constraint (must implement litert::lm::Constraint)
class MyCustomConstraint : public Constraint {
// Implement Start, ComputeNext, etc.
};
auto my_constraint = std::make_unique<MyCustomConstraint>();
// 3. Pass it to SendMessage
ExternalConstraintArg external_constraint;
external_constraint.constraint = std::move(my_constraint);
auto response = conversation->SendMessage(
user_message,
{.decoding_constraint = std::move(external_constraint)}
);
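The state-machine idea behind a custom constraint can be sketched without the LiteRT-LM headers. The interface below is a deliberately simplified, hypothetical stand-in: it borrows the method names Start and ComputeNext mentioned above, but the signatures, the AllowedTokens method, and the three-token vocabulary are assumptions made for this illustration. Consult the actual Constraint header for the real interface.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical, simplified interface for illustration only.
// The real litert::lm::Constraint has its own signatures.
class SketchConstraint {
 public:
  virtual ~SketchConstraint() = default;
  virtual int Start() const = 0;                            // initial state
  virtual int ComputeNext(int state, int token) const = 0;  // state after token
  virtual std::vector<bool> AllowedTokens(int state) const = 0;  // token mask
};

// Toy vocabulary: token 0 = "yes", token 1 = "no", token 2 = end-of-sequence.
// Allows exactly one of "yes" or "no", then only end-of-sequence.
class YesNoConstraint : public SketchConstraint {
 public:
  int Start() const override { return 0; }

  int ComputeNext(int state, int token) const override {
    if (state == 0 && (token == 0 || token == 1)) return 1;  // answer given
    if (state == 1 && token == 2) return 2;                  // finished
    return -1;                                               // rejected
  }

  std::vector<bool> AllowedTokens(int state) const override {
    if (state == 0) return {true, true, false};
    if (state == 1) return {false, false, true};
    return {false, false, false};
  }
};

// Small usage example: walk the state machine through "no" then EOS.
void YesNoExample() {
  std::unique_ptr<SketchConstraint> c = std::make_unique<YesNoConstraint>();
  int state = c->Start();
  assert(c->AllowedTokens(state)[1]);   // "no" is allowed first
  state = c->ComputeNext(state, 1);
  assert(c->AllowedTokens(state)[2]);   // only EOS is allowed next
  assert(c->ComputeNext(state, 2) == 2);
}
```

The general pattern is that the decoder asks the constraint which tokens are allowed in the current state, samples among those, and then advances the state with the chosen token.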
API Reference
ConstraintProviderConfig
A variant configuration passed to ConversationConfig.
- LlGuidanceConfig: Configures LLGuidance.
  - eos_id: Optional override for the End-of-Sequence token ID.
- ExternalConstraintConfig: Empty struct (marker) to enable external constraints.
ConstraintArg
A variant argument passed via OptionalArgs to SendMessage.
- LlGuidanceConstraintArg:
  - constraint_type: kRegex, kJsonSchema, or kLark.
  - constraint_string: The pattern/schema/grammar string.
- ExternalConstraintArg:
  - constraint: std::unique_ptr<Constraint>. Ownership is transferred to the valid decoder for that request.
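A practical consequence of the std::unique_ptr ownership transfer is that the argument is left empty after it has been moved into a call, so a fresh constraint has to be created for each request. The sketch below demonstrates this with standard C++ only. OwnedConstraint, OwnedArg, and ConsumeRequest are hypothetical stand-ins, not LiteRT-LM types.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for illustration; NOT the LiteRT-LM types.
struct OwnedConstraint {
  int id;
};
struct OwnedArg {
  std::unique_ptr<OwnedConstraint> constraint;
};

// Stand-in for a request that takes ownership of the constraint.
// Returns the constraint's id, or -1 if no constraint was supplied.
int ConsumeRequest(OwnedArg arg) {
  return arg.constraint ? arg.constraint->id : -1;
}

// Small usage example.
void OwnershipExample() {
  OwnedArg arg;
  arg.constraint = std::make_unique<OwnedConstraint>(OwnedConstraint{7});
  assert(ConsumeRequest(std::move(arg)) == 7);
  assert(arg.constraint == nullptr);  // moved-from unique_ptr is empty
}
```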