Constrained Decoding in LiteRT-LM

LiteRT-LM supports constrained decoding, allowing you to enforce specific structures on the model's output. This is particularly useful for tasks like:

  • Function Calling: Ensuring the model outputs a valid function call matching a specific schema.
  • Structured Data Extraction: Forcing the output to match a specific format, such as a regular expression.
  • Grammar Enforcement: Using context-free grammars (via Lark) to guide generation.

This document explains how to enable, configure, and use constrained decoding in your application.

Enabling Constrained Decoding

To use constrained decoding, you must enable it in the ConversationConfig when creating your Conversation instance.

#include "runtime/conversation/conversation.h"

// ...

ConversationConfig::Builder builder;
builder.SetEnableConstrainedDecoding(true);

// Set a ConstraintProviderConfig in the ConversationConfig::Builder.
// This line sets the constraint provider to LLGuidance with default settings.
builder.SetConstraintProviderConfig(LlGuidanceConfig());

auto config = builder.Build(*engine);

Constraint Providers

LiteRT-LM supports different backends for constrained decoding, configured via ConstraintProviderConfig:

  1. LLGuidance (LlGuidanceConfig): Uses the LLGuidance library. Supports Regex, JSON Schema, and Lark grammars.
  2. External (ExternalConstraintConfig): Allows passing a pre-constructed Constraint object per-request. Useful for custom C++ constraint implementations.

Using Constraints in SendMessage

Once enabled, you can apply constraints to individual messages using the decoding_constraint field in the OptionalArgs struct passed to SendMessage or SendMessageAsync. This field is of type std::optional<ConstraintArg>.
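Because the field is an optional holding a variant, a request can carry no constraint, an LLGuidance constraint, or an external one. The sketch below illustrates that shape with standard-library types only. FakeLlgArg, FakeExternalMarker, FakeConstraintArg, and Describe are hypothetical stand-ins, not LiteRT-LM types, and it assumes that an unset field means unconstrained generation.

```cpp
#include <optional>
#include <string>
#include <variant>

// Hypothetical stand-ins for the real argument types (assumption: the real
// ConstraintArg is a std::variant over the two argument structs).
struct FakeLlgArg {
  std::string constraint_string;
};
struct FakeExternalMarker {};
using FakeConstraintArg = std::variant<FakeLlgArg, FakeExternalMarker>;

// Reports which kind of constraint, if any, a request carries.
std::string Describe(const std::optional<FakeConstraintArg>& arg) {
  if (!arg.has_value()) return "unconstrained";  // Field left unset.
  if (std::holds_alternative<FakeLlgArg>(*arg)) {
    return "llguidance: " + std::get<FakeLlgArg>(*arg).constraint_string;
  }
  return "external";
}
```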

1. LLGuidance Constraints

LLGuidance constraints can be specified as Regex, JSON Schema, or Lark grammars.

Regex Constraint

Constrain the output to match a regular expression.

#include "runtime/components/constrained_decoding/llg_constraint_config.h"

// ...

LlGuidanceConstraintArg constraint_arg;
constraint_arg.constraint_type = LlgConstraintType::kRegex;
// Example: Force output to be a sequence of 'a's followed by 'b's
constraint_arg.constraint_string = "a+b+";

auto response = conversation->SendMessage(
    user_message,
    {.decoding_constraint = constraint_arg}
);
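In tests it can be useful to double-check that a response really matches the pattern. The following is a minimal sketch using std::regex from the standard library. MatchesPattern is a hypothetical helper, not part of LiteRT-LM, and std::regex uses ECMAScript syntax, which may differ from the dialect LLGuidance accepts for more complex patterns.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not a LiteRT-LM API): returns true when `text`
// matches `pattern` in full, as a whole-output regex constraint would require.
bool MatchesPattern(const std::string& text, const std::string& pattern) {
  const std::regex re(pattern);       // ECMAScript syntax by default.
  return std::regex_match(text, re);  // Full match, not a substring search.
}
```

For the pattern above, "aaabb" passes while "abba" and "bbb" do not.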

JSON Schema Constraint

Constrain the output to be a valid JSON object matching a schema.

LlGuidanceConstraintArg constraint_arg;
constraint_arg.constraint_type = LlgConstraintType::kJsonSchema;
// Example: Simple JSON object with a "name" field
constraint_arg.constraint_string = R"({
  "type": "object",
  "properties": {
    "name": {"type": "string"}
  },
  "required": ["name"]
})";

auto response = conversation->SendMessage(
    user_message,
    {.decoding_constraint = constraint_arg}
);

Lark Grammar Constraint

Constrain the output to follow a Lark grammar.

LlGuidanceConstraintArg constraint_arg;
constraint_arg.constraint_type = LlgConstraintType::kLark;
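// Example: A simple calculator grammar.
// The custom delimiter `lark` is needed because the grammar contains `)"`,
// which would end a plain R"(...)" raw string early.
constraint_arg.constraint_string = R"lark(
    start: expr
    expr: atom
        | expr "+" atom
        | expr "-" atom
        | expr "*" atom
        | expr "/" atom
        | "(" expr ")"
    atom: /[0-9]+/
    WS: /[ \t\n\f]+/
    %ignore WS
)lark";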

auto response = conversation->SendMessage(
    user_message,
    {.decoding_constraint = constraint_arg}
);
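To see which strings this grammar admits, it helps to write a small recognizer by hand. The sketch below is a hypothetical helper, not part of LiteRT-LM or LLGuidance. It rewrites the left-recursive rule as "a base (an atom or a parenthesized expr) followed by zero or more operator/atom pairs", which accepts the same language.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical recognizer for the calculator grammar above.
class CalcRecognizer {
 public:
  explicit CalcRecognizer(std::string input) : input_(std::move(input)) {}

  // Returns true when the whole input is a valid `start`.
  bool Accepts() {
    pos_ = 0;
    if (!ParseExpr()) return false;
    SkipWs();
    return pos_ == input_.size();
  }

 private:
  // Mirrors the WS terminal: space, tab, newline, form feed.
  static bool IsWs(char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\f';
  }
  void SkipWs() {  // Mirrors `%ignore WS`.
    while (pos_ < input_.size() && IsWs(input_[pos_])) ++pos_;
  }
  // Consumes `c` (after optional whitespace) if it is next.
  bool Eat(char c) {
    SkipWs();
    if (pos_ < input_.size() && input_[pos_] == c) {
      ++pos_;
      return true;
    }
    return false;
  }
  // atom: one or more digits.
  bool ParseAtom() {
    SkipWs();
    const std::size_t begin = pos_;
    while (pos_ < input_.size() &&
           std::isdigit(static_cast<unsigned char>(input_[pos_]))) {
      ++pos_;
    }
    return pos_ > begin;
  }
  // expr: (atom | "(" expr ")") followed by any number of (operator atom).
  bool ParseExpr() {
    if (Eat('(')) {
      if (!ParseExpr() || !Eat(')')) return false;
    } else if (!ParseAtom()) {
      return false;
    }
    while (Eat('+') || Eat('-') || Eat('*') || Eat('/')) {
      if (!ParseAtom()) return false;
    }
    return true;
  }

  std::string input_;
  std::size_t pos_ = 0;
};
```

Under this reading, "1 + 2 * 3" and "(1+2)*3" are accepted, while "3*(1+2)" is rejected because the right-hand side of an operator must be an atom, and atom only matches digits.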

2. External Constraints

If you have a custom implementation of the Constraint interface (e.g., a highly specialized C++ state machine), you can use ExternalConstraintArg.

Prerequisite: You must have initialized Conversation with ExternalConstraintConfig.

// 1. Initialize with ExternalConstraintConfig
auto config = ConversationConfig::Builder()
    .SetEnableConstrainedDecoding(true)
    .SetConstraintProviderConfig(ExternalConstraintConfig())
    .Build(*engine);
auto conversation = Conversation::Create(*engine, config);

// 2. Create your custom constraint (must implement litert::lm::Constraint)
class MyCustomConstraint : public Constraint {
    // Implement Start, ComputeNext, etc.
};
auto my_constraint = std::make_unique<MyCustomConstraint>();

// 3. Pass it to SendMessage
ExternalConstraintArg external_constraint;
external_constraint.constraint = std::move(my_constraint);

auto response = conversation->SendMessage(
    user_message,
    {.decoding_constraint = std::move(external_constraint)}
);
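The idea behind a state-machine constraint can be illustrated without the library. The sketch below does not use the real litert::lm::Constraint interface, whose method signatures are not shown in this document. It uses a hypothetical three-token vocabulary and hypothetical functions (AllowedTokens, Advance, PickConstrained) to show how a constraint for a+b+ might restrict which token is chosen at each step.

```cpp
#include <array>
#include <cstddef>
#include <limits>

// Toy vocabulary for illustration only: token 0 = "a", 1 = "b", 2 = end.
constexpr std::size_t kVocabSize = 3;
enum class State { kStart, kInA, kInB, kDone };

// Hypothetical "which tokens are legal next" step, enforcing a+b+.
std::array<bool, kVocabSize> AllowedTokens(State s) {
  switch (s) {
    case State::kStart: return {true, false, false};  // Must begin with "a".
    case State::kInA:   return {true, true, false};   // More "a" or first "b".
    case State::kInB:   return {false, true, true};   // More "b" or stop.
    case State::kDone:  return {false, false, false};
  }
  return {false, false, false};
}

// Moves to the next state; assumes `token` was allowed in the prior state.
State Advance(State /*s*/, std::size_t token) {
  if (token == 0) return State::kInA;
  if (token == 1) return State::kInB;
  return State::kDone;
}

// Picks the highest-scoring token among those the constraint allows.
// Returns kVocabSize when no token is allowed.
std::size_t PickConstrained(const std::array<float, kVocabSize>& logits,
                            State s) {
  const auto allowed = AllowedTokens(s);
  std::size_t best = kVocabSize;
  float best_score = -std::numeric_limits<float>::infinity();
  for (std::size_t i = 0; i < kVocabSize; ++i) {
    if (allowed[i] && (best == kVocabSize || logits[i] > best_score)) {
      best = i;
      best_score = logits[i];
    }
  }
  return best;
}
```

Even when the model scores the end token highest at every step, the mask forces the sequence "a", "b", end, because disallowed tokens are never considered.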

API Reference

ConstraintProviderConfig

A variant configuration passed to ConversationConfig.

  • LlGuidanceConfig: Configures LLGuidance.
    • eos_id: Optional override for the End-of-Sequence token ID.
  • ExternalConstraintConfig: Empty struct (marker) to enable external constraints.

ConstraintArg

A variant argument passed via OptionalArgs to SendMessage.

  • LlGuidanceConstraintArg:
    • constraint_type: kRegex, kJsonSchema, or kLark.
    • constraint_string: The pattern/schema/grammar string.
  • ExternalConstraintArg:
    • constraint: std::unique_ptr<Constraint>. Ownership is transferred to the decoder for that request, so a new constraint object is needed for each call.
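
The ownership transfer follows from standard std::unique_ptr semantics: once the argument has been moved into a call, the caller's copy no longer holds a constraint. The sketch below demonstrates this with hypothetical stand-in types (FakeConstraint, FakeExternalArg, Consume), not the real LiteRT-LM types.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-ins mirroring the shapes described above.
struct FakeConstraint {};
struct FakeExternalArg {
  std::unique_ptr<FakeConstraint> constraint;
};

// Simulates a request that takes ownership of its argument.
bool Consume(FakeExternalArg arg) { return arg.constraint != nullptr; }

// Returns true when the first request received the constraint and the
// caller's moved-from argument is left empty afterwards.
bool MovedFromIsEmpty() {
  FakeExternalArg arg;
  arg.constraint = std::make_unique<FakeConstraint>();
  const bool first_had_constraint = Consume(std::move(arg));
  // A moved-from std::unique_ptr is guaranteed to be null.
  return first_had_constraint && arg.constraint == nullptr;
}
```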