# Constrained Decoding in LiteRT-LM LiteRT-LM supports constrained decoding, allowing you to enforce specific structures on the model's output. This is particularly useful for tasks like: - **Function Calling**: Ensuring the model outputs a valid function call matching a specific schema. - **Structured Data Extraction**: Forcing the model to adhere to a specific format (e.g., specific regex patterns). - **Grammar Enforcement**: Using context-free grammars (via Lark) to guide generation. This document explains how to enable, configure, and use constrained decoding in your application. ## Enabling Constrained Decoding To use constrained decoding, you must enable it in the `ConversationConfig` when creating your `Conversation` instance. ```cpp #include "runtime/conversation/conversation.h" // ... ConversationConfig::Builder builder; builder.SetEnableConstrainedDecoding(true); // Set a ConstraintProviderConfig in the ConversationConfig::Builder. // This line set the ConstraintProvider to LLGuidance with default settings. builder.SetConstraintProviderConfig(LlGuidanceConfig()); auto config = builder.Build(*engine); ``` ### Constraint Providers LiteRT-LM supports different backends for constrained decoding, configured via `ConstraintProviderConfig`: 1. **LLGuidance (`LlGuidanceConfig`)**: Uses the [LLGuidance](https://github.com/guidance-ai/llguidance) library. Supports Regex, JSON Schema, and Lark grammars. 2. **External (`ExternalConstraintConfig`)**: Allows passing a pre-constructed `Constraint` object per-request. Useful for custom C++ constraint implementations. ## Using Constraints in `SendMessage` Once enabled, you can apply constraints to individual messages using the `decoding_constraint` field in the `OptionalArgs` struct passed to `SendMessage` or `SendMessageAsync`. This field is of type `std::optional`. ### 1. LLGuidance Constraints LLGuidance constraints can be specified as Regex, JSON Schema, or Lark grammars. #### Regex Constraint Constrain the output to match a regular expression. ```cpp #include "runtime/components/constrained_decoding/llg_constraint_config.h" // ... LlGuidanceConstraintArg constraint_arg; constraint_arg.constraint_type = LlgConstraintType::kRegex; // Example: Force output to be a sequence of 'a's followed by 'b's constraint_arg.constraint_string = "a+b+"; auto response = conversation->SendMessage( user_message, {.decoding_constraint = constraint_arg} ); ``` #### JSON Schema Constraint Constrain the output to be a valid JSON object matching a schema. ```cpp LlGuidanceConstraintArg constraint_arg; constraint_arg.constraint_type = LlgConstraintType::kJsonSchema; // Example: Simple JSON object with a "name" field constraint_arg.constraint_string = R"({ "type": "object", "properties": { "name": {"type": "string"} }, "required": ["name"] })"; auto response = conversation->SendMessage( user_message, {.decoding_constraint = constraint_arg} ); ``` #### Lark Grammar Constraint Constrain the output to follow a Lark grammar. ```cpp LlGuidanceConstraintArg constraint_arg; constraint_arg.constraint_type = LlgConstraintType::kLark; // Example: A simple calculator grammar constraint_arg.constraint_string = R"( start: expr expr: atom | expr "+" atom | expr "-" atom | expr "*" atom | expr "/" atom | "(" expr ")" atom: /[0-9]+/ WS: /[ \t\n\f]+/ %ignore WS )"; auto response = conversation->SendMessage( user_message, {.decoding_constraint = constraint_arg} ); ``` ### 2. External Constraints If you have a custom implementation of the `Constraint` interface (e.g., a highly specialized C++ state machine), you can use `ExternalConstraintArg`. Prerequisite: You must have initialized `Conversation` with `ExternalConstraintConfig`. ```cpp // 1. Initialize with ExternalConstraintConfig auto config = ConversationConfig::Builder() .SetEnableConstrainedDecoding(true) .SetConstraintProviderConfig(ExternalConstraintConfig()) .Build(*engine); auto conversation = Conversation::Create(*engine, config); // 2. Create your custom constraint (must implement litert::lm::Constraint) class MyCustomConstraint : public Constraint { // Implement Start, ComputeNext, etc. }; auto my_constraint = std::make_unique(); // 3. Pass it to SendMessage ExternalConstraintArg external_constraint; external_constraint.constraint = std::move(my_constraint); auto response = conversation->SendMessage( user_message, {.decoding_constraint = std::move(external_constraint)} ); ``` ## API Reference ### `ConstraintProviderConfig` A variant configuration passed to `ConversationConfig`. - `LlGuidanceConfig`: Configures LLGuidance. - `eos_id`: Optional override for the End-of-Sequence token ID. - `ExternalConstraintConfig`: Empty struct (marker) to enable external constraints. ### `ConstraintArg` A variant argument passed via `OptionalArgs` to `SendMessage`. - `LlGuidanceConstraintArg`: - `constraint_type`: `kRegex`, `kJsonSchema`, or `kLark`. - `constraint_string`: The pattern/schema/grammar string. - `ExternalConstraintArg`: - `constraint`: `std::unique_ptr`. Ownership is transferred to the valid decoder for that request.