# Constrained Decoding in LiteRT-LM

LiteRT-LM supports constrained decoding, allowing you to enforce specific
structures on the model's output. This is particularly useful for tasks like:

-   **Function Calling**: Ensuring the model outputs a valid function call
    matching a specific schema.
-   **Structured Data Extraction**: Forcing the output to follow a fixed
    format (e.g., a regex pattern or a JSON Schema).
-   **Grammar Enforcement**: Using context-free grammars (via Lark) to guide
    generation.

This document explains how to enable, configure, and use constrained decoding in
your application.

## Enabling Constrained Decoding

To use constrained decoding, you must enable it in the `ConversationConfig` when
creating your `Conversation` instance.

```cpp
#include "runtime/conversation/conversation.h"

// ...

ConversationConfig::Builder builder;
builder.SetEnableConstrainedDecoding(true);

// Set a ConstraintProviderConfig in the ConversationConfig::Builder.
// This line sets the constraint provider to LLGuidance with default settings.
builder.SetConstraintProviderConfig(LlGuidanceConfig());

auto config = builder.Build(*engine);
```

### Constraint Providers

LiteRT-LM supports different backends for constrained decoding, configured via
`ConstraintProviderConfig`:

1.  **LLGuidance (`LlGuidanceConfig`)**: Uses the
    [LLGuidance](https://github.com/guidance-ai/llguidance) library. Supports
    Regex, JSON Schema, and Lark grammars.
2.  **External (`ExternalConstraintConfig`)**: Allows passing a pre-constructed
    `Constraint` object with each request. Useful for custom C++ constraint
    implementations.

## Using Constraints in `SendMessage`

Once enabled, you can apply constraints to individual messages using the
`decoding_constraint` field in the `OptionalArgs` struct passed to `SendMessage`
or `SendMessageAsync`. This field is of type `std::optional<ConstraintArg>`.
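
Because the field is an optional wrapping a variant, each call can carry an LLGuidance constraint, an external constraint, or no constraint at all. The sketch below mirrors that shape with stand-in types to show how such a value can be inspected. Every name prefixed with `Sketch`, and the `Describe` helper, is hypothetical and used only for illustration; the real definitions live in the LiteRT-LM headers and may differ.

```cpp
#include <memory>
#include <optional>
#include <string>
#include <type_traits>
#include <variant>

// Stand-in types that only mirror the shape described in this document.
enum class SketchLlgType { kRegex, kJsonSchema, kLark };

struct SketchLlgArg {
  SketchLlgType constraint_type;
  std::string constraint_string;
};

struct SketchConstraint {
  virtual ~SketchConstraint() = default;
};

struct SketchExternalArg {
  std::unique_ptr<SketchConstraint> constraint;
};

using SketchConstraintArg = std::variant<SketchLlgArg, SketchExternalArg>;

// Reports which alternative, if any, a request carries.
std::string Describe(const std::optional<SketchConstraintArg>& arg) {
  if (!arg.has_value()) return "unconstrained";
  return std::visit(
      [](const auto& value) -> std::string {
        using T = std::decay_t<decltype(value)>;
        if constexpr (std::is_same_v<T, SketchLlgArg>) {
          return "llguidance: " + value.constraint_string;
        } else {
          return "external";
        }
      },
      *arg);
}
```

Since one alternative holds a `std::unique_ptr`, the variant as a whole is move-only, which is why the external example later in this document passes its argument with `std::move`.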

### 1. LLGuidance Constraints

LLGuidance constraints can be specified as Regex, JSON Schema, or Lark grammars.

#### Regex Constraint

Constrain the output to match a regular expression.

```cpp
#include "runtime/components/constrained_decoding/llg_constraint_config.h"

// ...

LlGuidanceConstraintArg constraint_arg;
constraint_arg.constraint_type = LlgConstraintType::kRegex;
// Example: Force output to be a sequence of 'a's followed by 'b's
constraint_arg.constraint_string = "a+b+";

auto response = conversation->SendMessage(
    user_message,
    {.decoding_constraint = constraint_arg}
);
```
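
In a unit test it can be handy to confirm that the returned text really matches the pattern. The helper below is a minimal sketch using `std::regex` from the standard library; `MatchesPattern` is a hypothetical name, and how the text is extracted from the response is left out because it depends on the response type. Note that `std::regex` uses ECMAScript syntax by default, which may differ from the dialect LLGuidance accepts, so this check is only meant for simple patterns such as the one above.

```cpp
#include <regex>
#include <string>

// Returns true when the whole text matches the pattern, not just a substring.
bool MatchesPattern(const std::string& text, const std::string& pattern) {
  const std::regex re(pattern);
  return std::regex_match(text, re);
}
```

For example, `MatchesPattern("aaabb", "a+b+")` is true, while `"aaa"` and `"ba"` are both rejected.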

#### JSON Schema Constraint

Constrain the output to be a valid JSON object matching a schema.

```cpp
LlGuidanceConstraintArg constraint_arg;
constraint_arg.constraint_type = LlgConstraintType::kJsonSchema;
// Example: Simple JSON object with a "name" field
constraint_arg.constraint_string = R"({
  "type": "object",
  "properties": {
    "name": {"type": "string"}
  },
  "required": ["name"]
})";

auto response = conversation->SendMessage(
    user_message,
    {.decoding_constraint = constraint_arg}
);
```

#### Lark Grammar Constraint

Constrain the output to follow a Lark grammar.

```cpp
LlGuidanceConstraintArg constraint_arg;
constraint_arg.constraint_type = LlgConstraintType::kLark;
// Example: A simple calculator grammar
constraint_arg.constraint_string = R"(
    start: expr
    expr: atom
        | expr "+" atom
        | expr "-" atom
        | expr "*" atom
        | expr "/" atom
    atom: /[0-9]+/
        | "(" expr ")"
    WS: /[ \t\n\f]+/
    %ignore WS
)";

auto response = conversation->SendMessage(
    user_message,
    {.decoding_constraint = constraint_arg}
);
```

### 2. External Constraints

If you have a custom implementation of the `Constraint` interface (e.g., a
highly specialized C++ state machine), you can use `ExternalConstraintArg`.

Prerequisite: the `Conversation` must have been created with
`ExternalConstraintConfig` as its constraint provider.

```cpp
// 1. Initialize with ExternalConstraintConfig
auto config = ConversationConfig::Builder()
    .SetEnableConstrainedDecoding(true)
    .SetConstraintProviderConfig(ExternalConstraintConfig())
    .Build(*engine);
auto conversation = Conversation::Create(*engine, config);

// 2. Create your custom constraint (must implement litert::lm::Constraint)
class MyCustomConstraint : public Constraint {
    // Implement Start, ComputeNext, etc.
};
auto my_constraint = std::make_unique<MyCustomConstraint>();

// 3. Pass it to SendMessage
ExternalConstraintArg external_constraint;
external_constraint.constraint = std::move(my_constraint);

auto response = conversation->SendMessage(
    user_message,
    {.decoding_constraint = std::move(external_constraint)}
);
```
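
To make the "state machine" idea more concrete, the following sketch shows the kind of logic a custom constraint typically encodes: given the current state, which tokens may be sampled next, and how the state advances once a token is chosen. It uses a toy three-token vocabulary and the pattern `a+b+`. This is not the LiteRT-LM `Constraint` interface; `Token`, `State`, `AllowedTokens`, and `Advance` are hypothetical names, and the real `Start` and `ComputeNext` methods have their own signatures that this sketch does not attempt to reproduce.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy vocabulary of three tokens. A real vocabulary has many thousands.
enum Token : std::size_t { kA = 0, kB = 1, kEos = 2, kVocabSize = 3 };

// States of a hand-written automaton for the pattern a+b+.
enum class State { kStart, kInA, kInB, kDone };

// Returns, for each token id, whether it may be sampled from this state.
std::array<bool, kVocabSize> AllowedTokens(State state) {
  switch (state) {
    case State::kStart: return {true, false, false};   // must begin with 'a'
    case State::kInA:   return {true, true, false};    // more 'a's or first 'b'
    case State::kInB:   return {false, true, true};    // more 'b's or stop
    case State::kDone:  return {false, false, false};  // nothing after EOS
  }
  return {false, false, false};
}

// Advances the automaton after a token has been sampled.
State Advance(State state, Token token) {
  // The caller is expected to sample only from the allowed set.
  assert(token < kVocabSize && AllowedTokens(state)[token]);
  if (token == kEos) return State::kDone;
  return token == kA ? State::kInA : State::kInB;
}
```

A decoder would apply the mask before sampling at each step, so disallowed tokens can never be emitted, and then call the equivalent of `Advance` with the token it picked.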

## API Reference

### `ConstraintProviderConfig`

A variant configuration passed to `ConversationConfig`.

-   `LlGuidanceConfig`: Configures LLGuidance.
    -   `eos_id`: Optional override for the End-of-Sequence token ID.
-   `ExternalConstraintConfig`: Empty struct (marker) to enable external
    constraints.

### `ConstraintArg`

A variant argument passed via `OptionalArgs` to `SendMessage`.

-   `LlGuidanceConstraintArg`:
    -   `constraint_type`: `kRegex`, `kJsonSchema`, or `kLark`.
    -   `constraint_string`: The pattern/schema/grammar string.
-   `ExternalConstraintArg`:
    -   `constraint`: `std::unique_ptr<Constraint>`. Ownership is transferred to
        the decoder for that request.