# Judgment Engine

> Deep dive into lim's 4-step judgment engine: Rule Match, History Match, AI Inference, and Escalate.

The judgment engine is the core of lim's automation. It takes a business event (a bank transaction, a natural language description, an invoice) and decides how to classify it into a journal entry.

## The 4-Step Pipeline

Each transaction is evaluated by up to four steps, in order. The engine stops at the first step that produces a confident result, so later steps run only when earlier ones fail.

```
Input: { counterparty: "AWS", amount: 11000, direction: "outflow" }
    │
    ▼
┌──────────────────────────────┐
│  Step 1: RULE MATCH          │  ← Zero cost. Instant.
│  Check learned rules.        │
│  Confidence threshold: 0.85  │
└──────────┬───────────────────┘
           │ No match
           ▼
┌──────────────────────────────┐
│  Step 2: HISTORY MATCH       │  ← Zero cost. SQL lookup.
│  Find similar past entries.  │
│  Confidence threshold: 0.70  │
└──────────┬───────────────────┘
           │ No match
           ▼
┌──────────────────────────────┐
│  Step 3: AI INFERENCE        │  ← LLM API call. Last resort.
│  Ask Claude/LLM to classify. │
│  Confidence threshold: 0.60  │
└──────────┬───────────────────┘
           │ Low confidence
           ▼
┌──────────────────────────────┐
│  Step 4: ESCALATE            │  ← Human review required.
│  Flag for manual review.     │
│  Confidence: 0               │
└──────────────────────────────┘
```
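The control flow in the diagram can be sketched as a loop over ordered steps. This is a minimal illustration, not the code in `engine.ts`: the `Step` and `Judgment` types, the `judge` function, and the stub steps are hypothetical names, and only the thresholds are taken from the diagram.

```typescript
// Hypothetical types for illustration; the real engine's interfaces may differ.
type BankTransaction = { counterparty: string; amount: number; direction: "inflow" | "outflow" };
type Candidate = { accountCode: string; confidence: number };
type Step = {
  name: "rule" | "history" | "ai";
  threshold: number; // minimum confidence for this step's result to be accepted
  run: (tx: BankTransaction) => Candidate | null;
};
type Judgment =
  | { step: Step["name"]; accountCode: string; confidence: number }
  | { step: "escalate"; accountCode: null; confidence: 0 };

// Try each step in order and stop at the first one that clears its threshold.
function judge(tx: BankTransaction, steps: Step[]): Judgment {
  for (const step of steps) {
    const candidate = step.run(tx);
    if (candidate !== null && candidate.confidence >= step.threshold) {
      return { step: step.name, accountCode: candidate.accountCode, confidence: candidate.confidence };
    }
  }
  // No step was confident enough: flag for manual review.
  return { step: "escalate", accountCode: null, confidence: 0 };
}

// Stub steps standing in for the rule matcher, history matcher, and LLM call.
const steps: Step[] = [
  {
    name: "rule",
    threshold: 0.85,
    run: (tx) =>
      tx.counterparty.toLowerCase().includes("aws") ? { accountCode: "5101", confidence: 0.91 } : null,
  },
  { name: "history", threshold: 0.7, run: () => null },
  { name: "ai", threshold: 0.6, run: () => ({ accountCode: "7101", confidence: 0.4 }) },
];

const matched = judge({ counterparty: "AWS", amount: 11000, direction: "outflow" }, steps);
const escalated = judge({ counterparty: "Unknown Vendor", amount: 500, direction: "outflow" }, steps);
```

Here `matched` resolves at the rule step, while `escalated` falls through all three stubs because the AI stub's confidence (0.4) is below 0.60.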

## Step 1: Rule Match

The fastest and cheapest step. Rules are deterministic pattern-matching conditions stored in the `matching_rule` table.

### How Rules Work

Each rule has:

* **Conditions** -- Pattern to match (counterparty name, direction, amount range)
* **Template** -- The journal entry to produce (accounts, tax code, amounts)
* **Confidence** -- How reliable this rule is (0.00 to 0.99)
* **Auto-post threshold** -- Confidence level at which entries are posted without human confirmation (default: 0.95)

Example rule:

```json theme={null}
{
  "name": "Auto: AWS",
  "conditions": {
    "counterparty_contains": "aws",
    "direction": "outflow"
  },
  "template": {
    "lines": [
      { "side": "debit", "accountCode": "5101", "ratio": 0.909 },
      { "side": "debit", "accountCode": "1501", "ratio": 0.091 },
      { "side": "credit", "accountCode": "2101", "ratio": 1.0 }
    ]
  },
  "confidence": 0.91,
  "autoPostThreshold": 0.95
}
```

When a ¥11,000 bank transaction with counterparty "AWS" arrives, this rule matches on `counterparty_contains: "aws"` (the literal name "Amazon Web Services" does not contain that substring) and produces:

```
Debit  Communication (5101)     ¥10,000  (90.9% of ¥11,000)
Debit  Input VAT (1501)          ¥1,000  (9.1% of ¥11,000)
Credit Accounts payable (2101)  ¥11,000
```
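As a rough sketch of how such a rule could be evaluated, the code below checks the conditions and then expands the ratio-based template into amounts. The type names, `matchesRule`, and `applyTemplate` are hypothetical, and case-insensitive substring matching, an inclusive `amount_max`, and rounding to whole yen are assumptions rather than documented engine behavior.

```typescript
// Hypothetical shapes mirroring the JSON example above.
type Direction = "inflow" | "outflow";
type BankTransaction = { counterparty: string; amount: number; direction: Direction };
type TemplateLine = { side: "debit" | "credit"; accountCode: string; ratio: number };
type MatchingRule = {
  conditions: { counterparty_contains?: string; direction?: Direction; amount_max?: number };
  template: { lines: TemplateLine[] };
};

// A rule matches only if every condition it declares holds.
function matchesRule(rule: MatchingRule, tx: BankTransaction): boolean {
  const c = rule.conditions;
  if (
    c.counterparty_contains !== undefined &&
    !tx.counterparty.toLowerCase().includes(c.counterparty_contains.toLowerCase())
  ) {
    return false;
  }
  if (c.direction !== undefined && c.direction !== tx.direction) return false;
  if (c.amount_max !== undefined && tx.amount > c.amount_max) return false;
  return true;
}

// Expand ratio-based template lines into whole-yen amounts (rounding is an assumption).
function applyTemplate(rule: MatchingRule, amount: number) {
  return rule.template.lines.map((line) => ({
    side: line.side,
    accountCode: line.accountCode,
    amount: Math.round(amount * line.ratio),
  }));
}

const awsRule: MatchingRule = {
  conditions: { counterparty_contains: "aws", direction: "outflow" },
  template: {
    lines: [
      { side: "debit", accountCode: "5101", ratio: 0.909 },
      { side: "debit", accountCode: "1501", ratio: 0.091 },
      { side: "credit", accountCode: "2101", ratio: 1.0 },
    ],
  },
};

const awsTx: BankTransaction = { counterparty: "AWS", amount: 11000, direction: "outflow" };
const entryLines = matchesRule(awsRule, awsTx) ? applyTemplate(awsRule, awsTx.amount) : [];
```

Because 0.909 and 0.091 are three-decimal approximations of 10/11 and 1/11, plain multiplication yields ¥9,999 and ¥1,001 rather than the rounded ¥10,000 and ¥1,000 shown above. How the engine itself handles that rounding is not specified here.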

### Rule Matching Priority

When multiple rules match, they're evaluated by priority (lower number = higher priority):

| Priority | Source                                          | Example                                             |
| -------- | ----------------------------------------------- | --------------------------------------------------- |
| 1-10     | Manual rules (user-created)                     | "All Stripe payments go to Payment Processing Fees" |
| 50       | Learned rules (auto-created from confirmations) | "AWS = Communication"                               |
| 100      | Default rules (system)                          | "Unknown outflow = Miscellaneous expense"           |
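A minimal sketch of that ordering, assuming the engine picks a single winner among matching rules. `pickRule` and the `MatchedRule` shape are hypothetical, and breaking ties by higher confidence is an assumption that the table does not cover.

```typescript
type MatchedRule = { ruleName: string; priority: number; confidence: number };

// Lowest priority number wins; ties fall back to higher confidence (assumed).
function pickRule(matches: MatchedRule[]): MatchedRule | null {
  const sorted = [...matches].sort(
    (a, b) => a.priority - b.priority || b.confidence - a.confidence,
  );
  return sorted[0] ?? null;
}

const winner = pickRule([
  { ruleName: "Unknown outflow", priority: 100, confidence: 0.6 },
  { ruleName: "Auto: AWS", priority: 50, confidence: 0.91 },
  { ruleName: "Manual: AWS hosting", priority: 5, confidence: 0.99 },
]);
```

With these inputs the manual rule at priority 5 is selected ahead of the learned and default rules.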

## Step 2: History Match

When no rule matches, lim searches past journal entries for similar transactions.

The history matcher looks for entries with:

* Similar counterparty name (fuzzy string matching)
* Same direction (inflow/outflow)
* Similar amount range

If a match is found with sufficient similarity, it suggests the same account classification.

```
Input: { counterparty: "Amazon Web Svcs", amount: 13200 }

History search:
  Found: "Amazon Web Services" → Communication (5101) — used 3 times
  Similarity: 0.92
  Confidence: 0.78

Suggestion: Communication (5101) + Input VAT (1501)
```

<Tip>
  History matches always require human confirmation. They never auto-post, even if confidence is
  high. This is because the match is probabilistic, not deterministic.
</Tip>
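The fuzzy-matching algorithm used by `history-matcher.ts` is not specified on this page. As one possible illustration, the sketch below scores counterparty names with a Sørensen-Dice coefficient over character bigrams. The function names are hypothetical, and the scores it yields will generally differ from the similarity and confidence figures in the example above.

```typescript
// Split a normalized string into overlapping two-character chunks.
function bigrams(text: string): string[] {
  const s = text.toLowerCase().replace(/\s+/g, " ").trim();
  const grams: string[] = [];
  for (let i = 0; i < s.length - 1; i++) grams.push(s.slice(i, i + 2));
  return grams;
}

// Dice coefficient over bigram multisets: 2 * |overlap| / (|A| + |B|).
function similarity(a: string, b: string): number {
  const ga = bigrams(a);
  const gb = bigrams(b);
  if (ga.length + gb.length === 0) return 0;

  // Count the bigrams of b so each one can be matched at most once.
  const remaining = new Map<string, number>();
  for (const g of gb) remaining.set(g, (remaining.get(g) ?? 0) + 1);

  let overlap = 0;
  for (const g of ga) {
    const left = remaining.get(g) ?? 0;
    if (left > 0) {
      overlap++;
      remaining.set(g, left - 1);
    }
  }
  return (2 * overlap) / (ga.length + gb.length);
}

const score = similarity("Amazon Web Svcs", "Amazon Web Services");
```

A score like this would then be compared against a similarity cutoff before the past entry's classification is suggested.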

## Step 3: AI Inference

When rules and history can't resolve the transaction, lim calls an LLM to classify it.

The AI receives:

* The transaction details (counterparty, amount, description, date)
* The company's chart of accounts (codes and names)
* Recent journal entry examples for context

The AI returns:

* Suggested account classification
* Confidence score
* Reasoning

```
Input: { counterparty: "WeWork", amount: 55000, description: "coworking space March" }

AI Response:
  Account: Rent expense (7101) + Input VAT (1501)
  Confidence: 0.82
  Reasoning: "Coworking space is an office rent expense. Standard 10% consumption tax applies."
```

AI inference results always require human confirmation. This is by design -- the confirmation is what powers the learning loop.
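The inputs and outputs listed above suggest a request/response shape along the following lines. This is a sketch only: `buildPrompt`, `parseSuggestion`, and the JSON reply format are hypothetical, the reply is simplified to a single account code, and no LLM call is made here.

```typescript
type Account = { code: string; name: string };
type TransactionDetails = { counterparty: string; amount: number; description: string; date: string };
type AiSuggestion = { accountCode: string; confidence: number; reasoning: string };

// Assemble the three inputs (transaction, chart of accounts, examples) into one prompt.
function buildPrompt(tx: TransactionDetails, chart: Account[], examples: string[]): string {
  return [
    "Classify this transaction into one account from the chart below.",
    `Transaction: ${JSON.stringify(tx)}`,
    `Chart of accounts: ${chart.map((a) => `${a.code} ${a.name}`).join(", ")}`,
    `Recent entries: ${examples.join("; ")}`,
    'Reply with JSON: {"accountCode": string, "confidence": number, "reasoning": string}',
  ].join("\n");
}

// Validate the model's reply before trusting it; return null on anything unexpected.
function parseSuggestion(raw: string, chart: Account[]): AiSuggestion | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const { accountCode, confidence, reasoning } = data as Record<string, unknown>;
  if (typeof accountCode !== "string" || !chart.some((a) => a.code === accountCode)) return null;
  if (typeof confidence !== "number" || confidence < 0 || confidence > 0.99) return null;
  if (typeof reasoning !== "string") return null;
  return { accountCode, confidence, reasoning };
}

const chart: Account[] = [
  { code: "7101", name: "Rent expense" },
  { code: "1501", name: "Input VAT" },
];
const prompt = buildPrompt(
  { counterparty: "WeWork", amount: 55000, description: "coworking space March", date: "2025-03-31" },
  chart,
  ["WeWork -> Rent expense (7101)"],
);
const suggestion = parseSuggestion(
  '{"accountCode":"7101","confidence":0.82,"reasoning":"Coworking space is an office rent expense."}',
  chart,
);
```

Rejecting replies that name an account outside the chart keeps a hallucinated code from reaching the draft entry.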

### AI Provider / Model Configuration

```bash theme={null}
# Provider (current implementation: anthropic)
export LIM_AI_PROVIDER=anthropic

# Shared model for all AI tasks
export LIM_AI_MODEL=claude-haiku-4-5-20251001

# Or task-specific overrides
export LIM_AI_MODEL_JUDGMENT=claude-haiku-4-5-20251001
export LIM_AI_MODEL_NL_JOURNAL=claude-haiku-4-5-20251001
export LIM_AI_MODEL_VISION=claude-haiku-4-5-20251001
```

The engine selects its AI runtime via the `LIM_AI_PROVIDER` and `LIM_AI_MODEL*` environment
variables. The `LIM_` prefix avoids collisions with generic `AI_MODEL` settings used by other
tools or services in the same environment. `vertex` and `local` exist as injection points, but
only the Anthropic adapter is currently implemented.
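One way to read the override behavior is "task-specific variable first, shared variable second". The sketch below assumes exactly that. `resolveModel` and `resolveProvider` are hypothetical helpers, and treating an empty value as unset and defaulting the provider to `anthropic` are both assumptions.

```typescript
type Env = Record<string, string | undefined>;
type AiTask = "JUDGMENT" | "NL_JOURNAL" | "VISION";

// Prefer LIM_AI_MODEL_<TASK>, then fall back to the shared LIM_AI_MODEL.
function resolveModel(env: Env, task: AiTask): string {
  const model = env[`LIM_AI_MODEL_${task}`] || env["LIM_AI_MODEL"];
  if (!model) throw new Error(`No model configured for task ${task}`);
  return model;
}

// Only the Anthropic adapter is implemented, so reject other providers early.
function resolveProvider(env: Env): "anthropic" {
  const provider = env["LIM_AI_PROVIDER"] || "anthropic";
  if (provider !== "anthropic") {
    throw new Error(`Provider "${provider}" is not implemented`);
  }
  return "anthropic";
}

const exampleEnv: Env = {
  LIM_AI_PROVIDER: "anthropic",
  LIM_AI_MODEL: "claude-haiku-4-5-20251001",
  LIM_AI_MODEL_VISION: "example-vision-model", // placeholder value, not a real model name
};
const judgmentModel = resolveModel(exampleEnv, "JUDGMENT");
const visionModel = resolveModel(exampleEnv, "VISION");
```

In a Node.js process, `process.env` could be passed as the `env` argument.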

## Step 4: Escalate

When all three steps fail to produce a confident classification, the transaction is escalated to a human.

This happens when all of the following hold:

* No rule matches
* No similar history exists
* AI confidence is below 0.60

In practice, this usually means the transaction is genuinely ambiguous.

Escalated transactions appear as draft journal entries that need manual classification.

## Confidence Scores

Confidence scores range from 0.00 to 0.99 and drive automation decisions.

### How Confidence Changes

| Action          | Effect                                         | Example                                      |
| --------------- | ---------------------------------------------- | -------------------------------------------- |
| **Confirmed**   | +0.03 per confirmation (cap 0.99)              | Rule at 0.85 → 0.88 after confirmation       |
| **Edited**      | No change to confidence (template is updated)  | Rule template corrected but confidence stays |
| **Rejected**    | -0.10 per rejection                            | Rule at 0.85 → 0.75 after rejection          |
| **Deactivated** | Rule disabled when confidence drops below 0.50 | Consistently wrong rule is turned off        |
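The table translates into a small update function. The sketch below is illustrative rather than the contents of `confidence.ts`: `applyFeedback` and `RuleState` are hypothetical names, and rounding to two decimals and flooring at 0 are assumptions added to keep floating-point results tidy.

```typescript
type RuleState = { confidence: number; active: boolean };
type Feedback = "confirmed" | "edited" | "rejected";

const round2 = (x: number): number => Math.round(x * 100) / 100;

function applyFeedback(rule: RuleState, feedback: Feedback): RuleState {
  let confidence = rule.confidence;
  if (feedback === "confirmed") confidence = Math.min(0.99, confidence + 0.03); // capped at 0.99
  if (feedback === "rejected") confidence = Math.max(0, confidence - 0.1);
  // "edited" leaves confidence unchanged; only the template would be updated.
  confidence = round2(confidence);
  // A rule is switched off once its confidence falls below 0.50.
  return { confidence, active: rule.active && confidence >= 0.5 };
}

const confirmed = applyFeedback({ confidence: 0.85, active: true }, "confirmed"); // confidence 0.88
const rejected = applyFeedback({ confidence: 0.85, active: true }, "rejected"); // confidence 0.75
const capped = applyFeedback({ confidence: 0.98, active: true }, "confirmed"); // confidence 0.99
const disabled = applyFeedback({ confidence: 0.55, active: true }, "rejected"); // 0.45, inactive
```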

### Confidence Thresholds

| Threshold   | Behavior                                                                                              |
| ----------- | ----------------------------------------------------------------------------------------------------- |
| >= 0.95     | **Auto-post.** Rule matches only. Entry is created without human confirmation.                        |
| 0.85 - 0.94 | **Suggest.** Entry is created as draft, shown to user for confirmation.                               |
| 0.60 - 0.84 | **Low confidence.** Entry is created as draft with a warning.                                         |
| \< 0.60     | **Escalate.** Marked for human review.                                                                |
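Combined with the rule that history and AI results always require confirmation, the thresholds can be sketched as a single decision function. `decideAction` and its string labels are hypothetical names chosen for this illustration.

```typescript
type Action = "auto_post" | "suggest" | "low_confidence" | "escalate";
type Source = "rule" | "history" | "ai";

function decideAction(confidence: number, source: Source, autoPostThreshold = 0.95): Action {
  if (confidence < 0.6) return "escalate";
  if (confidence < 0.85) return "low_confidence";
  // History and AI results always need confirmation, so only rule matches may auto-post.
  if (confidence >= autoPostThreshold && source === "rule") return "auto_post";
  return "suggest";
}

const decisions = [
  decideAction(0.97, "rule"), // "auto_post"
  decideAction(0.97, "ai"), // "suggest"
  decideAction(0.91, "rule"), // "suggest"
  decideAction(0.78, "history"), // "low_confidence"
  decideAction(0.4, "ai"), // "escalate"
];
```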

## The Learning Flywheel

The judgment engine improves through a feedback loop:

```
┌─────────────────────────────────────────────────────────┐
│                                                          │
│  Transaction arrives                                     │
│       │                                                  │
│       ▼                                                  │
│  Judgment engine classifies it                           │
│       │                                                  │
│       ├─ Rule match (auto-post if confidence >= 0.95)    │
│       │                                                  │
│       ├─ History/AI match (suggest to human)             │
│       │       │                                          │
│       │       ▼                                          │
│       │  Human confirms or edits                         │
│       │       │                                          │
│       │       ▼                                          │
│       │  learnFromConfirmedEntry()                       │
│       │       │                                          │
│       │       ├─ Existing rule? → Reinforce (+0.03)      │
│       │       │                                          │
│       │       └─ No rule? → Create new rule              │──┐
│       │           confidence: 0.85                       │  │
│       │           source: "learned"                      │  │
│       │                                                  │  │
│       └─ Escalated (human classifies from scratch)       │  │
│               │                                          │  │
│               ▼                                          │  │
│          New rule created ────────────────────────────────┘  │
│                                                              │
│  Next similar transaction → Rule match (no AI cost) ◄────────┘
│                                                          │
└──────────────────────────────────────────────────────────┘
```

### Learning in Practice

When `learnFromConfirmedEntry()` is called:

1. **Existing rule found** (same counterparty pattern) -- the rule's confidence is increased by 0.03 and the template is updated to match the confirmed entry.

2. **No existing rule** -- a new rule is created with:
   * `confidence: 0.85` (starts below auto-post threshold)
   * `source: "learned"` (distinguishes from manually created rules)
   * `priority: 50` (lower priority than manual rules)
   * Counterparty pattern derived from the confirmed entry

After 4 more confirmations (0.85 + 0.03 × 4 = 0.97; three would reach only 0.94), the rule crosses the auto-post threshold and future matching transactions are posted without human input.
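The two branches can be sketched against an in-memory rule list. The signature below is hypothetical and is not the real `learnFromConfirmedEntry()` in `learner.ts`. Deriving the pattern by lowercasing the counterparty name and reducing the template to a single account code are simplifying assumptions.

```typescript
type LearnedRule = {
  pattern: string;
  accountCode: string;
  confidence: number;
  source: "learned" | "manual";
  priority: number;
};
type ConfirmedEntry = { counterparty: string; accountCode: string };

function learnFromConfirmedEntry(rules: LearnedRule[], entry: ConfirmedEntry): LearnedRule {
  const pattern = entry.counterparty.trim().toLowerCase(); // assumed pattern derivation
  const existing = rules.find((r) => r.pattern === pattern);
  if (existing) {
    // Reinforce: +0.03, capped at 0.99, and follow the confirmed classification.
    existing.confidence = Math.min(0.99, Math.round((existing.confidence + 0.03) * 100) / 100);
    existing.accountCode = entry.accountCode;
    return existing;
  }
  // No rule yet: start below the 0.95 auto-post threshold.
  const created: LearnedRule = {
    pattern,
    accountCode: entry.accountCode,
    confidence: 0.85,
    source: "learned",
    priority: 50,
  };
  rules.push(created);
  return created;
}

// One creating confirmation followed by four reinforcing ones.
const learnedRules: LearnedRule[] = [];
const confidences: number[] = [];
for (let i = 0; i < 5; i++) {
  const rule = learnFromConfirmedEntry(learnedRules, { counterparty: "AWS", accountCode: "5101" });
  confidences.push(rule.confidence);
}
// confidences is [0.85, 0.88, 0.91, 0.94, 0.97]
```

The fifth confirmation is the first to leave the rule at or above 0.95.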

### Cost Trajectory

For a typical company processing 300 transactions/month:

| Month | Rule Match | History | AI  | Escalated | AI Cost  |
| ----- | ---------- | ------- | --- | --------- | -------- |
| 1     | 20%        | 10%     | 50% | 20%       | \~\$15   |
| 2     | 55%        | 15%     | 20% | 10%       | \~\$6    |
| 3     | 75%        | 10%     | 10% | 5%        | \~\$3    |
| 6     | 90%        | 5%      | 3%  | 2%        | \~\$1    |
| 12    | 95%        | 3%      | 1%  | 1%        | \~\$0.30 |

<Tip>
  The key insight: AI costs are front-loaded. As confirmed entries turn into rules, AI inference
  becomes rare. In the projection above, AI costs for transaction classification fall to about
  \$1/month by month 6.
</Tip>
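The AI Cost column is consistent with a flat cost of roughly \$0.10 per AI call (about \$15 for 150 calls in month 1). That per-call figure is inferred from the table, not a published price. Under that assumption, the column can be reproduced as follows:

```typescript
const TRANSACTIONS_PER_MONTH = 300;
const COST_PER_AI_CALL_USD = 0.1; // inferred from the table; an assumption, not a quoted price

// Monthly AI cost = transactions x share routed to AI x cost per call, rounded to cents.
function monthlyAiCost(aiShare: number): number {
  const aiCalls = TRANSACTIONS_PER_MONTH * aiShare;
  return Math.round(aiCalls * COST_PER_AI_CALL_USD * 100) / 100;
}

// [month, share of transactions reaching AI inference] from the table above.
const aiShareByMonth: Array<[number, number]> = [
  [1, 0.5],
  [2, 0.2],
  [3, 0.1],
  [6, 0.03],
  [12, 0.01],
];

const costs = aiShareByMonth.map(([month, share]) => ({ month, usd: monthlyAiCost(share) }));
// usd values: 15, 6, 3, 0.9, 0.3
```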

## Rule Management

### Viewing Rules

```bash theme={null}
lim stats
# Shows rule count, match rates, and confidence distribution
```

### Manual Rule Creation

For transactions you know in advance, create rules manually for instant classification:

```bash theme={null}
# Via the REST API
POST /v1/companies/:id/matching-rules
{
  "name": "Stripe processing fee",
  "conditions": {
    "counterparty_contains": "stripe",
    "direction": "outflow",
    "amount_max": 50000
  },
  "template": {
    "lines": [
      { "side": "debit", "accountCode": "5401", "ratio": 1.0 },
      { "side": "credit", "accountCode": "1102", "ratio": 1.0 }
    ]
  },
  "priority": 5
}
```

Manual rules have higher priority (lower number) than learned rules, so they take precedence.
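Since a journal entry must balance, it can be worth checking a template before submitting it. The sketch below is a client-side check under the assumption that debit and credit ratios should sum to the same total, as they do in both examples on this page. `isBalanced` is a hypothetical helper, and whether the API performs an equivalent validation is not stated here.

```typescript
type TemplateLine = { side: "debit" | "credit"; accountCode: string; ratio: number };

// True when debit and credit ratios add up to the same total (within a small tolerance).
function isBalanced(lines: TemplateLine[], tolerance = 1e-9): boolean {
  const sum = (side: TemplateLine["side"]): number =>
    lines.filter((l) => l.side === side).reduce((acc, l) => acc + l.ratio, 0);
  return Math.abs(sum("debit") - sum("credit")) <= tolerance;
}

const stripeTemplate: TemplateLine[] = [
  { side: "debit", accountCode: "5401", ratio: 1.0 },
  { side: "credit", accountCode: "1102", ratio: 1.0 },
];
const brokenTemplate: TemplateLine[] = [
  { side: "debit", accountCode: "5401", ratio: 0.9 },
  { side: "credit", accountCode: "1102", ratio: 1.0 },
];

const stripeOk = isBalanced(stripeTemplate); // true
const brokenOk = isBalanced(brokenTemplate); // false
```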

## Architecture

The judgment engine lives in the `@repo/engine` package:

```
packages/engine/src/judgment/
├── engine.ts           # Main 4-step pipeline
├── rule-matcher.ts     # Step 1: Pattern matching against rules
├── history-matcher.ts  # Step 2: Similar transaction lookup
├── ai-inferrer.ts      # Step 3: LLM classification
├── confidence.ts       # Confidence score management
├── learner.ts          # Learning from confirmed entries
└── nl-engine.ts        # Natural language input processing
```

Each step is independently testable and the pipeline is extensible -- new steps can be inserted without changing the overall flow.
