Judgment Engine

The judgment engine is the core of lim’s automation. It takes a business event (a bank transaction, a natural language description, an invoice) and decides how to classify it into a journal entry.

The 4-Step Pipeline

Every transaction passes through four steps, in order. The engine stops at the first step that produces a confident result.
Input: { counterparty: "AWS", amount: 11000, direction: "outflow" }


┌──────────────────────────────┐
│  Step 1: RULE MATCH          │  ← Zero cost. Instant.
│  Check learned rules.        │
│  Confidence threshold: 0.85  │
└──────────┬───────────────────┘
           │ No match

┌──────────────────────────────┐
│  Step 2: HISTORY MATCH       │  ← Zero cost. SQL lookup.
│  Find similar past entries.  │
│  Confidence threshold: 0.70  │
└──────────┬───────────────────┘
           │ No match

┌──────────────────────────────┐
│  Step 3: AI INFERENCE        │  ← LLM API call. Last resort.
│  Ask Claude/LLM to classify. │
│  Confidence threshold: 0.60  │
└──────────┬───────────────────┘
           │ Low confidence

┌──────────────────────────────┐
│  Step 4: ESCALATE            │  ← Human review required.
│  Flag for manual review.     │
│  Confidence: 0               │
└──────────────────────────────┘
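
The four steps above can be sketched as a single dispatch function. This is an illustrative sketch, not the actual code in engine.ts — the callback names and the Verdict shape are assumptions:

```typescript
// A minimal sketch of the 4-step pipeline. Each earlier step is tried
// first; the engine stops at the first sufficiently confident result.
type Verdict = { step: "rule" | "history" | "ai" | "escalate"; confidence: number };

function judge(
  matchRules: () => Verdict | null,   // Step 1: learned/manual rules
  matchHistory: () => Verdict | null, // Step 2: similar past entries
  inferWithAI: () => Verdict,         // Step 3: LLM call, last resort
): Verdict {
  const rule = matchRules();
  if (rule && rule.confidence >= 0.85) return rule;

  const hist = matchHistory();
  if (hist && hist.confidence >= 0.7) return hist;

  const ai = inferWithAI();
  if (ai.confidence >= 0.6) return ai;

  // Step 4: nothing was confident enough — escalate to a human.
  return { step: "escalate", confidence: 0 };
}
```

Note that a rule matching below its 0.85 threshold simply falls through to the next step rather than aborting the pipeline.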

Step 1: Rule Match

The fastest and cheapest step. Rules are deterministic pattern-matching conditions stored in the matching_rule table.

How Rules Work

Each rule has:
  • Conditions — Pattern to match (counterparty name, direction, amount range)
  • Template — The journal entry to produce (accounts, tax code, amounts)
  • Confidence — How reliable this rule is (0.00 to 0.99)
  • Auto-post threshold — Confidence level at which entries are posted without human confirmation (default: 0.95)
Example rule:
{
  "name": "Auto: AWS",
  "conditions": {
    "counterparty_contains": "aws",
    "direction": "outflow"
  },
  "template": {
    "lines": [
      { "side": "debit",  "accountCode": "5101", "ratio": 0.909 },
      { "side": "debit",  "accountCode": "1501", "ratio": 0.091 },
      { "side": "credit", "accountCode": "2101", "ratio": 1.0 }
    ]
  },
  "confidence": 0.91,
  "autoPostThreshold": 0.95
}
When a bank transaction with counterparty “AWS” arrives (as in the pipeline input above), this rule matches on counterparty_contains: "aws" and produces:
Debit  Communication (5101)     ¥10,000  (90.9% of ¥11,000)
Debit  Input VAT (1501)          ¥1,000  (9.1% of ¥11,000)
Credit Accounts payable (2101)  ¥11,000
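
Condition checking and template expansion could look like the sketch below. The interfaces and function names are illustrative, not the actual rule-matcher.ts API; amounts are rounded to whole yen, and the stored ratios 0.909/0.091 are decimal approximations of the exact 10/11 and 1/11 tax-exclusive split shown above:

```typescript
// Illustrative sketch of rule matching and template expansion.
interface Txn { counterparty: string; amount: number; direction: "inflow" | "outflow" }
interface Conditions { counterparty_contains?: string; direction?: string }
interface TemplateLine { side: "debit" | "credit"; accountCode: string; ratio: number }

// A rule matches when every condition it specifies holds for the transaction.
function ruleMatches(c: Conditions, t: Txn): boolean {
  if (c.counterparty_contains &&
      !t.counterparty.toLowerCase().includes(c.counterparty_contains)) return false;
  if (c.direction && c.direction !== t.direction) return false;
  return true;
}

// Expand ratio-based template lines into concrete amounts (rounded to whole yen).
function expandTemplate(lines: TemplateLine[], amount: number) {
  return lines.map((l) => ({ ...l, amount: Math.round(amount * l.ratio) }));
}
```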

Rule Matching Priority

When multiple rules match, they’re evaluated by priority (lower number = higher priority):
Priority   Source                                           Example
1-10       Manual rules (user-created)                      “All Stripe payments go to Payment Processing Fees”
50         Learned rules (auto-created from confirmations)  “AWS = Communication”
100        Default rules (system)                           “Unknown outflow = Miscellaneous expense”
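
When several candidate rules match the same transaction, selection reduces to taking the lowest priority number — a sketch, with field names assumed:

```typescript
interface MatchedRule { name: string; priority: number }

// Pick the winning rule: the lowest priority number takes precedence.
function pickRule(matches: MatchedRule[]): MatchedRule | null {
  if (matches.length === 0) return null;
  return matches.reduce((best, r) => (r.priority < best.priority ? r : best));
}
```

So a user-created rule at priority 5 always beats a learned rule at priority 50, even if both match.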

Step 2: History Match

When no rule matches, lim searches past journal entries for similar transactions. The history matcher looks for entries with:
  • Similar counterparty name (fuzzy string matching)
  • Same direction (inflow/outflow)
  • Similar amount range
If a match is found with sufficient similarity, it suggests the same account classification.
Input: { counterparty: "Amazon Web Svcs", amount: 13200 }

History search:
  Found: "Amazon Web Services" → Communication (5101) — used 3 times
  Similarity: 0.92
  Confidence: 0.78

Suggestion: Communication (5101) + Input VAT (1501)
History matches always require human confirmation. They never auto-post, even if confidence is high. This is because the match is probabilistic, not deterministic.
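
The fuzzy counterparty comparison could use any string-similarity metric; one common choice is the Dice coefficient over character bigrams. This is a sketch under that assumption — the metric history-matcher.ts actually uses (and the similarity scores it produces) may differ:

```typescript
// Split a string into lowercase character bigrams, e.g. "aws" -> ["aw", "ws"].
function bigrams(s: string): string[] {
  const t = s.toLowerCase();
  const out: string[] = [];
  for (let i = 0; i < t.length - 1; i++) out.push(t.slice(i, i + 2));
  return out;
}

// Dice coefficient: 2 * |shared bigrams| / (|A| + |B|), in [0, 1].
function similarity(a: string, b: string): number {
  const A = bigrams(a);
  const B = bigrams(b);
  if (A.length === 0 || B.length === 0) return a.toLowerCase() === b.toLowerCase() ? 1 : 0;
  const counts = new Map<string, number>();
  for (const g of A) counts.set(g, (counts.get(g) ?? 0) + 1);
  let overlap = 0;
  for (const g of B) {
    const c = counts.get(g) ?? 0;
    if (c > 0) {
      overlap++;
      counts.set(g, c - 1);
    }
  }
  return (2 * overlap) / (A.length + B.length);
}
```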

Step 3: AI Inference

When rules and history can’t resolve the transaction, lim calls an LLM to classify it. The AI receives:
  • The transaction details (counterparty, amount, description, date)
  • The company’s chart of accounts (codes and names)
  • Recent journal entry examples for context
The AI returns:
  • Suggested account classification
  • Confidence score
  • Reasoning
Input: { counterparty: "WeWork", amount: 55000, description: "coworking space March" }

AI Response:
  Account: Rent expense (7101) + Input VAT (1501)
  Confidence: 0.82
  Reasoning: "Coworking space is an office rent expense. Standard 10% consumption tax applies."
AI inference results always require human confirmation. This is by design — the confirmation is what powers the learning loop.
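
A minimal sketch of what the inference call might look like. The prompt wording, field names, and JSON reply format below are assumptions for illustration, not lim's actual prompt or the ai-inferrer.ts API:

```typescript
interface Account { code: string; name: string }
interface AIResult { accountCode: string; confidence: number; reasoning: string }

// Build a classification prompt from the transaction and the chart of accounts.
function buildPrompt(
  txn: { counterparty: string; amount: number; description?: string },
  accounts: Account[],
): string {
  return [
    "Classify this transaction into one account.",
    `Counterparty: ${txn.counterparty}`,
    `Amount: ${txn.amount}`,
    txn.description ? `Description: ${txn.description}` : "",
    "Chart of accounts:",
    ...accounts.map((a) => `  ${a.code} ${a.name}`),
    'Reply as JSON: {"accountCode": "...", "confidence": 0.0, "reasoning": "..."}',
  ].filter(Boolean).join("\n");
}

// Parse the model's JSON reply, treating malformed output as zero confidence
// (which sends the transaction to Step 4: Escalate).
function parseAIResult(raw: string): AIResult {
  try {
    const r = JSON.parse(raw);
    return {
      accountCode: String(r.accountCode),
      confidence: Number(r.confidence),
      reasoning: String(r.reasoning),
    };
  } catch {
    return { accountCode: "", confidence: 0, reasoning: "unparseable response" };
  }
}
```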

Step 4: Escalate

When all three steps fail to produce a confident classification, the transaction is escalated to a human. This happens when:
  • No rule matches
  • No similar history exists
  • AI confidence is below 0.60
  • The transaction is genuinely ambiguous
Escalated transactions appear as draft journal entries that need manual classification.

Confidence Scores

Confidence scores range from 0.00 to 0.99 and drive automation decisions.

How Confidence Changes

Action        Effect                                          Example
Confirmed     +0.03 per confirmation (cap 0.99)               Rule at 0.85 → 0.88 after confirmation
Edited        No change to confidence (template is updated)   Rule template corrected but confidence stays
Rejected      -0.10 per rejection                             Rule at 0.85 → 0.75 after rejection
Deactivated   Rule disabled when confidence drops below 0.50  Consistently wrong rule is turned off
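
These adjustments reduce to a small pure function. A sketch — rounding to two decimals here avoids floating-point drift, which is an assumption about confidence.ts, not a documented behavior:

```typescript
type Feedback = "confirmed" | "edited" | "rejected";

// Apply one piece of human feedback to a rule's confidence score.
function updateConfidence(
  current: number,
  action: Feedback,
): { confidence: number; active: boolean } {
  let c = current;
  if (action === "confirmed") c = Math.min(0.99, c + 0.03); // reinforce, capped
  if (action === "rejected") c = c - 0.1;                   // penalize
  // "edited" leaves confidence unchanged; only the template is updated.
  c = Math.round(c * 100) / 100;
  return { confidence: c, active: c >= 0.5 }; // below 0.50 the rule is deactivated
}
```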

Confidence Thresholds

Threshold     Behavior
>= 0.95       Auto-post. Entry is created without human confirmation.
0.85 - 0.94   Suggest. Entry is created as draft, shown to user for confirmation.
0.60 - 0.84   Low confidence. Entry is created as draft with a warning.
< 0.60        Escalate. Marked for human review.
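
The thresholds map directly onto a dispatch function (a sketch; the action names are illustrative):

```typescript
type Action = "auto-post" | "suggest" | "draft-with-warning" | "escalate";

// Map a confidence score to the automation behavior from the table above.
function decide(confidence: number): Action {
  if (confidence >= 0.95) return "auto-post";
  if (confidence >= 0.85) return "suggest";
  if (confidence >= 0.6) return "draft-with-warning";
  return "escalate";
}
```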

The Learning Flywheel

The judgment engine improves through a feedback loop:
┌─────────────────────────────────────────────────────────┐
│                                                          │
│  Transaction arrives                                     │
│       │                                                  │
│       ▼                                                  │
│  Judgment engine classifies it                           │
│       │                                                  │
│       ├─ Rule match (auto-post if confidence >= 0.95)    │
│       │                                                  │
│       ├─ History/AI match (suggest to human)             │
│       │       │                                          │
│       │       ▼                                          │
│       │  Human confirms or edits                         │
│       │       │                                          │
│       │       ▼                                          │
│       │  learnFromConfirmedEntry()                       │
│       │       │                                          │
│       │       ├─ Existing rule? → Reinforce (+0.03)      │
│       │       │                                          │
│       │       └─ No rule? → Create new rule              │──┐
│       │           confidence: 0.85                       │  │
│       │           source: "learned"                      │  │
│       │                                                  │  │
│       └─ Escalated (human classifies from scratch)       │  │
│               │                                          │  │
│               ▼                                          │  │
│          New rule created ────────────────────────────────┘  │
│                                                              │
│  Next similar transaction → Rule match (no AI cost) ◄────────┘
│                                                          │
└──────────────────────────────────────────────────────────┘

Learning in Practice

When learnFromConfirmedEntry() is called:
  1. Existing rule found (same counterparty pattern) — the rule’s confidence is increased by 0.03 and the template is updated to match the confirmed entry.
  2. No existing rule — a new rule is created with:
    • confidence: 0.85 (starts below auto-post threshold)
    • source: "learned" (distinguishes from manually created rules)
    • priority: 50 (lower priority than manual rules)
    • Counterparty pattern derived from the confirmed entry
After 4 more confirmations (0.85 + 0.03 x 4 = 0.97), the rule crosses the auto-post threshold and future matching transactions are posted without human input.
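
The reinforce-or-create logic described above could be sketched as follows. The function name comes from the docs, but the rule shape and pattern-matching key are assumptions, not learner.ts's actual implementation:

```typescript
interface LearnedRule {
  pattern: string;
  confidence: number;
  source: "manual" | "learned";
  priority: number;
}

// Sketch of the learning step: reinforce an existing rule, or create a new one.
function learnFromConfirmedEntry(rules: LearnedRule[], pattern: string): LearnedRule {
  const existing = rules.find((r) => r.pattern === pattern);
  if (existing) {
    // Reinforce: +0.03 per confirmation, capped at 0.99.
    existing.confidence = Math.min(0.99, Math.round((existing.confidence + 0.03) * 100) / 100);
    return existing;
  }
  // Create: starts below the 0.95 auto-post threshold, at learned-rule priority.
  const created: LearnedRule = { pattern, confidence: 0.85, source: "learned", priority: 50 };
  rules.push(created);
  return created;
}
```

Replaying the arithmetic from above: the rule is created at 0.85, and four further confirmations lift it to 0.97, past the 0.95 auto-post threshold.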

Cost Trajectory

For a typical company processing 300 transactions/month:
Month   Rule Match   History   AI    Escalated   AI Cost
1       20%          10%       50%   20%         ~$15
2       55%          15%       20%   10%         ~$6
3       75%          10%       10%   5%          ~$3
6       90%          5%        3%    2%          ~$1
12      95%          3%        1%    1%          ~$0.30
The key insight: AI costs are front-loaded. The system learns quickly and AI inference becomes rare. By month 6, most companies see around $1/month in AI costs for transaction classification.

Rule Management

Viewing Rules

lim stats
# Shows rule count, match rates, and confidence distribution

Manual Rule Creation

For transactions you know in advance, create rules manually for instant classification:
# Via the REST API
POST /v1/companies/:id/matching-rules
{
  "name": "Stripe processing fee",
  "conditions": {
    "counterparty_contains": "stripe",
    "direction": "outflow",
    "amount_max": 50000
  },
  "template": {
    "lines": [
      { "side": "debit", "accountCode": "5401", "ratio": 1.0 },
      { "side": "credit", "accountCode": "1102", "ratio": 1.0 }
    ]
  },
  "priority": 5
}
Manual rules have higher priority (lower number) than learned rules, so they take precedence.

Architecture

The judgment engine lives in the @repo/engine package:
packages/engine/src/judgment/
├── engine.ts           # Main 4-step pipeline
├── rule-matcher.ts     # Step 1: Pattern matching against rules
├── history-matcher.ts  # Step 2: Similar transaction lookup
├── ai-inferrer.ts      # Step 3: LLM classification
├── confidence.ts       # Confidence score management
├── learner.ts          # Learning from confirmed entries
└── nl-engine.ts        # Natural language input processing
Each step is independently testable and the pipeline is extensible — new steps can be inserted without changing the overall flow.