Judgment Engine

The judgment engine is the core of lim’s automation. It takes a business event (a bank transaction, a natural language description, an invoice) and decides how to classify it into a journal entry.

The 4-Step Pipeline

Every transaction passes through four steps, in order. The engine stops at the first step that produces a confident result.
Input: { counterparty: "AWS", amount: 11000, direction: "outflow" }


┌──────────────────────────────┐
│  Step 1: RULE MATCH          │  ← Zero cost. Instant.
│  Check learned rules.        │
│  Confidence threshold: 0.85  │
└──────────┬───────────────────┘
           │ No match

┌──────────────────────────────┐
│  Step 2: HISTORY MATCH       │  ← Zero cost. SQL lookup.
│  Find similar past entries.  │
│  Confidence threshold: 0.70  │
└──────────┬───────────────────┘
           │ No match

┌──────────────────────────────┐
│  Step 3: AI INFERENCE        │  ← LLM API call. Last resort.
│  Ask Claude/LLM to classify. │
│  Confidence threshold: 0.60  │
└──────────┬───────────────────┘
           │ Low confidence

┌──────────────────────────────┐
│  Step 4: ESCALATE            │  ← Human review required.
│  Flag for manual review.     │
│  Confidence: 0               │
└──────────────────────────────┘
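
The four steps above can be sketched as a single dispatch function. This is an illustrative sketch, not the actual code in engine.ts — the callback names and the Verdict shape are assumptions:

```typescript
// A minimal sketch of the 4-step pipeline. Each earlier step is tried
// first; the engine stops at the first sufficiently confident result.
type Verdict = { step: "rule" | "history" | "ai" | "escalate"; confidence: number };

function judge(
  matchRules: () => Verdict | null,   // Step 1: learned/manual rules
  matchHistory: () => Verdict | null, // Step 2: similar past entries
  inferWithAI: () => Verdict,         // Step 3: LLM call, last resort
): Verdict {
  const rule = matchRules();
  if (rule && rule.confidence >= 0.85) return rule;

  const hist = matchHistory();
  if (hist && hist.confidence >= 0.7) return hist;

  const ai = inferWithAI();
  if (ai.confidence >= 0.6) return ai;

  // Step 4: nothing was confident enough — escalate to a human.
  return { step: "escalate", confidence: 0 };
}
```

Note that a rule matching below its 0.85 threshold simply falls through to the next step rather than aborting the pipeline.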

Step 1: Rule Match

The fastest and cheapest step. Rules are deterministic pattern-matching conditions stored in the matching_rule table.

How Rules Work

Each rule has:
  • Conditions — Pattern to match (counterparty name, direction, amount range)
  • Template — The journal entry to produce (accounts, tax code, amounts)
  • Confidence — How reliable this rule is (0.00 to 0.99)
  • Auto-post threshold — Confidence level at which entries are posted without human confirmation (default: 0.95)
Example rule:
{
  "name": "Auto: AWS",
  "conditions": {
    "counterparty_contains": "aws",
    "direction": "outflow"
  },
  "template": {
    "lines": [
      { "side": "debit",  "accountCode": "5101", "ratio": 0.909 },
      { "side": "debit",  "accountCode": "1501", "ratio": 0.091 },
      { "side": "credit", "accountCode": "2101", "ratio": 1.0 }
    ]
  },
  "confidence": 0.91,
  "autoPostThreshold": 0.95
}
When a bank transaction with counterparty “AWS” arrives (as in the pipeline input above), this rule matches on counterparty_contains: "aws" and produces:
Debit  Communication (5101)     ¥10,000  (90.9% of ¥11,000)
Debit  Input VAT (1501)          ¥1,000  (9.1% of ¥11,000)
Credit Accounts payable (2101)  ¥11,000
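
Condition checking and template expansion could look like the sketch below. The interfaces and function names are illustrative, not the actual rule-matcher.ts API; amounts are rounded to whole yen, and the stored ratios 0.909/0.091 are decimal approximations of the exact 10/11 and 1/11 tax-exclusive split shown above:

```typescript
// Illustrative sketch of rule matching and template expansion.
interface Txn { counterparty: string; amount: number; direction: "inflow" | "outflow" }
interface Conditions { counterparty_contains?: string; direction?: string }
interface TemplateLine { side: "debit" | "credit"; accountCode: string; ratio: number }

// A rule matches when every condition it specifies holds for the transaction.
function ruleMatches(c: Conditions, t: Txn): boolean {
  if (c.counterparty_contains &&
      !t.counterparty.toLowerCase().includes(c.counterparty_contains)) return false;
  if (c.direction && c.direction !== t.direction) return false;
  return true;
}

// Expand ratio-based template lines into concrete amounts (rounded to whole yen).
function expandTemplate(lines: TemplateLine[], amount: number) {
  return lines.map((l) => ({ ...l, amount: Math.round(amount * l.ratio) }));
}
```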

Rule Matching Priority

When multiple rules match, they’re evaluated by priority (lower number = higher priority):
Priority   Source                                           Example
1-10       Manual rules (user-created)                      “All Stripe payments go to Payment Processing Fees”
50         Learned rules (auto-created from confirmations)  “AWS = Communication”
100        Default rules (system)                           “Unknown outflow = Miscellaneous expense”
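
When several candidate rules match the same transaction, selection reduces to taking the lowest priority number — a sketch, with field names assumed:

```typescript
interface MatchedRule { name: string; priority: number }

// Pick the winning rule: the lowest priority number takes precedence.
function pickRule(matches: MatchedRule[]): MatchedRule | null {
  if (matches.length === 0) return null;
  return matches.reduce((best, r) => (r.priority < best.priority ? r : best));
}
```

So a user-created rule at priority 5 always beats a learned rule at priority 50, even if both match.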

Step 2: History Match

When no rule matches, lim searches past journal entries for similar transactions. The history matcher looks for entries with:
  • Similar counterparty name (fuzzy string matching)
  • Same direction (inflow/outflow)
  • Similar amount range
If a match is found with sufficient similarity, it suggests the same account classification.
Input: { counterparty: "Amazon Web Svcs", amount: 13200 }

History search:
  Found: "Amazon Web Services" → Communication (5101) — used 3 times
  Similarity: 0.92
  Confidence: 0.78

Suggestion: Communication (5101) + Input VAT (1501)
History matches always require human confirmation. They never auto-post, even if confidence is high. This is because the match is probabilistic, not deterministic.
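
The fuzzy counterparty comparison could use any string-similarity metric; one common choice is the Dice coefficient over character bigrams. This is a sketch under that assumption — the metric history-matcher.ts actually uses (and the similarity scores it produces) may differ:

```typescript
// Split a string into lowercase character bigrams, e.g. "aws" -> ["aw", "ws"].
function bigrams(s: string): string[] {
  const t = s.toLowerCase();
  const out: string[] = [];
  for (let i = 0; i < t.length - 1; i++) out.push(t.slice(i, i + 2));
  return out;
}

// Dice coefficient: 2 * |shared bigrams| / (|A| + |B|), in [0, 1].
function similarity(a: string, b: string): number {
  const A = bigrams(a);
  const B = bigrams(b);
  if (A.length === 0 || B.length === 0) return a.toLowerCase() === b.toLowerCase() ? 1 : 0;
  const counts = new Map<string, number>();
  for (const g of A) counts.set(g, (counts.get(g) ?? 0) + 1);
  let overlap = 0;
  for (const g of B) {
    const c = counts.get(g) ?? 0;
    if (c > 0) {
      overlap++;
      counts.set(g, c - 1);
    }
  }
  return (2 * overlap) / (A.length + B.length);
}
```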

Step 3: AI Inference

When rules and history can’t resolve the transaction, lim calls an LLM to classify it. The AI receives:
  • The transaction details (counterparty, amount, description, date)
  • The company’s chart of accounts (codes and names)
  • Recent journal entry examples for context
The AI returns:
  • Suggested account classification
  • Confidence score
  • Reasoning
Input: { counterparty: "WeWork", amount: 55000, description: "coworking space March" }

AI Response:
  Account: Rent expense (7101) + Input VAT (1501)
  Confidence: 0.82
  Reasoning: "Coworking space is an office rent expense. Standard 10% consumption tax applies."
AI inference results always require human confirmation. This is by design — the confirmation is what powers the learning loop.
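
A minimal sketch of what the inference call might look like. The prompt wording, field names, and JSON reply format below are assumptions for illustration, not lim's actual prompt or the ai-inferrer.ts API:

```typescript
interface Account { code: string; name: string }
interface AIResult { accountCode: string; confidence: number; reasoning: string }

// Build a classification prompt from the transaction and the chart of accounts.
function buildPrompt(
  txn: { counterparty: string; amount: number; description?: string },
  accounts: Account[],
): string {
  return [
    "Classify this transaction into one account.",
    `Counterparty: ${txn.counterparty}`,
    `Amount: ${txn.amount}`,
    txn.description ? `Description: ${txn.description}` : "",
    "Chart of accounts:",
    ...accounts.map((a) => `  ${a.code} ${a.name}`),
    'Reply as JSON: {"accountCode": "...", "confidence": 0.0, "reasoning": "..."}',
  ].filter(Boolean).join("\n");
}

// Parse the model's JSON reply, treating malformed output as zero confidence
// (which sends the transaction to Step 4: Escalate).
function parseAIResult(raw: string): AIResult {
  try {
    const r = JSON.parse(raw);
    return {
      accountCode: String(r.accountCode),
      confidence: Number(r.confidence),
      reasoning: String(r.reasoning),
    };
  } catch {
    return { accountCode: "", confidence: 0, reasoning: "unparseable response" };
  }
}
```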

Step 4: Escalate

When all three steps fail to produce a confident classification, the transaction is escalated to a human. This happens when:
  • No rule matches
  • No similar history exists
  • AI confidence is below 0.60
  • The transaction is genuinely ambiguous
Escalated transactions appear as draft journal entries that need manual classification.

Confidence Scores

Confidence scores range from 0.00 to 0.99 and drive automation decisions.

How Confidence Changes

Action        Effect                                          Example
Confirmed     +0.03 per confirmation (cap 0.99)               Rule at 0.85 → 0.88 after confirmation
Edited        No change to confidence (template is updated)   Rule template corrected but confidence stays
Rejected      -0.10 per rejection                             Rule at 0.85 → 0.75 after rejection
Deactivated   Rule disabled when confidence drops below 0.50  Consistently wrong rule is turned off
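
These adjustments reduce to a small pure function. A sketch — rounding to two decimals here avoids floating-point drift, which is an assumption about confidence.ts, not a documented behavior:

```typescript
type Feedback = "confirmed" | "edited" | "rejected";

// Apply one piece of human feedback to a rule's confidence score.
function updateConfidence(
  current: number,
  action: Feedback,
): { confidence: number; active: boolean } {
  let c = current;
  if (action === "confirmed") c = Math.min(0.99, c + 0.03); // reinforce, capped
  if (action === "rejected") c = c - 0.1;                   // penalize
  // "edited" leaves confidence unchanged; only the template is updated.
  c = Math.round(c * 100) / 100;
  return { confidence: c, active: c >= 0.5 }; // below 0.50 the rule is deactivated
}
```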

Confidence Thresholds

Threshold     Behavior
>= 0.95       Auto-post. Entry is created without human confirmation.
0.85 - 0.94   Suggest. Entry is created as draft, shown to user for confirmation.
0.60 - 0.84   Low confidence. Entry is created as draft with a warning.
< 0.60        Escalate. Marked for human review.
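
The thresholds map directly onto a dispatch function (a sketch; the action names are illustrative):

```typescript
type Action = "auto-post" | "suggest" | "draft-with-warning" | "escalate";

// Map a confidence score to the automation behavior from the table above.
function decide(confidence: number): Action {
  if (confidence >= 0.95) return "auto-post";
  if (confidence >= 0.85) return "suggest";
  if (confidence >= 0.6) return "draft-with-warning";
  return "escalate";
}
```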

The Learning Flywheel

The judgment engine improves through a feedback loop:
┌─────────────────────────────────────────────────────────┐
│                                                          │
│  Transaction arrives                                     │
│       │                                                  │
│       ▼                                                  │
│  Judgment engine classifies it                           │
│       │                                                  │
│       ├─ Rule match (auto-post if confidence >= 0.95)    │
│       │                                                  │
│       ├─ History/AI match (suggest to human)             │
│       │       │                                          │
│       │       ▼                                          │
│       │  Human confirms or edits                         │
│       │       │                                          │
│       │       ▼                                          │
│       │  learnFromConfirmedEntry()                       │
│       │       │                                          │
│       │       ├─ Existing rule? → Reinforce (+0.03)      │
│       │       │                                          │
│       │       └─ No rule? → Create new rule              │──┐
│       │           confidence: 0.85                       │  │
│       │           source: "learned"                      │  │
│       │                                                  │  │
│       └─ Escalated (human classifies from scratch)       │  │
│               │                                          │  │
│               ▼                                          │  │
│          New rule created ────────────────────────────────┘  │
│                                                              │
│  Next similar transaction → Rule match (no AI cost) ◄────────┘
│                                                          │
└──────────────────────────────────────────────────────────┘

Learning in Practice

When learnFromConfirmedEntry() is called:
  1. Existing rule found (same counterparty pattern) — the rule’s confidence is increased by 0.03 and the template is updated to match the confirmed entry.
  2. No existing rule — a new rule is created with:
    • confidence: 0.85 (starts below auto-post threshold)
    • source: "learned" (distinguishes from manually created rules)
    • priority: 50 (lower priority than manual rules)
    • Counterparty pattern derived from the confirmed entry
After 4 more confirmations (0.85 + 0.03 x 4 = 0.97), the rule crosses the auto-post threshold and future matching transactions are posted without human input.
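
The reinforce-or-create logic described above could be sketched as follows. The function name comes from the docs, but the rule shape and pattern-matching key are assumptions, not learner.ts's actual implementation:

```typescript
interface LearnedRule {
  pattern: string;
  confidence: number;
  source: "manual" | "learned";
  priority: number;
}

// Sketch of the learning step: reinforce an existing rule, or create a new one.
function learnFromConfirmedEntry(rules: LearnedRule[], pattern: string): LearnedRule {
  const existing = rules.find((r) => r.pattern === pattern);
  if (existing) {
    // Reinforce: +0.03 per confirmation, capped at 0.99.
    existing.confidence = Math.min(0.99, Math.round((existing.confidence + 0.03) * 100) / 100);
    return existing;
  }
  // Create: starts below the 0.95 auto-post threshold, at learned-rule priority.
  const created: LearnedRule = { pattern, confidence: 0.85, source: "learned", priority: 50 };
  rules.push(created);
  return created;
}
```

Replaying the arithmetic from above: the rule is created at 0.85, and four further confirmations lift it to 0.97, past the 0.95 auto-post threshold.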

Cost Trajectory

For a typical company processing 300 transactions/month:
Month   Rule Match   History   AI    Escalated   AI Cost
1       20%          10%       50%   20%         ~$15
2       55%          15%       20%   10%         ~$6
3       75%          10%       10%   5%          ~$3
6       90%          5%        3%    2%          ~$1
12      95%          3%        1%    1%          ~$0.30
The key insight: AI costs are front-loaded. The system learns quickly and AI inference becomes rare. By month 6, most companies see around $1/month in AI costs for transaction classification.

Rule Management

Viewing Rules

lim stats
# Shows rule count, match rates, and confidence distribution

Manual Rule Creation

For transactions you know in advance, create rules manually for instant classification:
# Via the REST API
POST /v1/companies/:id/matching-rules
{
  "name": "Stripe processing fee",
  "conditions": {
    "counterparty_contains": "stripe",
    "direction": "outflow",
    "amount_max": 50000
  },
  "template": {
    "lines": [
      { "side": "debit", "accountCode": "5401", "ratio": 1.0 },
      { "side": "credit", "accountCode": "1102", "ratio": 1.0 }
    ]
  },
  "priority": 5
}
Manual rules have higher priority (lower number) than learned rules, so they take precedence.

Architecture

The judgment engine lives in the @repo/engine package:
packages/engine/src/judgment/
├── engine.ts           # Main 4-step pipeline
├── rule-matcher.ts     # Step 1: Pattern matching against rules
├── history-matcher.ts  # Step 2: Similar transaction lookup
├── ai-inferrer.ts      # Step 3: LLM classification
├── confidence.ts       # Confidence score management
├── learner.ts          # Learning from confirmed entries
└── nl-engine.ts        # Natural language input processing
Each step is independently testable and the pipeline is extensible — new steps can be inserted without changing the overall flow.