ADR-002: Deterministic Title State

Status: Accepted

Date: 2025-01-20

Decision Makers: Engineering Team

Related Issues: ALI-20

Context

CoquiTitle needs to determine the "title state" of a property: whether the title is clear, has defects, or has issues that need resolution. This determination affects:

Whether a property can close
What remediation steps are needed
Risk assessment for lenders

We considered using LLM inference to determine title state, but this created problems:

Non-deterministic results (same inputs could yield different states)
Difficult to explain why a title was marked as having defects
Hard to test and validate

Decision

Calculate title state using deterministic, rule-based logic applied to extracted data.

The title state engine:

Takes extraction data as input
Applies a defined set of rules
Returns a title state with specific defect codes
Provides explanations for each defect

Rationale

Reproducibility: Given the same extraction data, the title state will always be the same. This is critical for legal/financial decisions.

Auditability: Each defect has a specific rule that triggered it. Auditors can trace exactly why a title was flagged.
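One way to picture that trace is a lookup from a defect code back to the rule that defines it. The sketch below is illustrative only: `RULES`, `RULE_INDEX`, and `explain_defect` are hypothetical names, and the two entries mirror the rules listed under Implementation with their conditions omitted.

```python
# Hypothetical subset of the rule table; only the fields needed for tracing.
RULES = [
    {"id": "MISSING_OWNER", "defect_code": "TD001",
     "description": "No owner name found in extractions"},
    {"id": "ACTIVE_EMBARGO", "defect_code": "TD002",
     "description": "Active embargo found on property"},
]

# Index rules by defect code so a reported defect can be traced to its rule.
RULE_INDEX = {rule["defect_code"]: rule for rule in RULES}


def explain_defect(defect_code: str) -> str:
    """Return a human-readable trace for a defect code (hypothetical helper)."""
    rule = RULE_INDEX[defect_code]
    return f"{defect_code} raised by rule {rule['id']}: {rule['description']}"


print(explain_defect("TD002"))
# TD002 raised by rule ACTIVE_EMBARGO: Active embargo found on property
```

Keeping defect codes unique per rule is what makes this lookup unambiguous; that uniqueness is an assumption here, not something the ADR states.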

Testability: Each rule depends only on the extraction data, so we can write unit tests that feed a rule a fixed input and assert on the defect it should or should not raise.
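As a rough illustration, rule conditions can be tested in isolation with plain assertions, in the style a runner such as pytest would collect. The function names and the "released" embargo status below are assumptions made for the example; a real suite would import the conditions from the rule engine instead of redefining them.

```python
# Hypothetical stand-ins for two rule conditions; a real test suite would
# import them from the rule engine rather than redefine them here.
def missing_owner(extractions: dict) -> bool:
    return not extractions.get("owner_name")


def active_embargo(extractions: dict) -> bool:
    return any(
        emb.get("status") == "active"
        for emb in extractions.get("embargos", [])
    )


def test_missing_owner_fires_when_name_absent_or_empty():
    assert missing_owner({})
    assert missing_owner({"owner_name": ""})


def test_missing_owner_silent_when_name_present():
    assert not missing_owner({"owner_name": "Jane Doe"})


def test_active_embargo_only_counts_active_status():
    # "released" is an assumed example of a non-active status value.
    assert not active_embargo({"embargos": [{"status": "released"}]})
    assert active_embargo(
        {"embargos": [{"status": "released"}, {"status": "active"}]}
    )
```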

Explainability: Users see exactly which rules triggered which defects, not an opaque LLM decision.

Alternatives Considered

Alternative A: LLM-based Title State

Description: Use LLM to analyze extractions and determine title state.

Pros:

More flexible for edge cases
Can reason about complex scenarios
Less code to maintain

Cons:

Non-deterministic
Hard to explain decisions
Can't unit test effectively
May miss obvious issues or hallucinate problems

Why not chosen: Title state decisions have legal implications; they must be reproducible and explainable.

Alternative B: Hybrid Approach

Description: Use rules for clear cases, LLM for ambiguous cases.

Pros:

Deterministic rules for clear cases, LLM flexibility for the rest
LLM handles edge cases

Cons:

Complexity of two systems
Still non-deterministic for edge cases
Harder to test

Why not chosen: The boundary between "clear" and "ambiguous" is itself ambiguous. Better to have one consistent system.

Consequences

Positive

Title state calculations are fully reproducible for identical extraction data
Full auditability with defect codes and rule references
Comprehensive test coverage of rule engine
Clear documentation of what triggers each defect

Negative

Must maintain rule definitions
New defect types require code changes
Edge cases need explicit handling

Neutral

Separates extraction (LLM) from decision-making (rules)

Implementation

Rule Definition Format

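# Each rule is a dict. "condition" receives the extraction dict and returns
# True when the defect applies.
TITLE_RULES = [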
    {
        "id": "MISSING_OWNER",
        "description": "No owner name found in extractions",
        "condition": lambda e: not e.get("owner_name"),
        "severity": "critical",
        "defect_code": "TD001"
    },
    {
        "id": "ACTIVE_EMBARGO",
        "description": "Active embargo found on property",
        "condition": lambda e: any(
            emb.get("status") == "active"
            for emb in e.get("embargos", [])
        ),
        "severity": "critical",
        "defect_code": "TD002"
    },
    # ... more rules
]

Title State Calculation

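from dataclasses import dataclass, field


@dataclass
class TitleState:
    # Assumed minimal shape; the real TitleState type is not shown in this ADR.
    state: str  # "clear", "issues", or "defective"
    defects: list = field(default_factory=list)


def calculate_title_state(extractions: dict) -> TitleState:
    # Collect one defect entry for every rule whose condition matches.
    defects = []
    for rule in TITLE_RULES:
        if rule["condition"](extractions):
            defects.append({
                "code": rule["defect_code"],
                "description": rule["description"],
                "severity": rule["severity"]
            })

    # Any critical defect makes the title defective; other defects are issues.
    if any(d["severity"] == "critical" for d in defects):
        state = "defective"
    elif defects:
        state = "issues"
    else:
        state = "clear"

    return TitleState(state=state, defects=defects)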
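To show how the three states come out of this calculation, here is a self-contained sketch that runs it end to end. Several pieces are assumptions made for illustration: the `TitleState` dataclass shape, the `MISSING_PARCEL_NUMBER` rule with its `warning` severity and `TD900` code, and the sample extraction fields. Only `MISSING_OWNER` comes from the rule list above.

```python
from dataclasses import dataclass, field


@dataclass
class TitleState:
    # Assumed shape; the ADR only shows the constructor call.
    state: str
    defects: list = field(default_factory=list)


TITLE_RULES = [
    {
        "id": "MISSING_OWNER",
        "description": "No owner name found in extractions",
        "condition": lambda e: not e.get("owner_name"),
        "severity": "critical",
        "defect_code": "TD001",
    },
    {
        # Hypothetical non-critical rule, present only to exercise "issues".
        "id": "MISSING_PARCEL_NUMBER",
        "description": "No parcel number found in extractions",
        "condition": lambda e: not e.get("parcel_number"),
        "severity": "warning",
        "defect_code": "TD900",
    },
]


def calculate_title_state(extractions: dict) -> TitleState:
    # Same logic as the calculation above, written as a comprehension.
    defects = [
        {
            "code": rule["defect_code"],
            "description": rule["description"],
            "severity": rule["severity"],
        }
        for rule in TITLE_RULES
        if rule["condition"](extractions)
    ]
    if any(d["severity"] == "critical" for d in defects):
        state = "defective"
    elif defects:
        state = "issues"
    else:
        state = "clear"
    return TitleState(state=state, defects=defects)


clear = calculate_title_state({"owner_name": "Jane Doe", "parcel_number": "123-456"})
issues = calculate_title_state({"owner_name": "Jane Doe"})
defective = calculate_title_state({"parcel_number": "123-456"})

print(clear.state, issues.state, defective.state)
# clear issues defective
```

Calling the function twice with the same dictionary yields equal `TitleState` values, which is the reproducibility property the Rationale relies on.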
