ADR-002: Deterministic Title State
Status: Accepted
Date: 2025-01-20
Decision Makers: Engineering Team
Related Issues: ALI-20
Context
CoquiTitle needs to determine the "title state" of a property: whether the title is clear, has issues that need resolution, or is defective. This determination affects:
- Whether a property can close
- What remediation steps are needed
- Risk assessment for lenders
We considered using LLM inference to determine title state, but this created problems:
- Non-deterministic results (same inputs could yield different states)
- Difficult to explain why a title was marked as having defects
- Hard to test and validate
Decision
Calculate title state using deterministic, rule-based logic applied to extracted data.
The title state engine:
- Takes extraction data as input
- Applies a defined set of rules
- Returns a title state with specific defect codes
- Provides explanations for each defect
Rationale
Reproducibility: Given the same extraction data, the title state will always be the same. This is critical for legal/financial decisions.
Auditability: Each defect has a specific rule that triggered it. Auditors can trace exactly why a title was flagged.
Testability: We can write unit tests for specific rule scenarios. Because each rule depends only on the extraction data, a test needs nothing more than a sample input and the expected defects.
Explainability: Users see exactly which rules triggered which defects, not an opaque LLM decision.
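The reproducibility and testability points can be illustrated with a few unit tests. This is a minimal sketch, not the project's test suite: the `Defect` dataclass, the `find_defects` helper, and the test names are assumptions made for illustration, and only the `MISSING_OWNER` rule is taken from this ADR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Defect:
    # Hypothetical record type; carries the rule id so a defect can be traced.
    code: str
    rule_id: str
    description: str


# One rule copied from this ADR so the sketch runs on its own.
RULES = [
    {
        "id": "MISSING_OWNER",
        "defect_code": "TD001",
        "description": "No owner name found in extractions",
        "condition": lambda e: not e.get("owner_name"),
    },
]


def find_defects(extractions: dict) -> list:
    # Hypothetical helper: returns one Defect per rule whose condition holds.
    return [
        Defect(r["defect_code"], r["id"], r["description"])
        for r in RULES
        if r["condition"](extractions)
    ]


def test_missing_owner_is_flagged():
    defects = find_defects({"owner_name": ""})
    assert [d.code for d in defects] == ["TD001"]
    assert defects[0].rule_id == "MISSING_OWNER"


def test_owner_present_is_clean():
    assert find_defects({"owner_name": "Ana Torres"}) == []


def test_same_input_gives_same_result():
    data = {"owner_name": None}
    assert find_defects(data) == find_defects(data)
```

Each test pairs one input with one expected outcome, which is the kind of check that is hard to write against an LLM whose output can vary between runs.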
Alternatives Considered
Alternative A: LLM-based Title State
Description: Use LLM to analyze extractions and determine title state.
Pros:
- More flexible for edge cases
- Can reason about complex scenarios
- Less code to maintain
Cons:
- Non-deterministic
- Hard to explain decisions
- Can't unit test effectively
- May miss obvious issues or hallucinate problems
Why not chosen: Title state decisions have legal implications; they must be reproducible and explainable.
Alternative B: Hybrid Approach
Description: Use rules for clear cases, LLM for ambiguous cases.
Pros:
- Deterministic rules for clear cases, LLM flexibility for the rest
- LLM handles edge cases
Cons:
- Complexity of two systems
- Still non-deterministic for edge cases
- Harder to test
Why not chosen: The boundary between "clear" and "ambiguous" is itself ambiguous. Better to have one consistent system.
Consequences
Positive
- 100% reproducible title state calculations
- Full auditability with defect codes and rule references
- Rule engine can be covered comprehensively by unit tests
- Clear documentation of what triggers each defect
Negative
- Must maintain rule definitions
- New defect types require code changes
- Edge cases need explicit handling
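To make the maintenance cost concrete, the sketch below shows what adding one defect type might look like. Everything about the new rule is hypothetical: the `UNRELEASED_LIEN` id, the `liens` field, the `"warning"` severity, and the code `TD003` are invented for illustration and are not part of the actual rule set.

```python
# Existing rule, copied from this ADR so the sketch runs on its own.
TITLE_RULES = [
    {
        "id": "MISSING_OWNER",
        "description": "No owner name found in extractions",
        "condition": lambda e: not e.get("owner_name"),
        "severity": "critical",
        "defect_code": "TD001",
    },
]


def has_unreleased_lien(extractions: dict) -> bool:
    # Edge case handled explicitly: "liens" may be absent or None.
    liens = extractions.get("liens") or []
    return any(not lien.get("released", False) for lien in liens)


# HYPOTHETICAL rule: the id, the "liens" field, the "warning" severity,
# and the code TD003 are illustrative only.
TITLE_RULES.append({
    "id": "UNRELEASED_LIEN",
    "description": "Lien recorded without a release",
    "condition": has_unreleased_lien,
    "severity": "warning",
    "defect_code": "TD003",
})

# An owner is present and "liens" is None, so neither rule fires.
triggered = [
    r["defect_code"]
    for r in TITLE_RULES
    if r["condition"]({"owner_name": "Ana Torres", "liens": None})
]
print(triggered)  # []
```

A named function is used for the condition here because the explicit `None` handling is easier to read and to test than it would be inside a lambda.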
Neutral
- Separates extraction (LLM) from decision-making (rules)
Implementation
Rule Definition Format
TITLE_RULES = [
    {
        "id": "MISSING_OWNER",
        "description": "No owner name found in extractions",
        "condition": lambda e: not e.get("owner_name"),
        "severity": "critical",
        "defect_code": "TD001"
    },
    {
        "id": "ACTIVE_EMBARGO",
        "description": "Active embargo found on property",
        "condition": lambda e: any(
            emb.get("status") == "active"
            for emb in e.get("embargos", [])
        ),
        "severity": "critical",
        "defect_code": "TD002"
    },
    # ... more rules
]
Title State Calculation
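from dataclasses import dataclass


# Assumed minimal shape; the actual TitleState definition is not shown in this ADR.
@dataclass
class TitleState:
    state: str      # "clear", "issues", or "defective"
    defects: list   # one dict per triggered rule


def calculate_title_state(extractions: dict) -> TitleState:
    # Evaluate every rule so the result lists all defects, not just the first.
    defects = []
    for rule in TITLE_RULES:
        if rule["condition"](extractions):
            defects.append({
                "code": rule["defect_code"],
                "description": rule["description"],
                "severity": rule["severity"]
            })

    # Any critical defect makes the title defective; other defects are issues.
    if any(d["severity"] == "critical" for d in defects):
        state = "defective"
    elif defects:
        state = "issues"
    else:
        state = "clear"
    return TitleState(state=state, defects=defects)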
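As a usage illustration, the following self-contained sketch runs the two rules above against sample inputs. The `TitleState` dataclass shape and the extraction payloads (`clean`, `embargoed`) are assumptions; the rules and the state logic follow this ADR, condensed so the example runs on its own.

```python
from dataclasses import dataclass


@dataclass
class TitleState:
    # Assumed shape, mirroring the call TitleState(state=..., defects=...).
    state: str
    defects: list


TITLE_RULES = [
    {"id": "MISSING_OWNER",
     "description": "No owner name found in extractions",
     "condition": lambda e: not e.get("owner_name"),
     "severity": "critical", "defect_code": "TD001"},
    {"id": "ACTIVE_EMBARGO",
     "description": "Active embargo found on property",
     "condition": lambda e: any(
         emb.get("status") == "active" for emb in e.get("embargos", [])),
     "severity": "critical", "defect_code": "TD002"},
]


def calculate_title_state(extractions: dict) -> TitleState:
    # Collect one defect entry per rule whose condition holds.
    defects = [
        {"code": r["defect_code"], "description": r["description"],
         "severity": r["severity"]}
        for r in TITLE_RULES if r["condition"](extractions)
    ]
    if any(d["severity"] == "critical" for d in defects):
        state = "defective"
    elif defects:
        state = "issues"
    else:
        state = "clear"
    return TitleState(state=state, defects=defects)


# Hypothetical extraction payloads, for illustration only.
clean = {"owner_name": "Ana Torres", "embargos": []}
embargoed = {"owner_name": "Ana Torres", "embargos": [{"status": "active"}]}

print(calculate_title_state(clean).state)  # clear

result = calculate_title_state(embargoed)
print(result.state)  # defective
for d in result.defects:
    print(f'{d["code"]} ({d["severity"]}): {d["description"]}')
# TD002 (critical): Active embargo found on property
```

Both rules shown are critical, so these inputs only ever produce "clear" or "defective"; the "issues" state is reached only once a non-critical rule exists.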