Data Model

CoquiTitle data is stored in the ridpr schema in Supabase PostgreSQL.

Core Tables

cases

Main table for title study requests.

| Column | Type | Description |
| --- | --- | --- |
| case_id | uuid | Primary key |
| org_id | uuid | Organization (multi-tenant) |
| created_by | uuid | User who created the case |
| finca_id | text | Property finca number |
| property_address | text | Property address |
| demarcacion_code | text | Registry demarcation code |
| status | text | Processing status |
| error_message | text | Error details if failed |
| document_count | integer | Number of uploaded documents |
| total_pages | integer | Total pages across documents |
| ocr_mode | text | OCR processing mode |
| extractor_model | text | LLM model for extraction |
| report_model | text | LLM model for report generation |
| current_run_id | uuid | Active pipeline run ID |
| pending_docs_raw_json | jsonb | Raw pending presentations data |
| pending_docs_normalized_json | jsonb | Processed pending docs |
| langfuse_trace_id | text | Observability trace ID |
| created_at | timestamptz | Creation timestamp |
| updated_at | timestamptz | Last update timestamp |

documents

Uploaded PDF documents for a case.

| Column | Type | Description |
| --- | --- | --- |
| doc_id | uuid | Primary key |
| case_id | uuid | FK to cases |
| filename | text | Original filename |
| s3_key | text | S3 storage path |
| page_count | integer | Number of pages |
| uploaded_at | timestamptz | Upload timestamp |

extractions

Structured data extracted from documents.

| Column | Type | Description |
| --- | --- | --- |
| extraction_id | uuid | Primary key |
| case_id | uuid | FK to cases |
| run_id | uuid | Pipeline run ID |
| schema_json | jsonb | Extracted structured data |
| pass1_schema_json | jsonb | Pass 1 intermediate results |
| is_pass1_only | boolean | Partial extraction flag |
| evidence_metadata | jsonb | Alias → canonical ID mappings |
| title_state | jsonb | Derived title state (see below) |
| derived_current_rights | jsonb | Computed ownership rights (canonical if confident) |
| model_version | text | LLM model used |
| input_tokens | integer | Total input tokens |
| output_tokens | integer | Total output tokens |
| thinking_tokens | integer | Thinking/reasoning tokens |
| extraction_duration_ms | integer | Processing time in milliseconds |

title_state Structure (tsb_v2)

The title_state column contains the deterministic derivation output from the Title State Builder:

    {
      "version": "tsb_v2",
      "chain_of_title": [
        {
          "entry_type": "acquisition",
          "original_index": 0,
          "inscription_number": "5ta",
          "date_iso": "1977-12-29",
          "data": { /* original acquisition object */ }
        },
        {
          "entry_type": "event",
          "original_index": 0,
          "inscription_number": "7ma",
          "date_iso": null,
          "data": { "event_type": "death", "person": "Santiago Delgado" }
        }
      ],
      "confidence_summary": {
        "status": "confident", // or "needs_review", "unknown"
        "reasons": []
      },
      "review_flags": [
        {
          "issue": "requires_partition_for_transfer_or_mortgage",
          "details": "Patrimonio indiviso: requiere partición...",
          "severity": "warn"
        }
      ],
      "derivation_map": {
        "derived_current_rights": {
          "rule": "rights_derivation_v1",
          "inputs": ["acquisitions", "events", "titulares", "is_patrimonio_indiviso"]
        },
        "chain_of_title": {
          "rule": "build_chain_of_title_v1",
          "inputs": ["acquisitions", "events"]
        }
      }
    }

Chain of Title Entry Types:

| entry_type | Description |
| --- | --- |
| acquisition | Property purchase, inheritance, donation |
| event | Non-acquisition title impacts (death, partition, divorce) |

Event Types:

| event_type | Description |
| --- | --- |
| death | Causante dies, triggers inheritance |
| partition | Heirs divide estate shares |
| declaratoria_de_herederos | Judicial declaration of heirs |
| cuota_viudal_usufructuaria | Widow's usufruct rights |
| divorce | Community property dissolution |
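As an illustration of how a consumer might walk the chain, the sketch below turns each chain_of_title entry into a one-line summary. The helper name summarize_chain is hypothetical, and the sample is a trimmed copy of the example above with the comments removed so that it parses as strict JSON.

```python
import json

# Trimmed copy of the tsb_v2 example, without comments, so json.loads accepts it.
TITLE_STATE = json.loads("""
{
  "version": "tsb_v2",
  "chain_of_title": [
    {"entry_type": "acquisition", "original_index": 0,
     "inscription_number": "5ta", "date_iso": "1977-12-29", "data": {}},
    {"entry_type": "event", "original_index": 0,
     "inscription_number": "7ma", "date_iso": null,
     "data": {"event_type": "death", "person": "Santiago Delgado"}}
  ]
}
""")


def summarize_chain(title_state: dict) -> list[str]:
    """Return one readable line per chain_of_title entry (hypothetical helper)."""
    lines = []
    for entry in title_state.get("chain_of_title", []):
        if entry["entry_type"] == "event":
            # In the example, an event carries its subtype in data.event_type.
            label = f"event:{entry['data'].get('event_type', 'unknown')}"
        else:
            label = entry["entry_type"]
        # date_iso may be null, as in the "7ma" entry of the example.
        date = entry["date_iso"] or "undated"
        lines.append(f"{entry['inscription_number']} {label} ({date})")
    return lines


for line in summarize_chain(TITLE_STATE):
    print(line)
```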

Confidence Status:

| Status | Meaning |
| --- | --- |
| confident | No review flags; canonical rights persisted |
| needs_review | Has warnings but no errors |
| unknown | Has errors or empty candidate rights |
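The status rules in this table can be read as a small decision function. The following is a minimal sketch, not the Title State Builder's actual code: the function name, the candidate_rights parameter, and the "error" severity value are assumptions (only "warn" appears in the example above).

```python
def confidence_status(review_flags: list[dict], candidate_rights: list) -> str:
    """Map review flags and candidate rights to a confidence status (sketch)."""
    severities = {flag.get("severity") for flag in review_flags}
    # "error" is an assumed severity value; the source only shows "warn".
    if "error" in severities or not candidate_rights:
        return "unknown"
    # Any remaining flag is treated as a warning.
    if review_flags:
        return "needs_review"
    return "confident"
```

For example, a single flag with severity "warn" and a non-empty list of candidate rights yields "needs_review".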

evidence_sources

Links extracted fields to source document locations.

| Column | Type | Description |
| --- | --- | --- |
| source_id | uuid | Primary key |
| extraction_id | uuid | FK to extractions |
| case_id | uuid | FK to cases |
| run_id | uuid | Pipeline run ID |
| field_path | text | JSON path (e.g., titulares[0].name) |
| field_value | text | Extracted value |
| doc_id | uuid | FK to documents |
| page_no | integer | Page number |
| line_id | text | OCR line reference |
| bboxes | jsonb | Array of bounding boxes |
| match_method | text | How evidence was matched |
| match_confidence | float | Confidence score (0-1) |
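A field_path such as titulares[0].name can be resolved against an extraction's schema_json with a few lines of code. This is a sketch that assumes paths use only dotted keys and numeric [index] segments, as in the example; resolve_field_path is a hypothetical helper, and the sample data is illustrative.

```python
import re
from typing import Any

# Matches either a key segment ("titulares") or an index segment ("[0]").
_SEGMENT = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def resolve_field_path(data: Any, field_path: str) -> Any:
    """Follow a path like 'titulares[0].name' through nested dicts and lists."""
    current = data
    for key, index in _SEGMENT.findall(field_path):
        # Exactly one of key/index is non-empty for each matched segment.
        current = current[int(index)] if index else current[key]
    return current


schema_json = {"titulares": [{"name": "Santiago Delgado"}]}
print(resolve_field_path(schema_json, "titulares[0].name"))
```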

Match Methods:

| Method | Confidence | Description |
| --- | --- | --- |
| exact | 1.0 | Quote found as exact substring |
| fuzzy_full_line | 0.8 | Normalized text match |
| nearby_exact | 1.0 | Found in adjacent line (±2) |
| nearby_fuzzy | 0.8 | Fuzzy match in adjacent line |
| llm | 0.85 | LLM-based token matching |
| approximate | 0.3 | Best-effort fallback |
| failed | 0.0 | No match found |
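The method-to-confidence mapping above lends itself to a lookup table. The sketch below uses it to decide whether a citation deserves a manual check; the 0.8 threshold and the needs_manual_check helper are assumptions for illustration, not part of the documented pipeline.

```python
# Confidence assigned to each match_method, copied from the table above.
MATCH_CONFIDENCE = {
    "exact": 1.0,
    "fuzzy_full_line": 0.8,
    "nearby_exact": 1.0,
    "nearby_fuzzy": 0.8,
    "llm": 0.85,
    "approximate": 0.3,
    "failed": 0.0,
}


def needs_manual_check(match_method: str, threshold: float = 0.8) -> bool:
    """Flag evidence whose confidence is below a (hypothetical) threshold."""
    # Unknown methods are treated like "failed".
    return MATCH_CONFIDENCE.get(match_method, 0.0) < threshold
```

With this threshold, only approximate and failed matches are flagged.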

reports

Generated title study reports.

| Column | Type | Description |
| --- | --- | --- |
| report_id | uuid | Primary key |
| case_id | uuid | FK to cases |
| run_id | uuid | Pipeline run ID |
| lang | text | Report language |
| report_json | jsonb | Full report structure |
| s3_uri | text | JSON storage path |
| pdf_s3_uri | text | PDF storage path |
| model_version | text | LLM model used |
| report_duration_ms | integer | Report generation time in milliseconds |
| report_input_tokens | integer | Report input tokens |
| report_output_tokens | integer | Report output tokens |

OCR Tables

pages

Page-level OCR results.

| Column | Type | Description |
| --- | --- | --- |
| page_id | uuid | Primary key |
| doc_id | uuid | FK to documents |
| page_no | integer | Page number |
| width | integer | Page width in pixels |
| height | integer | Page height in pixels |
| text | text | Full page text |
| ocr_done_at | timestamptz | OCR completion time |

ocr_tokens

Word-level tokens with bounding boxes.

| Column | Type | Description |
| --- | --- | --- |
| token_id | uuid | Primary key |
| doc_id | uuid | FK to documents |
| page_no | integer | Page number |
| text | text | Token text |
| bbox | jsonb | {x, y, width, height} normalized 0-1 |
| confidence | float | OCR confidence |
| start_index | integer | Text anchor start |
| end_index | integer | Text anchor end |
| line_id | text | Reference to ocr_lines |
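Because bbox values are normalized to the 0-1 range, drawing a highlight requires scaling them by the page's width and height from the pages table. The sketch below assumes x and y mark the box's top-left corner and that each value is a fraction of the page dimension; the source does not state either, so treat both as assumptions. The function name is hypothetical.

```python
def bbox_to_pixels(bbox: dict, page_width: int, page_height: int) -> tuple[int, int, int, int]:
    """Convert a normalized {x, y, width, height} box to (left, top, right, bottom) pixels."""
    # Assumption: (x, y) is the top-left corner, expressed as page fractions.
    left = round(bbox["x"] * page_width)
    top = round(bbox["y"] * page_height)
    right = round((bbox["x"] + bbox["width"]) * page_width)
    bottom = round((bbox["y"] + bbox["height"]) * page_height)
    return left, top, right, bottom


print(bbox_to_pixels({"x": 0.25, "y": 0.5, "width": 0.5, "height": 0.25}, 1000, 2000))
```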

ocr_lines

Line-level text for evidence resolution.

| Column | Type | Description |
| --- | --- | --- |
| line_id | text | Primary key (format: {doc_uuid}-P{page}-L{line:03d}) |
| doc_id | uuid | FK to documents |
| page_no | integer | Page number |
| line_no | integer | Line number on page |
| text | text | Line text |
| start_index | integer | Text anchor start |
| end_index | integer | Text anchor end |
| bbox | jsonb | Line bounding box |
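The line_id format {doc_uuid}-P{page}-L{line:03d} can be built and parsed as sketched below. The helper names are hypothetical; the parser anchors on the trailing -P…-L… suffix because the UUID itself contains hyphens.

```python
import re
import uuid

# The doc UUID contains hyphens, so match the page/line suffix from the end.
_LINE_ID = re.compile(r"^(?P<doc>.+)-P(?P<page>\d+)-L(?P<line>\d+)$")


def make_line_id(doc_id: uuid.UUID, page_no: int, line_no: int) -> str:
    """Build a line_id with a zero-padded, three-digit line number."""
    return f"{doc_id}-P{page_no}-L{line_no:03d}"


def parse_line_id(line_id: str) -> tuple[uuid.UUID, int, int]:
    """Split a line_id back into (doc_id, page_no, line_no)."""
    match = _LINE_ID.match(line_id)
    if match is None:
        raise ValueError(f"not a valid line_id: {line_id!r}")
    return uuid.UUID(match["doc"]), int(match["page"]), int(match["line"])


doc = uuid.UUID("123e4567-e89b-12d3-a456-426614174000")
print(make_line_id(doc, 3, 7))
```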

Pending Documents Tables

pending_presentations

Scraped "documentos presentados" from Karibe.

| Column | Type | Description |
| --- | --- | --- |
| presentation_id | uuid | Primary key |
| asiento_karibe | text | Karibe asiento number |
| finca | text | Property finca number |
| demarcacion_code | text | Registry demarcation |
| doc_kind | text | Document type |
| pdf_url | text | Source PDF URL |
| s3_key | text | Archived PDF path |
| scraped_at | timestamptz | When scraped |

pending_docs_cache

Cache for pending document processing.

| Column | Type | Description |
| --- | --- | --- |
| cache_key | text | Primary key (asiento + hash) |
| case_id | uuid | FK to cases |
| ocr_text | text | Extracted OCR text |
| extraction_json | jsonb | Extracted structured data |
| processed_at | timestamptz | Processing timestamp |

Supporting Tables

case_events

Real-time progress events for UI updates.

| Column | Type | Description |
| --- | --- | --- |
| event_id | uuid | Primary key |
| case_id | uuid | FK to cases |
| run_id | uuid | Pipeline run ID |
| step | text | Pipeline step name |
| message | text | Progress message |
| progress | integer | Percentage (0-100) |
| metadata | jsonb | Additional event data |
| created_at | timestamptz | Event timestamp |

case_shares

Report sharing links.

| Column | Type | Description |
| --- | --- | --- |
| share_id | uuid | Primary key |
| case_id | uuid | FK to cases |
| created_by | uuid | User who created share |
| expires_at | timestamptz | Expiration time |
| revoked_at | timestamptz | Revocation time |
| view_count | integer | Number of views |
| last_viewed_at | timestamptz | Last view time |
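The expires_at and revoked_at columns suggest a simple validity check for a share link. The sketch below is an assumption about how they combine, not documented behavior: it treats a null expires_at as "never expires" and any non-null revoked_at as revoked. The function name is hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def is_share_active(
    expires_at: Optional[datetime],
    revoked_at: Optional[datetime],
    now: Optional[datetime] = None,
) -> bool:
    """Return True if a share link is neither revoked nor expired (sketch)."""
    now = now or datetime.now(timezone.utc)
    # Assumption: any recorded revocation disables the link.
    if revoked_at is not None:
        return False
    # Assumption: a missing expiration means the link does not expire.
    if expires_at is not None and expires_at <= now:
        return False
    return True
```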

embargos

Lien and embargo records from Karibe.

Note: See Embargos System for detailed documentation.

Entity Relationship Diagram

Run ID Tracking

Each pipeline execution is tagged with a run_id (UUID) to enable re-runs without polluting history:

| Table | Column | Description |
| --- | --- | --- |
| cases | current_run_id | The active run for this case |
| extractions | run_id | Which run produced this extraction |
| evidence_sources | run_id | Which run produced these citations |
| reports | run_id | Which run produced this report |
| case_events | run_id | Which run emitted this event |

When re-running a case, a new run_id is generated, and cases.current_run_id is updated. Historical data from previous runs is preserved for auditing.
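The re-run behavior described above can be modeled in memory as follows. This is a sketch using plain dictionaries in place of database rows; start_rerun and rows_for_current_run are hypothetical helpers, and the use of uuid4 for new run IDs is an assumption.

```python
import uuid


def start_rerun(case: dict) -> dict:
    """Return a copy of the case row pointing at a freshly generated run_id."""
    # Assumption: run IDs are random (version 4) UUIDs.
    return {**case, "current_run_id": str(uuid.uuid4())}


def rows_for_current_run(case: dict, rows: list[dict]) -> list[dict]:
    """Keep only rows (extractions, reports, ...) produced by the active run."""
    return [row for row in rows if row["run_id"] == case["current_run_id"]]


case = {"case_id": "c1", "current_run_id": "run-1"}
extractions = [{"extraction_id": "e1", "run_id": "run-1"}]

rerun_case = start_rerun(case)
# The old extraction is still stored, but no longer belongs to the active run.
print(rows_for_current_run(rerun_case, extractions))
```

Filtering by cases.current_run_id rather than deleting old rows is what keeps earlier runs available for auditing.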