Report Generation

The Report Generator produces structured JSON title study reports using a 3-pass architecture: Passes 1 and 2 run in parallel to produce the structured data and the clean prose, and Pass 3 then adds citation annotations to that prose.

Lambda: coquititle-report-generator
Model: Configurable per case via cases.report_model (default: Gemini 2.5 Flash)

3-Pass Architecture

| Pass | Purpose | Parallelization |
|------|---------|-----------------|
| Pass 1 | Header + boundaries (structured data) | Parallel with Pass 2 |
| Pass 2 | Prose sections (clean, no annotations) | Parallel with Pass 1 |
| Pass 3 | Annotations (*_annotated versions) | Sections annotated in parallel (6 workers) |
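
The scheduling in the table above can be sketched as follows. This is a minimal sketch, not the Lambda's actual code: run_pass1, run_pass2, and run_pass3 are hypothetical stand-ins for the real model calls, and the values they return are placeholder data.

```python
from concurrent.futures import ThreadPoolExecutor


def run_pass1(case):
    # Hypothetical stand-in: the real pass would return header + boundaries
    return {"header": {"finca_number": case["finca_number"]}, "boundaries": {}}


def run_pass2(case):
    # Hypothetical stand-in: the real pass would return clean prose sections
    return {"description_prose": "RÚSTICA: ...", "ownership_prose": "Inscrito a favor de ..."}


def run_pass3(prose):
    # Hypothetical stand-in: the real pass would add <span data-field="..."> markers
    return {f"{name}_annotated": text for name, text in prose.items()}


def generate_report(case):
    # Passes 1 and 2 do not depend on each other, so they run concurrently
    with ThreadPoolExecutor(max_workers=2) as executor:
        pass1 = executor.submit(run_pass1, case)
        pass2 = executor.submit(run_pass2, case)
        structured, prose = pass1.result(), pass2.result()
    # Pass 3 needs the Pass 2 prose, so it starts only after Pass 2 has finished
    annotated = run_pass3(prose)
    return {**structured, **prose, **annotated}


report = generate_report({"finca_number": "12345"})
```

Pass 3 is the only step with a data dependency (it annotates the prose from Pass 2), which is why it cannot join the first parallel stage.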

Pass 1: Header + Boundaries

Generates structured metadata about the property:

{
  "header": {
    "finca_number": "12345",
    "inscription_system": "karibe",
    "demarcacion": "CA0205",
    "registry_section": "II de Caguas"
  },
  "boundaries": {
    "norte": "parcela número 23",
    "sur": "Calle número Cuatro",
    "este": "parcela número 25",
    "oeste": "parcela número 19"
  }
}
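
A consumer of the Pass 1 output may want to confirm that all four boundary sides were filled in before using them. The sketch below assumes that every side is required, which the source does not state; missing_boundaries is a hypothetical helper, not part of the generator.

```python
# Assumption: all four sides are expected in every report
REQUIRED_BOUNDARIES = ("norte", "sur", "este", "oeste")


def missing_boundaries(pass1_output):
    """Return the boundary sides that are absent or empty in a Pass 1 result."""
    boundaries = pass1_output.get("boundaries", {})
    return [side for side in REQUIRED_BOUNDARIES if not boundaries.get(side)]


complete = {
    "boundaries": {
        "norte": "parcela número 23",
        "sur": "Calle número Cuatro",
        "este": "parcela número 25",
        "oeste": "parcela número 19",
    }
}
partial = {"boundaries": {"norte": "parcela número 23", "sur": "", "este": "parcela número 25"}}

gaps_complete = missing_boundaries(complete)
gaps_partial = missing_boundaries(partial)
```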

Pass 2: Prose Sections

Generates clean prose without annotations:

| Section | Content |
|---------|---------|
| description_prose | Property description with integrated boundaries |
| origin_prose | Origin of title (segregation history) |
| ownership_prose | Current ownership and acquisition chain |
| encumbrances | Mortgages, liens, servitudes |
| pending_prose | Pending presentations from Karibe |
| observations_prose | Embargos, uncertainties, review flags |

Prose Style Guidelines

The report follows Puerto Rico title study conventions:

"RÚSTICA: Parcela de terreno marcada con el número veintidós (22)
de la comunidad rural denominada Los Pollos situada en el Barrio
de Pollos municipio de Patillas, Puerto Rico, con una cabida de
ochocientos ochenta y nueve metros cuadrados (889.03 m.c.), y en
lindes por el NORTE, con la parcela número veintitrés..."

Key conventions:

  • Numbers written in words with digits in parentheses
  • Measurements include decimals
  • Boundaries integrated into single paragraph
  • Professional legal Spanish
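
The first convention (numbers in words, digits in parentheses) lends itself to a rough lint check on description prose. The sketch below is an assumption-laden illustration, not something the generator is known to run: bare_numbers is a hypothetical helper that reports digit sequences appearing outside parentheses, and it would also flag amounts such as prices, so it is only a hint for reviewers.

```python
import re

# Anything between "(" and the next ")" counts as parenthesized
PARENTHESIZED = re.compile(r"\([^)]*\)")
# Digit runs, optionally with "." or "," separators (e.g. 889.03)
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def bare_numbers(prose):
    """Return numeric tokens that are not wrapped in parentheses."""
    outside = PARENTHESIZED.sub(" ", prose)
    return NUMBER.findall(outside)


follows_convention = bare_numbers(
    "Parcela de terreno marcada con el número veintidós (22), con una cabida "
    "de ochocientos ochenta y nueve metros cuadrados (889.03 m.c.)"
)
breaks_convention = bare_numbers("en lindes por el NORTE, con la parcela número 23")
```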

Pass 3: Citation Annotations

Adds <span data-field="..."> markers to prose for evidence linking:

Input (from Pass 2)

"Inscrito a favor de Juan Pérez García, casado..."

Output (Pass 3)

"Inscrito a favor de <span data-field="titulares[0].name">Juan Pérez
García</span>, <span data-field="titulares[0].marital_status">casado</span>..."

Parallel Annotation

Pass 3 processes sections in parallel with 6 workers:

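from concurrent.futures import ThreadPoolExecutor

# One annotation task per section, with at most 6 running at once
with ThreadPoolExecutor(max_workers=6) as executor:
    # Map each future back to its section name
    futures = {
        executor.submit(annotate_section, 'description', ...): 'description',
        executor.submit(annotate_section, 'ownership', ...): 'ownership',
        executor.submit(annotate_section, 'origin', ...): 'origin',
        # ... etc
    }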
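
Once the futures are submitted, their results still have to be collected per section. The sketch below shows one way to do that with as_completed. The (name, prose) signature for annotate_section is an assumption, since the real arguments are elided above, and fake_annotate is a hypothetical stand-in used only to make the example runnable.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def annotate_all(sections, annotate_section, max_workers=6):
    """Annotate every section in parallel; return (results, failed_section_names)."""
    results, failed = {}, []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {
            executor.submit(annotate_section, name, prose): name
            for name, prose in sections.items()
        }
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception:
                # One failing section must not stop the others
                failed.append(name)
    return results, sorted(failed)


def fake_annotate(name, prose):
    # Hypothetical stand-in for the real annotation call
    if name == "origin":
        raise RuntimeError("simulated failure")
    return prose.upper()


results, failed = annotate_all(
    {"description": "a", "ownership": "b", "origin": "c"}, fake_annotate
)
```

Keeping the future-to-name mapping is what lets a failure be attributed to a specific section rather than to the whole pass.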

Annotation Rules

  1. Field paths must be exact: titulares[0].name, not owner_name
  2. Arrays use zero-based indexing: acquisitions[0], acquisitions[1]
  3. No nested spans: Each span is an atomic fact
  4. Only data-field attribute: No onclick, style, class, etc.
  5. Match text exactly: Annotated text must be identical to original
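
Rules 3 to 5 can be checked mechanically. The sketch below is a hypothetical validator, not the generator's own code: annotation_errors strips the span tags and compares the result with the original prose, rejects any attribute other than data-field, and rejects a span opened inside another span. Rules 1 and 2 are not covered, because checking field paths requires the extracted data itself.

```python
import re

SPAN_OPEN = re.compile(r"<span\b([^>]*)>")
ANY_SPAN_TAG = re.compile(r"</?span\b[^>]*>")
# Exactly one attribute, and it must be data-field="..."
ONLY_DATA_FIELD = re.compile(r'\s+data-field="[^"]+"\s*')


def annotation_errors(original, annotated):
    """Return a list of rule violations (empty when the annotation is valid)."""
    errors = []
    # Rule 5: removing the span tags must give back the original text
    if ANY_SPAN_TAG.sub("", annotated) != original:
        errors.append("text differs from original")
    # Rule 4: each opening tag carries only a data-field attribute
    for attrs in SPAN_OPEN.findall(annotated):
        if not ONLY_DATA_FIELD.fullmatch(attrs):
            errors.append(f"unexpected attributes: {attrs.strip()}")
    # Rule 3: a span may not open while another one is still open
    depth = 0
    for tag in ANY_SPAN_TAG.findall(annotated):
        depth += -1 if tag.startswith("</") else 1
        if depth > 1:
            errors.append("nested span")
            break
    return errors


original = "Inscrito a favor de Juan Pérez García, casado..."
annotated = (
    'Inscrito a favor de <span data-field="titulares[0].name">Juan Pérez García</span>, '
    '<span data-field="titulares[0].marital_status">casado</span>...'
)
valid_result = annotation_errors(original, annotated)
nested_result = annotation_errors(
    "Juan Pérez",
    '<span data-field="a">Juan <span data-field="b">Pérez</span></span>',
)
```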

Sections Annotated

| Field | Annotated Version |
|-------|-------------------|
| description_prose | description_prose_annotated |
| ownership_prose | ownership_prose_annotated |
| origin_prose | origin_prose_annotated |
| observations_prose | observations_prose_annotated |
| encumbrances[].text | encumbrances[].text_annotated |

Encumbrances Structure

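Encumbrances are split into two groups, by_origin and direct, and each group has a matching *_free flag (by_origin_free, direct_free):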

{
  "encumbrances": {
    "by_origin": [
      {
        "text": "Servidumbre a favor de PREPA para líneas eléctricas.",
        "text_annotated": "<span data-field=\"servitudes[0].description\">...</span>"
      }
    ],
    "direct": [
      {
        "text": "Hipoteca a favor de Banco Popular, por $150,000...",
        "text_annotated": "Hipoteca a favor de <span data-field=\"encumbrances[0].holder\">Banco Popular</span>..."
      }
    ],
    "by_origin_free": false,
    "direct_free": false
  }
}
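
Because the entries live in two separate lists, a consumer typically walks both. The sketch below is a hypothetical reader, not part of the generator: iter_encumbrances yields each entry with its group name and prefers text_annotated when it is present, falling back to the plain text otherwise.

```python
def iter_encumbrances(report):
    """Yield (group, text) pairs for every encumbrance in the report."""
    groups = report.get("encumbrances", {})
    for group in ("by_origin", "direct"):
        for item in groups.get(group, []):
            # Prefer the annotated text, fall back to the plain text
            yield group, item.get("text_annotated") or item["text"]


sample = {
    "encumbrances": {
        "by_origin": [
            {"text": "Servidumbre a favor de PREPA para líneas eléctricas."}
        ],
        "direct": [
            {
                "text": "Hipoteca a favor de Banco Popular, por $150,000...",
                "text_annotated": 'Hipoteca a favor de <span data-field="encumbrances[0].holder">Banco Popular</span>...',
            }
        ],
        "by_origin_free": False,
        "direct_free": False,
    }
}

entries = list(iter_encumbrances(sample))
```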

Error Handling

Pass 3 Failures

If annotation fails for a section:

  1. Fall back to unannotated prose
  2. Add review flag for human review
  3. Continue with other sections

if pass3_annotations is None:
    # Pass 3 failed: reuse the clean prose as the annotated version (no spans)
    report['description_prose_annotated'] = report.get('description_prose')
    # Flag the report so a human reviews it
    review_flags.append("annotation_pass_failed")
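
The three fallback steps can be applied per section rather than to the whole pass. The sketch below is one possible shape, under stated assumptions: apply_annotations is a hypothetical helper, annotations maps each section name to its annotated text or to None on failure, and the flag is recorded in the issue/details form shown under Review Flags, with illustrative wording.

```python
ANNOTATED_SECTIONS = (
    "description_prose",
    "ownership_prose",
    "origin_prose",
    "observations_prose",
)


def apply_annotations(report, annotations):
    """Copy annotations into the report, falling back to clean prose on failure."""
    flags = report.setdefault("review_flags", [])
    for section in ANNOTATED_SECTIONS:
        annotated = annotations.get(section)
        if annotated is None:
            # Step 1: fall back to the unannotated prose
            report[f"{section}_annotated"] = report.get(section)
            # Step 2: flag the section for human review (wording is illustrative)
            flags.append({
                "issue": "annotation_pass_failed",
                "details": f"{section} has no annotations; using clean prose",
            })
        else:
            report[f"{section}_annotated"] = annotated
        # Step 3: the loop simply continues with the next section
    return report


report = apply_annotations(
    {"description_prose": "RÚSTICA: ...", "ownership_prose": "Inscrito a favor de ..."},
    {"description_prose": '<span data-field="x">RÚSTICA</span>: ...', "ownership_prose": None},
)
```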

Review Flags

The report includes a review_flags array listing issues that need human attention:

{
  "review_flags": [
    {
      "issue": "Missing seller information",
      "details": "Acquisition #2 has no seller name in source documents"
    }
  ]
}

Output Format

Final report JSON structure:

{
  "header": { ... },
  "boundaries": { ... },
  "description_prose": "...",
  "description_prose_annotated": "...<span>...</span>...",
  "origin_prose": "...",
  "origin_prose_annotated": "...",
  "ownership_prose": "...",
  "ownership_prose_annotated": "...",
  "encumbrances": {
    "by_origin": [...],
    "direct": [...],
    "by_origin_free": false,
    "direct_free": false
  },
  "pending_prose": "...",
  "observations_prose": "...",
  "observations_prose_annotated": "...",
  "additional_sections": [...],
  "review_flags": [...]
}
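
A quick structural check against this layout can catch a truncated or partial report before it is stored. The sketch below assumes that every top-level key listed above is always present, which the source does not guarantee; missing_keys is a hypothetical helper.

```python
# Top-level keys taken from the structure above
EXPECTED_KEYS = {
    "header", "boundaries",
    "description_prose", "description_prose_annotated",
    "origin_prose", "origin_prose_annotated",
    "ownership_prose", "ownership_prose_annotated",
    "encumbrances", "pending_prose",
    "observations_prose", "observations_prose_annotated",
    "additional_sections", "review_flags",
}


def missing_keys(report):
    """Return the expected top-level keys that the report lacks, sorted by name."""
    return sorted(EXPECTED_KEYS - set(report))


full_report = {key: None for key in EXPECTED_KEYS}
partial_report = {key: None for key in EXPECTED_KEYS if key != "pending_prose"}

full_gaps = missing_keys(full_report)
partial_gaps = missing_keys(partial_report)
```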

Storage

Reports are stored in:

  • reports table: Metadata and report_json column
  • S3: Full JSON at s3_uri path
  • PDF (optional): Generated PDF at pdf_s3_uri

Performance

| Metric | Value |
|--------|-------|
| Pass 1+2 parallel time | ~15-20s |
| Pass 3 parallel time | ~10-15s |
| Total generation time | ~30-45s |
| Streaming enabled | Yes (for real-time progress) |