Report Generation
The Report Generator produces structured JSON title study reports using a 3-pass architecture: Passes 1 and 2 run in parallel, and Pass 3 then annotates the Pass 2 prose with citation markers.
Lambda: coquititle-report-generator
Model: Configurable per case via cases.report_model (default: Gemini 2.5 Flash)
3-Pass Architecture
| Pass | Purpose | Parallelization |
|---|---|---|
| Pass 1 | Header + boundaries (structured data) | Parallel with Pass 2 |
| Pass 2 | Prose sections (clean, no annotations) | Parallel with Pass 1 |
| Pass 3 | Annotations (*_annotated versions) | Sections annotated in parallel (6 workers) |
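The dependency order in the table can be sketched as follows. This is a minimal illustration, not the Lambda's actual code: `run_pass1`, `run_pass2`, `run_pass3`, and `generate_report` are hypothetical stand-ins for the model-backed passes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the real model-backed passes.
def run_pass1(case):
    return {"header": {"finca_number": case["finca_number"]}, "boundaries": {}}

def run_pass2(case):
    return {"description_prose": "RÚSTICA: ...", "ownership_prose": "Inscrito a favor de ..."}

def run_pass3(prose_sections):
    # Placeholder: returns the prose unchanged under *_annotated keys.
    return {f"{name}_annotated": text for name, text in prose_sections.items()}

def generate_report(case):
    # Passes 1 and 2 do not depend on each other, so they run concurrently.
    with ThreadPoolExecutor(max_workers=2) as executor:
        pass1_future = executor.submit(run_pass1, case)
        pass2_future = executor.submit(run_pass2, case)
        structured, prose = pass1_future.result(), pass2_future.result()
    # Pass 3 consumes the Pass 2 prose, so it starts only after Pass 2 finishes.
    annotated = run_pass3(prose)
    return {**structured, **prose, **annotated}

report = generate_report({"finca_number": "12345"})
```

Because Pass 3 needs the clean prose as input, only Passes 1 and 2 can overlap; the parallelism inside Pass 3 is across sections.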
Pass 1: Header + Boundaries
Generates structured metadata about the property:
{
  "header": {
    "finca_number": "12345",
    "inscription_system": "karibe",
    "demarcacion": "CA0205",
    "registry_section": "II de Caguas"
  },
  "boundaries": {
    "norte": "parcela número 23",
    "sur": "Calle número Cuatro",
    "este": "parcela número 25",
    "oeste": "parcela número 19"
  }
}
Pass 2: Prose Sections
Generates clean prose without annotations:
| Section | Content |
|---|---|
| description_prose | Property description with integrated boundaries |
| origin_prose | Origin of title (segregation history) |
| ownership_prose | Current ownership and acquisition chain |
| encumbrances | Mortgages, liens, servitudes |
| pending_prose | Pending presentations from Karibe |
| observations_prose | Embargos, uncertainties, review flags |
Prose Style Guidelines
The report follows Puerto Rico title study conventions:
"RÚSTICA: Parcela de terreno marcada con el número veintidós (22)
de la comunidad rural denominada Los Pollos situada en el Barrio
de Pollos municipio de Patillas, Puerto Rico, con una cabida de
ochocientos ochenta y nueve metros cuadrados (889.03 m.c.), y en
lindes por el NORTE, con la parcela número veintitrés..."
Key conventions:
- Numbers written in words with digits in parentheses
- Measurements include decimals
- Boundaries integrated into single paragraph
- Professional legal Spanish
Pass 3: Citation Annotations
Adds <span data-field="..."> markers to prose for evidence linking:
Input (from Pass 2)
"Inscrito a favor de Juan Pérez García, casado..."
Output (Pass 3)
"Inscrito a favor de <span data-field="titulares[0].name">Juan Pérez
García</span>, <span data-field="titulares[0].marital_status">casado</span>..."
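To illustrate how the markers support evidence linking, the sketch below pulls (field path, cited text) pairs out of an annotated string using the standard-library `html.parser`. The `SpanCollector` class and `extract_citations` function are hypothetical names introduced for this example, not part of the generator.

```python
from html.parser import HTMLParser

class SpanCollector(HTMLParser):
    """Collects (data-field, text) pairs from flat, non-nested spans."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.pairs = []
        self._field = None   # data-field of the span currently open, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self._field = dict(attrs).get("data-field")
            self._buffer = []

    def handle_data(self, data):
        if self._field is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._field is not None:
            self.pairs.append((self._field, "".join(self._buffer)))
            self._field = None

def extract_citations(annotated):
    parser = SpanCollector()
    parser.feed(annotated)
    parser.close()
    return parser.pairs

annotated = (
    'Inscrito a favor de <span data-field="titulares[0].name">Juan Pérez García</span>, '
    '<span data-field="titulares[0].marital_status">casado</span>...'
)
pairs = extract_citations(annotated)
```

Each pair can then be checked against the extracted data at the given field path.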
Parallel Annotation
Pass 3 processes sections in parallel with 6 workers:
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=6) as executor:
    # Map each future back to the section it annotates
    futures = {
        executor.submit(annotate_section, 'description', ...): 'description',
        executor.submit(annotate_section, 'ownership', ...): 'ownership',
        executor.submit(annotate_section, 'origin', ...): 'origin',
        # ... etc
    }
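The snippet above stops once the futures are submitted. A possible way to collect the results, recording which sections failed so they can fall back to unannotated prose, is sketched below. Here `annotate_section` is a hypothetical stand-in with an assumed `(name, prose)` signature, and `annotate_all` is an illustrative helper.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def annotate_section(name, prose):
    # Hypothetical stand-in for the model call; fails when there is no prose.
    if not prose:
        raise ValueError(f"no prose for section {name!r}")
    return prose  # a real implementation would return prose with <span> markers

def annotate_all(sections, max_workers=6):
    results, failed = {}, []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {
            executor.submit(annotate_section, name, prose): name
            for name, prose in sections.items()
        }
        # Handle each section as soon as it finishes; one failure does not stop the rest.
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception:
                failed.append(name)
    return results, sorted(failed)

results, failed = annotate_all({
    "description": "RÚSTICA: ...",
    "ownership": "Inscrito a favor de ...",
    "origin": "",
})
```

Keying the `futures` dict by future, as in the original snippet, is what lets the loop recover the section name for each completed task.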
Annotation Rules
- Field paths must be exact: titulares[0].name, not owner_name
- Arrays use zero-based indexing: acquisitions[0], acquisitions[1]
- No nested spans: Each span is an atomic fact
- Only data-field attribute: No onclick, style, class, etc.
- Match text exactly: Annotated text must be identical to original
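Several of these rules can be checked mechanically. The following is a minimal sketch of such a check, assuming spans are written exactly as `<span data-field="...">`; `check_annotation` and the problem labels it returns are hypothetical, and the sketch does not verify that a field path exists in the extracted data.

```python
import re

TAG = re.compile(r'</?span\b[^>]*>')
SPAN_OPEN = re.compile(r'<span\b([^>]*)>')
ONLY_DATA_FIELD = re.compile(r'\s+data-field="[^"<>]+"\s*')

def check_annotation(original, annotated):
    problems = []
    # Match text exactly: removing the spans must give back the original prose.
    if TAG.sub("", annotated) != original:
        problems.append("text_mismatch")
    # Only data-field attribute: each opening tag carries exactly that one attribute.
    if any(not ONLY_DATA_FIELD.fullmatch(m.group(1)) for m in SPAN_OPEN.finditer(annotated)):
        problems.append("bad_attributes")
    # No nested spans: nesting depth may never exceed 1 and must end at 0.
    depth, balanced = 0, True
    for m in TAG.finditer(annotated):
        depth += -1 if m.group(0).startswith("</") else 1
        if depth < 0 or depth > 1:
            balanced = False
            break
    if not balanced or depth != 0:
        problems.append("nested_or_unbalanced")
    return problems

original = "Inscrito a favor de Juan Pérez García, casado..."
good = (
    'Inscrito a favor de <span data-field="titulares[0].name">Juan Pérez García</span>, '
    '<span data-field="titulares[0].marital_status">casado</span>...'
)
ok = check_annotation(original, good)
```

A section that returns a non-empty list could be treated like a failed annotation and fall back to the unannotated prose.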
Sections Annotated
| Field | Annotated Version |
|---|---|
| description_prose | description_prose_annotated |
| ownership_prose | ownership_prose_annotated |
| origin_prose | origin_prose_annotated |
| observations_prose | observations_prose_annotated |
| encumbrances[].text | encumbrances[].text_annotated |
Encumbrances Structure
Encumbrances are split into two lists, by_origin and direct, each paired with a *_free boolean flag:
{
  "encumbrances": {
    "by_origin": [
      {
        "text": "Servidumbre a favor de PREPA para líneas eléctricas.",
        "text_annotated": "<span data-field=\"servitudes[0].description\">...</span>"
      }
    ],
    "direct": [
      {
        "text": "Hipoteca a favor de Banco Popular, por $150,000...",
        "text_annotated": "Hipoteca a favor de <span data-field=\"encumbrances[0].holder\">Banco Popular</span>..."
      }
    ],
    "by_origin_free": false,
    "direct_free": false
  }
}
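A consumer of this structure typically needs to walk both lists. The sketch below shows one way to do that, preferring the annotated text when it is present; `iter_encumbrances` is a hypothetical helper, and treating a missing `text_annotated` as "use `text`" is an assumption based on the fallback behavior described under Error Handling.

```python
def iter_encumbrances(report):
    """Yield (group, index, display_text) for every encumbrance entry."""
    encumbrances = report.get("encumbrances", {})
    for group in ("by_origin", "direct"):
        for index, item in enumerate(encumbrances.get(group, [])):
            # Assumption: fall back to plain text when no annotated version exists.
            yield group, index, item.get("text_annotated") or item["text"]

report = {
    "encumbrances": {
        "by_origin": [
            {"text": "Servidumbre a favor de PREPA para líneas eléctricas."}
        ],
        "direct": [
            {
                "text": "Hipoteca a favor de Banco Popular",
                "text_annotated": 'Hipoteca a favor de <span data-field="encumbrances[0].holder">Banco Popular</span>',
            }
        ],
        "by_origin_free": False,
        "direct_free": False,
    }
}
rows = list(iter_encumbrances(report))
```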
Error Handling
Pass 3 Failures
If annotation fails for a section:
- Fall back to unannotated prose
- Add review flag for human review
- Continue with other sections
if pass3_annotations is None:
    # Use prose as annotated (no spans)
    report['description_prose_annotated'] = report.get('description_prose')
    review_flags.append("annotation_pass_failed")
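The snippet above handles only description_prose. A per-section version of the same fallback might look like the sketch below. `apply_annotations` and `ANNOTATED_FIELDS` are hypothetical names, and the flag is written in the issue/details shape used by review_flags elsewhere on this page, which is an assumption about how this failure is reported.

```python
# Prose fields that have an *_annotated counterpart (from "Sections Annotated").
ANNOTATED_FIELDS = (
    "description_prose",
    "ownership_prose",
    "origin_prose",
    "observations_prose",
)

def apply_annotations(report, annotations, review_flags):
    annotations = annotations or {}  # None means the whole pass failed
    for field in ANNOTATED_FIELDS:
        if report.get(field) is None:
            continue  # section not present in this report
        target = f"{field}_annotated"
        annotated = annotations.get(target)
        if annotated:
            report[target] = annotated
        else:
            # Fall back to the unannotated prose and flag it for human review.
            report[target] = report[field]
            review_flags.append({
                "issue": "Annotation pass failed",
                "details": f"{field} is shown without citation spans",
            })
    return report

report = {"description_prose": "RÚSTICA: ...", "ownership_prose": "Inscrito a favor de Juan"}
annotations = {
    "ownership_prose_annotated": 'Inscrito a favor de <span data-field="titulares[0].name">Juan</span>'
}
flags = []
apply_annotations(report, annotations, flags)
```

Handling each section independently keeps one failed annotation from discarding the spans that succeeded for other sections.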
Review Flags
The report includes review_flags for human attention:
{
  "review_flags": [
    {
      "issue": "Missing seller information",
      "details": "Acquisition #2 has no seller name in source documents"
    }
  ]
}
Output Format
Final report JSON structure:
{
  "header": { ... },
  "boundaries": { ... },
  "description_prose": "...",
  "description_prose_annotated": "...<span>...</span>...",
  "origin_prose": "...",
  "origin_prose_annotated": "...",
  "ownership_prose": "...",
  "ownership_prose_annotated": "...",
  "encumbrances": {
    "by_origin": [...],
    "direct": [...],
    "by_origin_free": false,
    "direct_free": false
  },
  "pending_prose": "...",
  "observations_prose": "...",
  "observations_prose_annotated": "...",
  "additional_sections": [...],
  "review_flags": [...]
}
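A quick structural check against this layout can catch incomplete reports before storage. The sketch below assumes every top-level key shown above is required, which the structure itself does not state; `EXPECTED_KEYS` and `missing_keys` are hypothetical names.

```python
# Top-level keys taken from the structure above; treating all as required is an assumption.
EXPECTED_KEYS = {
    "header", "boundaries",
    "description_prose", "description_prose_annotated",
    "origin_prose", "origin_prose_annotated",
    "ownership_prose", "ownership_prose_annotated",
    "encumbrances", "pending_prose",
    "observations_prose", "observations_prose_annotated",
    "additional_sections", "review_flags",
}

def missing_keys(report):
    """Return the expected top-level keys that are absent, in sorted order."""
    return sorted(EXPECTED_KEYS - report.keys())

partial = {"header": {}, "boundaries": {}, "description_prose": "RÚSTICA: ..."}
gaps = missing_keys(partial)
```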
Storage
Reports are stored in:
- reports table: Metadata and report_json column
- S3: Full JSON at s3_uri path
- PDF (optional): Generated PDF at pdf_s3_uri
Performance
| Metric | Value |
|---|---|
| Pass 1+2 parallel time | ~15-20s |
| Pass 3 parallel time | ~10-15s |
| Total generation time | ~30-45s |
| Streaming enabled | Yes (for real-time progress) |
Related Pages
- System Overview - High-level architecture
- Extraction Pipeline - Source data for reports
- Evidence Resolution - Citation validation
- Data Model - reports table schema
- Observability - Langfuse tracing for report generation