AI-Powered Technical Report Generation with Multi-Source Document Analysis
As technical lead and AI architect for a civil engineering firm, I designed and built an end-to-end technical report generation system combining multi-source document ingestion, LangGraph-based AI agents with structured output, and human-in-the-loop verification. In a client-facing role leading an interdisciplinary team of frontend, backend, and product professionals, I architected both the AI pipeline and the full-stack infrastructure to automate report creation from specialized document types while maintaining quality control through integrated review workflows.
The Challenge
A civil engineering firm conducting site assessments needed to process diverse document types and synthesize them into comprehensive technical reports. The manual process was:
- Time-intensive with engineers spending days compiling reports from multiple sources
- Inconsistent in structure and detail across different report authors
- Difficult to maintain quality control with hundreds of data points per report
- Fragmented with no centralized system for document storage and report versioning
- Error-prone due to manual data entry and cross-referencing between sources
They needed:
- Automated document parsing for specialized document formats
- AI-powered report section generation using parsed data as context
- Human verification points throughout ingestion and generation workflows
- Structured data extraction from unstructured documents
- Full-stack web application for document management and report editing
- Word document export maintaining professional formatting
System Architecture
I designed a three-phase architecture separating ingestion, generation, and export with human verification integrated throughout:
Document Ingestion Pipeline:
Multi-Format Document Parsing:
We designed specialized parsers handling diverse data sources across four categories:
- Historical Data: Temporal records, archival documents, historical imagery with year-over-year change detection
- Spatial Data: Geographic information, topographic features, land use mapping with coordinate extraction
- Regulatory Data: External database records, agency filings, compliance documents with structured field extraction
- Field Observations: Site inspection checklists, assessment questionnaires, observation logs with standardized formatting
Each parser implements format-specific extraction logic (PDF tables, image metadata, structured forms, unstructured text) while emitting a unified data schema for downstream processing.
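The unified-schema idea can be sketched as a shared result type that every category parser emits. This is a minimal illustration, not the production code; the class and field names (`ParsedDocument`, `FieldObservationParser`, etc.) are assumptions for the example:

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class ParsedDocument:
    """Unified schema emitted by every category parser (field names illustrative)."""
    source_file_id: str
    category: str                              # "historical" | "spatial" | "regulatory" | "field"
    pages: list = field(default_factory=list)  # [{"page_number": int, "text": str, ...}]
    metadata: dict = field(default_factory=dict)


class DocumentParser(Protocol):
    """Interface each format-specific parser satisfies."""
    category: str

    def parse(self, raw_bytes: bytes, file_id: str) -> ParsedDocument: ...


class FieldObservationParser:
    """One concrete parser: standardized site-inspection checklists."""
    category = "field"

    def parse(self, raw_bytes: bytes, file_id: str) -> ParsedDocument:
        # Real parsers apply format-specific logic (PDF tables, image
        # metadata, form fields); a checklist is plain text here.
        text = raw_bytes.decode("utf-8", errors="replace")
        return ParsedDocument(
            source_file_id=file_id,
            category=self.category,
            pages=[{"page_number": 1, "text": text}],
        )
```

Because every parser returns the same shape, the postprocessing pipelines and the AI context loaders downstream never need to know which format a document arrived in.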
Background Processing Architecture:
- Dramatiq task queue with Redis broker for async document processing
- Category-specific postprocessing pipelines triggered after parsing
- Configurable retry logic
- File status tracking (Pending → Processing → Done/Failed) in PostgreSQL
- Structured logging
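The status lifecycle and retry behavior can be shown as a small, library-free sketch. In production this logic runs inside a Dramatiq actor with the status persisted to PostgreSQL; the function and enum names here are illustrative:

```python
from enum import Enum


class FileStatus(Enum):
    """Lifecycle tracked per file in PostgreSQL."""
    PENDING = "pending"
    PROCESSING = "processing"
    DONE = "done"
    FAILED = "failed"


def process_with_retries(parse_fn, raw_bytes, max_retries=3):
    """Worker-side flow: attempt the parse up to max_retries times and
    land on Done or Failed. Category-specific postprocessing is triggered
    only after a Done result."""
    for _attempt in range(max_retries):
        try:
            return FileStatus.DONE, parse_fn(raw_bytes)
        except Exception:
            continue  # transient failure: retry
    return FileStatus.FAILED, None
```

Keeping the transition logic in one place makes the retry budget configurable per parser without touching queue plumbing.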
Data Storage Strategy:
- PostgreSQL: Report metadata, file tracking, user sessions, relationships, extracted findings, AI-generated sections with versioning
- MongoDB: Parsed document content
- Separation enables fast metadata queries while storing flexible document structures
AI-Powered Report Generation:
LangGraph Multi-Agent Architecture:
I implemented section-specific generation strategies ranging from simple to complex:
Simple Section Agents:
- Fetch context from databases via specialized loaders
- Retrieve prompts dynamically from Langfuse
- Single LLM invocation with GPT
- Return generated content with provenance tracking (file IDs, page numbers)
- Handle user feedback for iterative regeneration
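The simple-agent flow above can be sketched with its dependencies injected, which is also how it stays testable without live services. All names (`generate_simple_section`, the finding-dict keys) are illustrative assumptions, not the production API:

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class GeneratedSection:
    text: str
    source_file_ids: list  # provenance: which files the context came from
    page_numbers: list     # provenance: which pages


def generate_simple_section(
    section_name: str,
    load_context: Callable,       # specialized DB loader -> list of finding dicts
    get_prompt: Callable,         # Langfuse fetch (with local fallback)
    llm: Callable,                # single LLM invocation
    feedback: Optional[str] = None,  # user feedback for iterative regeneration
) -> GeneratedSection:
    findings = load_context(section_name)
    prompt = get_prompt(section_name).format(
        context="\n".join(f["detail"] for f in findings),
        feedback=feedback or "none",
    )
    return GeneratedSection(
        text=llm(prompt),
        source_file_ids=sorted({f["file_id"] for f in findings}),
        page_numbers=sorted({f["page"] for f in findings}),
    )
```

Passing the loader, prompt store, and LLM as callables keeps the agent a pure function of its inputs, so regeneration with feedback is just another call.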
Complex Section Agents (LangGraph StateGraph):
For complex sections, I designed a parallel extraction workflow:
Parallel Data Extraction Nodes:
- Multiple extraction nodes running concurrently, each processing a different data source category
- Each node receives raw documents and extracts structured findings using specialized prompts
- Structured output enforced via Pydantic models preventing hallucination
Structured Output via Pydantic Models: each finding type is defined with a Pydantic schema enforcing:
- Required fields (source, location, time_period, category, detail)
- Field descriptions guiding LLM output
- Type validation preventing malformed data
- Nested lists for multiple findings per document
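A minimal sketch of such a schema, assuming Pydantic v2-style models (the exact field set mirrors the list above; the class names are illustrative):

```python
from pydantic import BaseModel, Field


class Finding(BaseModel):
    """One structured finding; the Field descriptions are passed to the
    LLM via structured output and steer what it writes into each slot."""
    source: str = Field(description="Originating document or database record")
    location: str = Field(description="Site feature or geographic reference")
    time_period: str = Field(description="Year or date range the finding covers")
    category: str = Field(description="Historical, spatial, regulatory, or field")
    detail: str = Field(description="One-sentence factual statement")


class DocumentFindings(BaseModel):
    """Nested list: a single document can yield multiple findings."""
    findings: list[Finding]
```

Binding the extraction call to `DocumentFindings` means a response missing a required field fails validation instead of silently entering the report context.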
Section Writer Node:
- Receives all extracted findings as context from parallel extraction branches
- Retrieves section-specific requirements from prompt templates
- Generates cohesive narrative synthesizing all data sources
- Maintains factual accuracy by referencing structured findings
Graph Structure:
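The fan-out/fan-in shape of the graph can be sketched without the library: one extraction node per data-source category runs concurrently, and the writer node fans their findings back in. In production this is a LangGraph StateGraph; the thread-pool version below only illustrates the topology, and all function names are assumptions:

```python
from concurrent.futures import ThreadPoolExecutor


def run_section_graph(documents_by_category, extract_fns, write_section):
    """Fan out one extraction node per category, then fan in to a single
    writer node; this mirrors the shape of the production StateGraph."""
    with ThreadPoolExecutor() as pool:
        # Parallel branches: each extractor sees only its own category's documents.
        futures = {
            category: pool.submit(extract_fns[category], docs)
            for category, docs in documents_by_category.items()
        }
        # Join point: the writer receives all structured findings at once.
        findings = {category: fut.result() for category, fut in futures.items()}
    return write_section(findings)
```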
Dynamic Prompt Management:
- Langfuse integration enabling product managers to edit prompts via web UI
- Automatic prompt versioning and rollback capability
- Fallback to local prompts if Langfuse unavailable
- Migration system deploying prompts across all report sections
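The fallback behavior can be sketched as a thin wrapper around the prompt fetch. The Langfuse Python SDK exposes a `get_prompt(name)` call returning an object with the prompt text; the wrapper name and error handling here are illustrative assumptions:

```python
def get_section_prompt(name, langfuse_client=None, local_prompts=None):
    """Fetch the latest prompt version from Langfuse; fall back to the
    bundled local copy when the service is unreachable, so generation
    never blocks on the prompt store."""
    if langfuse_client is not None:
        try:
            return langfuse_client.get_prompt(name).prompt
        except Exception:
            pass  # Langfuse unavailable: use the local fallback
    return (local_prompts or {})[name]
```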
Provenance Tracking:
- Every generated section linked to source file ID and page number
- Frontend displays provenance for audit trail and verification
- Extracted from MongoDB document metadata during context loading
Human-in-the-Loop Verification:
Frontend:
- Report Dashboard: Upload files, view status, manage report lifecycle
- File Management: Category assignment, parsing status, content summaries
- Findings Review: Tabular display of extracted findings grouped by classification
- Section Editor: Rich text editing for AI-generated content refinement
- Assessment Questionnaire: Structured data collection complementing automated parsing
Verification Points:
- File Upload: User assigns document category guiding parser selection
- Post-Parse Review: User validates extracted findings and content summaries
- Pre-Generation Review: User confirms context data before section generation
- Post-Generation Edit: User refines AI-generated prose in rich text editor
- Iterative Refinement: User provides feedback triggering regeneration with feedback context
Export & Deployment:
Word Document Export:
- Template engine maintaining professional report format
- Programmatic generation of tables, headings, and formatted text
Infrastructure:
- Docker containerization for backend, frontend, databases, monitoring
- GitHub Actions CI/CD deploying to GitHub Container Registry
- OAuth authentication (Google + Microsoft)
- Monitoring: Phoenix and Langfuse for AI observability, Prometheus for metrics collection, Grafana for dashboards
Technical Leadership & Architecture Decisions
System Architecture Design:
- Designed full-stack architecture: FastAPI backend, frontend, multi-database strategy
- Chose document-oriented (MongoDB) vs relational (PostgreSQL) storage based on data characteristics
- Architected background job system with Dramatiq for scalable async processing
- Designed category-specific parsing + postprocessing plugin architecture for extensibility
AI Architecture Design:
- Evaluated LangGraph vs simple LLM calls, chose hybrid approach based on section complexity
- Implemented structured output strategy using Pydantic models to prevent hallucination
- Designed parallel extraction pattern for multi-source data integration
- Integrated Langfuse for prompt management enabling non-technical stakeholders to refine AI behavior
Team Leadership:
- Led an interdisciplinary team of frontend developers, backend developers, and product owners
- Client-facing technical role supporting product owners in stakeholder communication and translating requirements into architecture
- Established development workflows, code review processes, and testing standards
- Bridged technical implementation and business requirements to align AI capabilities with user needs
Key Decisions:
- Langfuse for prompts: Enabled rapid iteration without code deployments
- Pydantic structured output: Eliminated JSON parsing errors and hallucinated fields
- Dual-database strategy: MongoDB for flexible document storage, PostgreSQL for structured metadata and relationships
- LangGraph for complex sections: Enabled sophisticated multi-step reasoning with state management
- Human-in-the-loop workflow: Balanced automation with quality control for high-stakes reports
Results & Impact
- Deployed production system automating technical report generation for civil engineering firm
- Reduced report compilation time from days to hours through automated parsing and generation
- Architected specialized parser system handling diverse document formats
- Built LangGraph agents with structured output ensuring consistent, high-quality report sections
- Enabled non-technical users to manage section-specific AI prompts via Langfuse web interface
- Established human verification workflow maintaining quality control while accelerating delivery
- Architected scalable infrastructure supporting concurrent document processing and report generation
- Led cross-functional team delivering full-stack application from requirements to deployment
Technologies
Python, FastAPI, LangChain, LangGraph, OpenAI, GPT, Pydantic, Dramatiq, PostgreSQL, MongoDB, Redis, Langfuse, OpenTelemetry, Phoenix, Grafana, Prometheus, Docker, OAuth