AI Financial Data Analyst with Defense-in-Depth Security
For a fintech company with strict data isolation requirements, I designed and implemented an AI-powered financial data analyst enabling natural language queries over sensitive multi-tenant data. The system combines AWS Bedrock Claude Sonnet 4 with a 5-layer defense-in-depth security architecture, leveraging PostgreSQL Row-Level Security to ensure users can only access data for their authorized business entities while maintaining audit trails and preventing data modification.
The Challenge
Business stakeholders needed self-service access to financial analytics but faced barriers:
- SQL expertise gap: Non-technical users unable to write complex analytical queries
- Multi-tenant data: Strict requirement for business-level data isolation
- Security concerns: AI system must prevent unauthorized data access or modification
- Sensitive financial data: Cannot risk data leakage between business entities
- Performance: Ad-hoc queries must not impact production databases
They needed:
- Natural language interface for non-technical users to query financial data
- Multi-tenant security guaranteeing business-level data isolation
- Defense-in-depth architecture preventing data modification or unauthorized access
- Integration with existing Multi-Source Financial Data Platform with Row-Level Security
- Visualization capabilities for generated insights
System Architecture
I designed a 5-layer defense-in-depth security architecture combining database-level controls, application-level validation, and AI-specific safeguards:
Layer 1: PostgreSQL Row-Level Security (RLS)
Leveraging Existing RLS Infrastructure:
The AI agent leverages RLS policies deployed via DBT macros in the data platform:
- Permission table: Maps user emails to authorized business entities
- RLS-enabled analytical tables: Transaction summaries, balance snapshots, and detailed transaction history
- Session-scoped isolation: Each database connection sets a user session variable for filtering
- Database-kernel enforcement: RLS policies filter queries at PostgreSQL kernel level
Security Properties:
- Even if AI generates malicious SQL, RLS restricts results to authorized businesses
- Application compromise cannot bypass RLS (database-enforced)
- Session variables are connection-scoped preventing cross-user data leakage
Layer 2: Read-Only Database Role
Dedicated AI Agent Role:
I created a dedicated PostgreSQL role for the AI agent with minimal privileges. The role has explicit SELECT grants only on approved analytical tables, with no INSERT, UPDATE, DELETE, or DDL capabilities. As a non-superuser role without BYPASSRLS privilege, all RLS policies apply automatically.
Security Properties:
- Cannot modify data (no write privileges)
- Cannot alter schema (no DDL)
- Cannot escalate privileges
- RLS policies enforced automatically
Layer 3: Read-Only Transaction Enforcement
Connection-Level Read-Only Mode:
I implemented SQLAlchemy connection event handlers that execute on every database connection. The handlers set the transaction to read-only mode and configure the user session variable for RLS filtering. This connection-level enforcement happens transparently before any application code executes.
Security Properties:
- Even if SQL injection bypasses application validation, database rejects modifications
- Protection against LLM-generated malicious queries
- Cannot be disabled by application code (set at connection level)
Layer 4: Query Validation
Application-Level SQL Command Filtering:
I implemented query validation scanning generated SQL for forbidden commands before execution. The validator checks for any write operations (INSERT, UPDATE, DELETE), schema modifications (DROP, CREATE, ALTER, TRUNCATE), or privilege changes (GRANT, REVOKE), rejecting queries containing these commands.
Security Properties:
- Fail-fast validation before sending to database
- Protects against LLM hallucinations generating modification queries
- Provides clear error messages to users
- Logs attempted violations for security monitoring
Layer 5: Table Allowlist
Explicit Table Access Control:
I configured an allowlist restricting AI access to explicitly approved analytical tables. The validator parses generated SQL to extract all table references, ensuring every table is on the approved list. Queries attempting to access non-approved tables are rejected before execution.
Security Properties:
- Prevents access to internal system tables
- Protects sensitive configuration or user management tables
- Enables gradual rollout (approve tables incrementally)
- Defense against SQL injection attempting to access unauthorized tables
AI Agent Architecture:
LangGraph SQL Agent:
I designed a multi-step reasoning workflow using LangGraph:
Agent Capabilities:
Natural Language Understanding:
- AWS Bedrock Claude Sonnet 4 for query interpretation
- Handles ambiguous business terminology
- Infers date ranges and filtering logic from context
Schema-Aware SQL Generation:
- LangChain SQL database toolkit with schema introspection
- Generates JOIN queries across multiple tables
- Applies business logic (balances, aggregations, running totals)
- Handles complex analytical queries (window functions, CTEs)
Result Analysis:
- Interprets query results in business context
- Generates natural language summaries
- Identifies trends and anomalies
Visualization Generation:
- Plotly chart generation for numeric results
- Supports bar charts, line charts, and tables
- Automatic chart type selection based on data characteristics
State Management:
The LangGraph agent maintains state across workflow steps, tracking conversation messages, user identity, generated SQL, query results, visualizations, and error context for debugging.
User Interface:
Chainlit Chat Interface:
- Conversational UI: Chat-based interaction with message history
- Authentication integration: SSO user email extraction for RLS
- Streaming responses: Token-by-token LLM output for responsiveness
- Embedded visualizations: Plotly charts rendered inline
- Query history: Past queries accessible for reference
- Error handling: User-friendly error messages for failed queries
Technical Implementation
AWS Bedrock Integration:
The system uses AWS Bedrock Claude Sonnet 4 via LangChain integration, configured with zero temperature for deterministic SQL generation. This ensures consistent query generation for the same natural language inputs, reducing unpredictability in production.
Database Connection Management:
- SQLAlchemy connection pooling with security event listeners
- Connection-level enforcement of read-only mode and user session variables
- Automatic reconnection with security context preservation
- Connection and query timeout limits preventing resource exhaustion
Security Testing:
I implemented comprehensive security testing:
- RLS validation: Verify users only see authorized business data
- Permission denial tests: Confirm unauthorized table access blocked
- Modification attempt tests: Verify INSERT/UPDATE/DELETE rejected
- SQL injection tests: Confirm malicious queries caught by validation layers
- Cross-tenant tests: Verify user A cannot access user B’s businesses
Key Design Decisions
5-layer defense-in-depth:
- No single point of failure - multiple independent security controls
- Each layer provides defense even if others fail
- RLS as primary control (database-enforced, cannot be bypassed)
- Application layers (validation, allowlist) provide fail-fast and observability
- Read-only transaction enforced at connection level (transparent to application)
LangGraph for multi-step reasoning:
- Enables sophisticated query planning with schema introspection
- Supports iterative SQL refinement on validation failures
- Maintains conversation context for follow-up questions
- Separates concerns (understanding, generation, validation, execution)
AWS Bedrock Claude Sonnet 4:
- Strong reasoning capabilities for complex analytical queries
- Excellent SQL generation quality reducing validation failures
- Supports long context for schema documentation
- Temperature 0 for deterministic query generation
Session-scoped RLS enforcement:
- User email set once per connection at connection time
- Session variables connection-scoped (no cross-user leakage)
- Leverages existing RLS infrastructure from data platform
- Transparent to application code (database-enforced)
Results & Impact
- Deployed AI-powered financial data analyst enabling natural language queries for non-technical users
- Implemented 5-layer defense-in-depth security architecture guaranteeing multi-tenant data isolation
- Leveraged existing Multi-Source Financial Data Platform with Row-Level Security.
- Built LangGraph agent with schema-aware SQL generation handling complex analytical queries
- Integrated Chainlit chat interface with SSO authentication and embedded visualizations
- Achieved zero data breaches or cross-tenant access incidents through layered security controls
- Enabled self-service analytics reducing manual reporting burden
Technologies
Python, LangGraph, LangChain, AWS Bedrock, Anthropic, Claude Sonnet 4, PostgreSQL, Row-Level Security (RLS), SQLAlchemy, Chainlit, Docker, Pydantic