AI Financial Data Analyst with Defense-in-Depth Security

For a fintech company with strict data isolation requirements, I designed and implemented an AI-powered financial data analyst enabling natural language queries over sensitive multi-tenant data. The system combines AWS Bedrock Claude Sonnet 4 with a 5-layer defense-in-depth security architecture, leveraging PostgreSQL Row-Level Security to ensure users can only access data for their authorized business entities while maintaining audit trails and preventing data modification.

The Challenge

Business stakeholders needed self-service access to financial analytics but faced barriers:

SQL expertise gap: Non-technical users unable to write complex analytical queries
Multi-tenant data: Strict requirement for business-level data isolation
Security concerns: AI system must prevent unauthorized data access or modification
Sensitive financial data: Cannot risk data leakage between business entities
Performance: Ad-hoc queries must not impact production databases

They needed:

Natural language interface for non-technical users to query financial data
Multi-tenant security guaranteeing business-level data isolation
Defense-in-depth architecture preventing data modification or unauthorized access
Integration with existing Multi-Source Financial Data Platform with Row-Level Security
Visualization capabilities for generated insights

System Architecture

I designed a 5-layer defense-in-depth security architecture combining database-level controls, application-level validation, and AI-specific safeguards:

Layer 1: PostgreSQL Row-Level Security (RLS)

Leveraging Existing RLS Infrastructure:

The AI agent leverages RLS policies deployed via DBT macros in the data platform:

Permission table: Maps user emails to authorized business entities
RLS-enabled analytical tables: Transaction summaries, balance snapshots, and detailed transaction history
Session-scoped isolation: Each database connection sets a user session variable for filtering
Database-kernel enforcement: RLS policies filter queries at PostgreSQL kernel level

Security Properties:

Even if AI generates malicious SQL, RLS restricts results to authorized businesses
Application compromise cannot bypass RLS (database-enforced)
Session variables are connection-scoped preventing cross-user data leakage

Layer 2: Read-Only Database Role

Dedicated AI Agent Role:

I created a dedicated PostgreSQL role for the AI agent with minimal privileges. The role has explicit SELECT grants only on approved analytical tables, with no INSERT, UPDATE, DELETE, or DDL capabilities. As a non-superuser role without BYPASSRLS privilege, all RLS policies apply automatically.

Security Properties:

Cannot modify data (no write privileges)
Cannot alter schema (no DDL)
Cannot escalate privileges
RLS policies enforced automatically

Layer 3: Read-Only Transaction Enforcement

Connection-Level Read-Only Mode:

I implemented SQLAlchemy connection event handlers that execute on every database connection. The handlers set the transaction to read-only mode and configure the user session variable for RLS filtering. This connection-level enforcement happens transparently before any application code executes.

Security Properties:

Even if SQL injection bypasses application validation, database rejects modifications
Protection against LLM-generated malicious queries
Cannot be disabled by application code (set at connection level)

Layer 4: Query Validation

Application-Level SQL Command Filtering:

I implemented query validation scanning generated SQL for forbidden commands before execution. The validator checks for any write operations (INSERT, UPDATE, DELETE), schema modifications (DROP, CREATE, ALTER, TRUNCATE), or privilege changes (GRANT, REVOKE), rejecting queries containing these commands.

Security Properties:

Fail-fast validation before sending to database
Protects against LLM hallucinations generating modification queries
Provides clear error messages to users
Logs attempted violations for security monitoring

Layer 5: Table Allowlist

Explicit Table Access Control:

I configured an allowlist restricting AI access to explicitly approved analytical tables. The validator parses generated SQL to extract all table references, ensuring every table is on the approved list. Queries attempting to access non-approved tables are rejected before execution.

Security Properties:

Prevents access to internal system tables
Protects sensitive configuration or user management tables
Enables gradual rollout (approve tables incrementally)
Defense against SQL injection attempting to access unauthorized tables

AI Agent Architecture:

LangGraph SQL Agent:

I designed a multi-step reasoning workflow using LangGraph:

graph TD START((START)) --> understand_question[Understand Question] understand_question --> check_schema[Retrieve Schema Info] check_schema --> generate_sql[Generate SQL Query] generate_sql --> validate[Validate Query] validate -->|Invalid| generate_sql validate -->|Valid| execute[Execute Query] execute --> analyze[Analyze Results] analyze --> needs_viz{Needs Visualization?} needs_viz -->|Yes| create_viz[Create Chart] needs_viz -->|No| format_response[Format Response] create_viz --> format_response format_response --> END((END))

Agent Capabilities:

Natural Language Understanding:

AWS Bedrock Claude Sonnet 4 for query interpretation
Handles ambiguous business terminology
Infers date ranges and filtering logic from context

Schema-Aware SQL Generation:

LangChain SQL database toolkit with schema introspection
Generates JOIN queries across multiple tables
Applies business logic (balances, aggregations, running totals)
Handles complex analytical queries (window functions, CTEs)

Result Analysis:

Interprets query results in business context
Generates natural language summaries
Identifies trends and anomalies

Visualization Generation:

Plotly chart generation for numeric results
Supports bar charts, line charts, and tables
Automatic chart type selection based on data characteristics

State Management:

The LangGraph agent maintains state across workflow steps, tracking conversation messages, user identity, generated SQL, query results, visualizations, and error context for debugging.

User Interface:

Chainlit Chat Interface:

Conversational UI: Chat-based interaction with message history
Authentication integration: SSO user email extraction for RLS
Streaming responses: Token-by-token LLM output for responsiveness
Embedded visualizations: Plotly charts rendered inline
Query history: Past queries accessible for reference
Error handling: User-friendly error messages for failed queries

Technical Implementation

AWS Bedrock Integration:

The system uses AWS Bedrock Claude Sonnet 4 via LangChain integration, configured with zero temperature for deterministic SQL generation. This ensures consistent query generation for the same natural language inputs, reducing unpredictability in production.

Database Connection Management:

SQLAlchemy connection pooling with security event listeners
Connection-level enforcement of read-only mode and user session variables
Automatic reconnection with security context preservation
Connection and query timeout limits preventing resource exhaustion

Security Testing:

I implemented comprehensive security testing:

RLS validation: Verify users only see authorized business data
Permission denial tests: Confirm unauthorized table access blocked
Modification attempt tests: Verify INSERT/UPDATE/DELETE rejected
SQL injection tests: Confirm malicious queries caught by validation layers
Cross-tenant tests: Verify user A cannot access user B’s businesses

Key Design Decisions

5-layer defense-in-depth:

No single point of failure - multiple independent security controls
Each layer provides defense even if others fail
RLS as primary control (database-enforced, cannot be bypassed)
Application layers (validation, allowlist) provide fail-fast and observability
Read-only transaction enforced at connection level (transparent to application)

LangGraph for multi-step reasoning:

Enables sophisticated query planning with schema introspection
Supports iterative SQL refinement on validation failures
Maintains conversation context for follow-up questions
Separates concerns (understanding, generation, validation, execution)

AWS Bedrock Claude Sonnet 4:

Strong reasoning capabilities for complex analytical queries
Excellent SQL generation quality reducing validation failures
Supports long context for schema documentation
Temperature 0 for deterministic query generation

Session-scoped RLS enforcement:

User email set once per connection at connection time
Session variables connection-scoped (no cross-user leakage)
Leverages existing RLS infrastructure from data platform
Transparent to application code (database-enforced)

Results & Impact

Deployed AI-powered financial data analyst enabling natural language queries for non-technical users
Implemented 5-layer defense-in-depth security architecture guaranteeing multi-tenant data isolation
Leveraged existing Multi-Source Financial Data Platform with Row-Level Security.
Built LangGraph agent with schema-aware SQL generation handling complex analytical queries
Integrated Chainlit chat interface with SSO authentication and embedded visualizations
Achieved zero data breaches or cross-tenant access incidents through layered security controls
Enabled self-service analytics reducing manual reporting burden

Technologies

Python, LangGraph, LangChain, AWS Bedrock, Anthropic, Claude Sonnet 4, PostgreSQL, Row-Level Security (RLS), SQLAlchemy, Chainlit, Docker, Pydantic