🌊 System Data Flow

Understanding how data moves through the Azure Governance Platform is critical for troubleshooting, optimization, and scaling.


High-Level Data Flow

External Sources → Platform → Data Layer → Consumers

┌───────────────────────────────────────────────────────────┐
│                     EXTERNAL SOURCES                      │
│  ┌──────────┐  ┌──────────────┐  ┌──────────────┐         │
│  │ Azure    │  │ Azure Cost   │  │ Azure AD     │         │
│  │ ARM API  │  │ Management   │  │ B2C          │         │
│  └──────────┘  └──────────────┘  └──────────────┘         │
└───────────────────────────────────────────────────────────┘
                              │
                              ▼
┌───────────────────────────────────────────────────────────┐
│                 AZURE GOVERNANCE PLATFORM                 │
│  ┌────────────┐  ┌──────────┐  ┌──────────┐  ┌───────┐    │
│  │ FastAPI    │  │ Service  │  │ Redis    │  │Workers│    │
│  │ App        │  │ Layer    │  │ Cache    │  │       │    │
│  └────────────┘  └──────────┘  └──────────┘  └───────┘    │
└───────────────────────────────────────────────────────────┘
                              │
                              ▼
┌───────────────────────────────────────────────────────────┐
│                        DATA LAYER                         │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐                 │
│  │ Azure    │  │ Blob     │  │ Key      │                 │
│  │ SQL      │  │ Storage  │  │ Vault    │                 │
│  └──────────┘  └──────────┘  └──────────┘                 │
└───────────────────────────────────────────────────────────┘
                              │
                              ▼
┌───────────────────────────────────────────────────────────┐
│                      DATA CONSUMERS                       │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐                 │
│  │ React    │  │ Azure    │  │ Alert    │                 │
│  │ Web App  │  │ Workbooks│  │ System   │                 │
│  └──────────┘  └──────────┘  └──────────┘                 │
└───────────────────────────────────────────────────────────┘

Request Lifecycle

9-Step API Request Flow

Average Response Time: ~130ms (p95: ~532ms)

  1. Request Ingress: Client sends HTTP request to Azure Front Door → App Service
  2. Authentication & Validation: FastAPI validates JWT (Azure AD B2C), checks rate limits
  3. Tenant Resolution: Extract tenant_id from JWT, set SQL context for RLS
  4. Cache Lookup: Check Redis cache for data (15-min TTL)
  5. Database Query: Query Azure SQL with tenant filters (RLS applied)
  6. Business Logic: Service layer processes, transforms, enriches data
  7. Response Serialization: Pydantic models serialize to JSON
  8. Telemetry & Logging: App Insights captures metrics and trace data
  9. Response Delivery: JSON response with security headers
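
Steps 3 to 5 follow a cache-aside pattern, which can be sketched as below. This is a minimal illustration under stated assumptions, not the platform's actual code: `TTLCache` is an in-memory stand-in for Redis, `resolve_tenant`, `get_resources`, and `query_db` are hypothetical names, and the JWT is assumed to have been validated already in step 2.

```python
import time
from typing import Any, Callable, Dict, List, Optional, Tuple

RESOURCE_TTL_SECONDS = 15 * 60  # 15-minute TTL from step 4


class TTLCache:
    """In-memory stand-in for Redis (hypothetical, for illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key: str, value: Any, ttl: float) -> None:
        self._items[key] = (self._clock() + ttl, value)


def resolve_tenant(claims: Dict[str, Any]) -> str:
    """Step 3: read tenant_id from already-validated JWT claims."""
    tenant_id = claims.get("tenant_id")
    if not tenant_id:
        raise PermissionError("token carries no tenant_id claim")
    return str(tenant_id)


def get_resources(
    claims: Dict[str, Any],
    cache: TTLCache,
    query_db: Callable[[str], List[Dict[str, Any]]],
) -> List[Dict[str, Any]]:
    """Steps 3-5: resolve the tenant, try the cache, fall back to the database."""
    tenant_id = resolve_tenant(claims)
    key = f"resources:{tenant_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    # In the real system the SQL session context would be set here so that
    # RLS filters rows by tenant; query_db is a placeholder for that query.
    rows = query_db(tenant_id)
    cache.set(key, rows, RESOURCE_TTL_SECONDS)
    return rows
```

Passing the clock and the query function in as arguments keeps the sketch testable without a live Redis or Azure SQL instance.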


Azure Integration Flows

Resource Discovery Flow

Frequency: Every 6 hours per tenant
Duration: ~2-5 minutes per tenant

Sync Worker → Azure ARM API → Resource Parser → Azure SQL → Redis Cache

Process:

  1. Sync Worker authenticates with Managed Identity
  2. Queries Azure ARM API for resource list
  3. For each resource: get details, normalize, enrich
  4. Upsert to Azure SQL (with tenant_id)
  5. Invalidate Redis cache
  6. Update last sync timestamp
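
The process above can be sketched as a single sync function. This is a rough outline only: plain dictionaries stand in for Azure SQL, Redis, and the sync-state store, `list_resources` is a hypothetical callable wrapping the ARM query (authentication from step 1 is assumed to happen inside it), and the normalized field names are assumptions.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, Iterable, Tuple


def normalize(raw: Dict[str, Any], tenant_id: str) -> Dict[str, Any]:
    """Step 3: map a raw ARM-style record to a flat row (field names assumed)."""
    return {
        "id": raw["id"].lower(),  # ARM resource IDs are case-insensitive
        "tenant_id": tenant_id,
        "name": raw.get("name", ""),
        "type": raw.get("type", "").lower(),
        "location": raw.get("location", "unknown"),
    }


def sync_tenant(
    tenant_id: str,
    list_resources: Callable[[str], Iterable[Dict[str, Any]]],
    table: Dict[Tuple[str, str], Dict[str, Any]],
    cache: Dict[str, Any],
    sync_state: Dict[str, datetime],
) -> int:
    """Run one discovery pass for a tenant and return the number of rows written."""
    count = 0
    for raw in list_resources(tenant_id):              # step 2: query resource list
        row = normalize(raw, tenant_id)                # step 3: normalize, enrich
        table[(tenant_id, row["id"])] = row            # step 4: upsert keyed by tenant + id
        count += 1
    cache.pop(f"resources:{tenant_id}", None)          # step 5: invalidate cached list
    sync_state[tenant_id] = datetime.now(timezone.utc)  # step 6: record last sync
    return count
```

Keying the upsert on `(tenant_id, id)` makes repeated runs idempotent, so a rerun after a partial failure does not duplicate rows.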

Cost Data Ingestion Flow

Frequency: Daily at 2 AM UTC
Retention: 13 months

Job Scheduler → Cost Management API → Data Transformer → Azure SQL → App Insights

Process:

  1. Scheduler requests usage data (daily)
  2. API returns CSV/JSON usage details
  3. Data transformer parses and maps to internal schema
  4. Bulk insert to Azure SQL (partitioned by tenant)
  5. Log metrics to Application Insights
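
Step 3 of this process, parsing usage details and mapping them to an internal schema, might look like the sketch below for the CSV case. The source column names (`Date`, `ResourceId`, `Cost`, `Currency`) and the internal names are assumptions chosen for illustration; the real export format and schema are not given here.

```python
import csv
import io
from datetime import date
from decimal import Decimal
from typing import Any, Dict, List

# Hypothetical mapping from source CSV headers to internal column names.
COLUMN_MAP = {
    "Date": "usage_date",
    "ResourceId": "resource_id",
    "Cost": "cost",
    "Currency": "currency",
}


def transform_usage_csv(text: str, tenant_id: str) -> List[Dict[str, Any]]:
    """Parse usage CSV text into rows ready for a bulk insert."""
    rows: List[Dict[str, Any]] = []
    for record in csv.DictReader(io.StringIO(text)):
        row: Dict[str, Any] = {
            internal: record[source].strip()
            for source, internal in COLUMN_MAP.items()
        }
        row["usage_date"] = date.fromisoformat(row["usage_date"])
        row["cost"] = Decimal(row["cost"])  # Decimal avoids float rounding on money
        row["resource_id"] = row["resource_id"].lower()
        row["tenant_id"] = tenant_id  # partition key for the bulk insert
        rows.append(row)
    return rows
```

A missing column raises `KeyError` on the first record, which surfaces schema drift in the source data early instead of writing partial rows.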

Data Connections Matrix

Internal Connections

Source        Destination   Protocol     Purpose        Volume
App Service   Azure SQL     ODBC + SSL   Data queries   ~500 req/min
App Service   Redis         Redis        Caching        ~1000 ops/min
App Service   Key Vault     HTTPS        Secrets        ~10 req/min
Workers       Azure SQL     ODBC + SSL   Data writes    ~200 writes/min

External Connections

Source     Destination    Protocol   Purpose
Platform   Azure ARM      HTTPS      Resource discovery
Platform   Cost Mgmt      HTTPS      Cost data
Platform   Azure AD B2C   HTTPS      Authentication
Platform   App Insights   HTTPS      Telemetry

Data Storage Architecture

Azure SQL Database

Key Tables:

  • tenants - Tenant configuration (50 rows)
  • resources - Resource inventory (~2,500 rows)
  • cost_data - Daily cost records (~15,000 rows)
  • compliance_scores - Assessments (~500 rows)
  • users - User identities (~200 rows)
  • audit_logs - Audit trail (~50,000 rows)

Redis Cache

Cache Patterns:

  • session:{user_id} → User session (8h TTL)
  • resources:{tenant_id} → Resource list (15m TTL)
  • costs:{tenant_id}:{month} → Cost summary (1h TTL)
  • health:{tenant_id} → Health status (5m TTL)

Hit Rate: ~85%
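
One way to keep these key patterns and TTLs consistent across the codebase is a single helper that builds the key and returns the matching TTL. The sketch below is illustrative; `cache_key` and `TTL_SECONDS` are hypothetical names, and only the key shapes and TTL values come from the list above.

```python
from typing import Tuple

# TTLs in seconds, taken from the cache patterns listed above.
TTL_SECONDS = {
    "session": 8 * 60 * 60,   # 8h
    "resources": 15 * 60,     # 15m
    "costs": 60 * 60,         # 1h
    "health": 5 * 60,         # 5m
}


def cache_key(kind: str, *parts: str) -> Tuple[str, int]:
    """Build a key such as 'costs:{tenant_id}:{month}' and return it with its TTL."""
    if kind not in TTL_SECONDS:
        raise ValueError(f"unknown cache kind: {kind!r}")
    if not parts or any(not p or ":" in p for p in parts):
        raise ValueError("key parts must be non-empty and must not contain ':'")
    return ":".join((kind, *parts)), TTL_SECONDS[kind]
```

For example, `cache_key("costs", "t-1", "2024-01")` yields the key `costs:t-1:2024-01` with a TTL of 3600 seconds. Rejecting `:` inside a part prevents one tenant's identifier from colliding with another key's namespace.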

Blob Storage

Containers:

  • reports/ → Generated PDFs
  • exports/ → CSV exports
  • backups/ → Daily DB backups
  • logs/ → Application logs

Total Size: ~12GB


Performance Characteristics

Query Performance

Query Type      Average   p95     Cache Hit
Resource List   45ms      120ms   85%
Cost Summary    30ms      80ms    90%
Compliance      25ms      60ms    95%
User Lookup     15ms      40ms    80%

Throughput

Metric             Current     Capacity
Requests/Second    ~15         ~150
Concurrent Users   ~10         ~100
Data Ingestion     ~1MB/hour   ~10MB/hour

Data Security in Transit

Encryption

Connection        Protocol          Encryption
Client → App      HTTPS (TLS 1.3)   TLS_AES_256_GCM_SHA384
App → SQL         ODBC + SSL        AES-256
App → Redis       Redis + TLS       AES-256
App → Key Vault   HTTPS             TLS 1.3

Network Security

  • ✅ Private Endpoints: SQL, Key Vault, Storage
  • ✅ Firewall Rules: Only App Service IPs allowed
  • ✅ VNet Integration: App Service in isolated VNet

Troubleshooting Data Flow

Common Issues

Slow resource list loading:

  • Check Redis cache hit rate (target: >80%)
  • Review SQL query execution plan
  • Verify the connection pool is not exhausted
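
For the first check, the hit rate can be derived from the `keyspace_hits` and `keyspace_misses` counters that Redis reports in the stats section of `INFO`. The sketch below assumes those counters have already been fetched into a mapping; the function names are hypothetical.

```python
from typing import Mapping


def cache_hit_rate(stats: Mapping[str, int]) -> float:
    """Return hits / (hits + misses), or 0.0 when no lookups were recorded."""
    hits = int(stats.get("keyspace_hits", 0))
    misses = int(stats.get("keyspace_misses", 0))
    total = hits + misses
    return hits / total if total else 0.0


def meets_target(stats: Mapping[str, int], target: float = 0.80) -> bool:
    """True when the hit rate is strictly above the target (>80% by default)."""
    return cache_hit_rate(stats) > target
```

These counters are cumulative for the whole Redis instance, so comparing two samples taken a few minutes apart gives a more current picture than a single reading.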

Stale cost data:

  • Check sync job last run time
  • Verify Azure Cost Management API access
  • Review Service Bus queue depth
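
The first check can be automated with a simple age test on the last successful run. Because ingestion runs daily at 2 AM UTC, a run older than roughly one day suggests a missed job. The sketch below is a minimal example; the two-hour grace period and the function name are assumptions, not values from the platform.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Daily schedule plus an assumed two-hour grace period for slow runs.
MAX_AGE = timedelta(hours=24) + timedelta(hours=2)


def is_cost_data_stale(
    last_run: Optional[datetime],
    now: Optional[datetime] = None,
    max_age: timedelta = MAX_AGE,
) -> bool:
    """Return True when the last ingestion run is missing or older than max_age.

    Both datetimes are expected to be timezone-aware (UTC).
    """
    if last_run is None:
        return True  # the job has never completed
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_run > max_age
```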

High database CPU:

  • Identify expensive queries in Query Store
  • Check for missing indexes
  • Review sync job timing

Data Flow v1.8.1 | Understanding the System