System Data Flow
Understanding how data moves through the Azure Governance Platform is critical for troubleshooting, optimization, and scaling.
High-Level Data Flow
External Sources → Platform → Data Layer → Consumers
┌─────────────────────────────────────────────────────────┐
│                    EXTERNAL SOURCES                     │
│  ┌──────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │  Azure   │  │  Azure Cost  │  │   Azure AD   │       │
│  │ ARM API  │  │  Management  │  │     B2C      │       │
│  └──────────┘  └──────────────┘  └──────────────┘       │
└─────────────────────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                AZURE GOVERNANCE PLATFORM                │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌─────────┐  │
│  │ FastAPI  │  │ Service  │  │  Redis   │  │ Workers │  │
│  │   App    │  │  Layer   │  │  Cache   │  │         │  │
│  └──────────┘  └──────────┘  └──────────┘  └─────────┘  │
└─────────────────────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                       DATA LAYER                        │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐               │
│  │  Azure   │  │   Blob   │  │   Key    │               │
│  │   SQL    │  │ Storage  │  │  Vault   │               │
│  └──────────┘  └──────────┘  └──────────┘               │
└─────────────────────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                     DATA CONSUMERS                      │
│  ┌──────────┐  ┌───────────┐  ┌──────────┐              │
│  │  React   │  │   Azure   │  │  Alert   │              │
│  │ Web App  │  │ Workbooks │  │  System  │              │
│  └──────────┘  └───────────┘  └──────────┘              │
└─────────────────────────────────────────────────────────┘
Request Lifecycle
9-Step API Request Flow
Average Response Time: ~130ms (p95: ~532ms)
1. Request Ingress: Client sends HTTP request to Azure Front Door → App Service
2. Authentication & Validation: FastAPI validates JWT (Azure AD B2C), checks rate limits
3. Tenant Resolution: Extract tenant_id from JWT, set SQL context for RLS
4. Cache Lookup: Check Redis cache for data (15-min TTL)
5. Database Query: Query Azure SQL with tenant filters (RLS applied)
6. Business Logic: Service layer processes, transforms, enriches data
7. Response Serialization: Pydantic models serialize to JSON
8. Telemetry & Logging: App Insights captures metrics and trace data
9. Response Delivery: JSON response with security headers
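The Cache Lookup and Database Query steps read like a cache-aside pattern: serve from the cache when a fresh entry exists, otherwise query the database and store the result. The following is a minimal sketch under that assumption, not the platform's actual code. `InMemoryCache` is a stand-in for Redis so the example runs without a server, and `get_resources` and `query_db` are hypothetical names.

```python
from __future__ import annotations

import json
import time
from typing import Callable

CACHE_TTL_SECONDS = 15 * 60  # 15-minute TTL from the Cache Lookup step


class InMemoryCache:
    """Stand-in for Redis (assumption) so the sketch runs without a server."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: dict[str, tuple[float, str]] = {}

    def get(self, key: str) -> str | None:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired entries behave like a miss
            return None
        return value

    def setex(self, key: str, ttl_seconds: int, value: str) -> None:
        self._store[key] = (self._clock() + ttl_seconds, value)


def get_resources(
    tenant_id: str,
    cache: InMemoryCache,
    query_db: Callable[[str], list[dict]],
) -> list[dict]:
    """Return cached rows for the tenant, or query the database and cache them."""
    key = f"resources:{tenant_id}"  # key is scoped to one tenant
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    rows = query_db(tenant_id)  # hypothetical tenant-filtered query
    cache.setex(key, CACHE_TTL_SECONDS, json.dumps(rows))
    return rows
```

Including the tenant ID in the cache key keeps one tenant's cached rows from being served to another, which mirrors what RLS enforces on the SQL side.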
Azure Integration Flows
Resource Discovery Flow
Frequency: Every 6 hours per tenant
Duration: ~2-5 minutes per tenant
Sync Worker → Azure ARM API → Resource Parser → Azure SQL → Redis Cache
Process:
- Sync Worker authenticates with Managed Identity
- Queries Azure ARM API for resource list
- For each resource: get details, normalize, enrich
- Upsert to Azure SQL (with tenant_id)
- Invalidate Redis cache
- Update last sync timestamp
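The process above can be sketched as a single sync pass. This is an illustration only: plain dictionaries stand in for Azure SQL, Redis, and the sync-state store, Managed Identity authentication is omitted, and `normalize`, `sync_tenant`, and the row field names are assumptions rather than the platform's real schema.

```python
from __future__ import annotations

from datetime import datetime, timezone
from typing import Callable, Iterable


def normalize(raw: dict, tenant_id: str) -> dict:
    """Flatten one raw resource into a row; the field names are assumed."""
    return {
        "id": raw["id"].lower(),  # ARM resource IDs are case-insensitive
        "tenant_id": tenant_id,
        "name": raw.get("name", ""),
        "type": raw.get("type", "").lower(),
        "location": raw.get("location", ""),
    }


def sync_tenant(
    tenant_id: str,
    list_resources: Callable[[str], Iterable[dict]],
    table: dict,
    cache: dict,
    last_sync: dict,
) -> int:
    """Run one sync pass for a tenant and return the number of rows upserted."""
    rows = [normalize(raw, tenant_id) for raw in list_resources(tenant_id)]
    for row in rows:
        # Upsert: the (tenant_id, id) key overwrites an existing row or adds a new one.
        table[(tenant_id, row["id"])] = row
    # Invalidate the cached list so the next read reloads from the table.
    cache.pop(f"resources:{tenant_id}", None)
    last_sync[tenant_id] = datetime.now(timezone.utc)
    return len(rows)
```

Because the write is an upsert keyed by tenant and resource ID, running the same pass twice leaves the table unchanged, which makes a retry after a partial failure safe in this sketch.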
Cost Data Ingestion Flow
Frequency: Daily at 2 AM UTC
Retention: 13 months
Job Scheduler → Cost Management API → Data Transformer → Azure SQL → App Insights
Process:
- Scheduler requests usage data (daily)
- API returns CSV/JSON usage details
- Data transformer parses and maps to internal schema
- Bulk insert to Azure SQL (partitioned by tenant)
- Log metrics to Application Insights
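As a rough illustration of the transformer step, the sketch below parses CSV usage details and maps them to an internal row shape. The column names in `COLUMN_MAP` and the internal field names are hypothetical; the real export columns and schema are not given here and may differ.

```python
from __future__ import annotations

import csv
import io
from datetime import date
from decimal import Decimal

# Hypothetical mapping from export column names to internal field names.
COLUMN_MAP = {
    "Date": "usage_date",
    "ResourceId": "resource_id",
    "Cost": "cost",
    "Currency": "currency",
}


def transform_usage_csv(csv_text: str, tenant_id: str) -> list[dict]:
    """Parse CSV usage details into rows ready for a bulk insert."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        row = {internal: record[external] for external, internal in COLUMN_MAP.items()}
        row["usage_date"] = date.fromisoformat(row["usage_date"])
        row["cost"] = Decimal(row["cost"])  # Decimal avoids float rounding on money
        row["tenant_id"] = tenant_id  # every row carries its tenant for partitioning
        rows.append(row)
    return rows
```

A missing column raises `KeyError` immediately, so a changed export format fails the job loudly instead of inserting partial rows.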
Data Connections Matrix
Internal Connections
| Source | Destination | Protocol | Purpose | Volume |
|---|---|---|---|---|
| App Service | Azure SQL | ODBC + SSL | Data queries | ~500 req/min |
| App Service | Redis | Redis | Caching | ~1000 ops/min |
| App Service | Key Vault | HTTPS | Secrets | ~10 req/min |
| Workers | Azure SQL | ODBC + SSL | Data writes | ~200 writes/min |
External Connections
| Source | Destination | Protocol | Purpose |
|---|---|---|---|
| Platform | Azure ARM | HTTPS | Resource discovery |
| Platform | Cost Mgmt | HTTPS | Cost data |
| Platform | Azure AD B2C | HTTPS | Authentication |
| Platform | App Insights | HTTPS | Telemetry |
Data Storage Architecture
Azure SQL Database
Key Tables:
- tenants: Tenant configuration (50 rows)
- resources: Resource inventory (~2,500 rows)
- cost_data: Daily cost records (~15,000 rows)
- compliance_scores: Assessments (~500 rows)
- users: User identities (~200 rows)
- audit_logs: Audit trail (~50,000 rows)
Redis Cache
Cache Patterns:
- session:{user_id} → User session (8h TTL)
- resources:{tenant_id} → Resource list (15m TTL)
- costs:{tenant_id}:{month} → Cost summary (1h TTL)
- health:{tenant_id} → Health status (5m TTL)
Hit Rate: ~85%
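One way to keep the key patterns and TTLs above consistent across the codebase is to define them in a single place. The helper below is a sketch of that idea; `cache_key`, `ttl_seconds`, and the `YYYY-MM` month format in the usage example are assumptions, not names or formats confirmed by the platform.

```python
from __future__ import annotations

from datetime import timedelta

# TTL per key family, taken from the cache patterns listed above.
TTL = {
    "session": timedelta(hours=8),
    "resources": timedelta(minutes=15),
    "costs": timedelta(hours=1),
    "health": timedelta(minutes=5),
}


def cache_key(kind: str, *parts: str) -> str:
    """Build a key such as 'costs:{tenant_id}:{month}' for a known key family."""
    if kind not in TTL:
        raise KeyError(f"unknown cache key family: {kind}")
    return ":".join((kind, *parts))


def ttl_seconds(kind: str) -> int:
    """Return the TTL for a key family in whole seconds."""
    return int(TTL[kind].total_seconds())
```

For example, `cache_key("costs", "t1", "2024-05")` yields `costs:t1:2024-05`, and `ttl_seconds("resources")` yields `900`.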
Blob Storage
Containers:
- reports/ → Generated PDFs
- exports/ → CSV exports
- backups/ → Daily DB backups
- logs/ → Application logs
Total Size: ~12GB
Performance Characteristics
Query Performance
| Query Type | Average | p95 | Cache Hit |
|---|---|---|---|
| Resource List | 45ms | 120ms | 85% |
| Cost Summary | 30ms | 80ms | 90% |
| Compliance | 25ms | 60ms | 95% |
| User Lookup | 15ms | 40ms | 80% |
Throughput
| Metric | Current | Capacity |
|---|---|---|
| Requests/Second | ~15 | ~150 |
| Concurrent Users | ~10 | ~100 |
| Data Ingestion | ~1MB/hour | ~10MB/hour |
Data Security in Transit
Encryption
| Connection | Protocol | Cipher |
|---|---|---|
| Client → App | HTTPS (TLS 1.3) | TLS_AES_256_GCM_SHA384 |
| App → SQL | ODBC + SSL | AES-256 |
| App → Redis | Redis + TLS | AES-256 |
| App → Key Vault | HTTPS (TLS 1.3) | Not specified |
Network Security
- Private Endpoints: SQL, Key Vault, Storage
- Firewall Rules: Only App Service IPs allowed
- VNet Integration: App Service in isolated VNet
Troubleshooting Data Flow
Common Issues
Slow resource list loading:
- Check Redis cache hit rate (target: >80%)
- Review SQL query execution plan
- Verify connection pool not exhausted
Stale cost data:
- Check sync job last run time
- Verify Azure Cost Management API access
- Review Service Bus queue depth
High database CPU:
- Identify expensive queries in Query Store
- Check for missing indexes
- Review sync job timing
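For the cache hit rate check listed under slow resource list loading, Redis reports `keyspace_hits` and `keyspace_misses` in the stats section of its `INFO` output. The sketch below computes the rate from a dictionary of those counters, such as the one a Redis client returns for the stats section. The function names are hypothetical, and the 80% default comes from the target stated above.

```python
from __future__ import annotations


def cache_hit_rate(stats: dict) -> float | None:
    """Return hits / (hits + misses), or None when no lookups were recorded."""
    hits = int(stats.get("keyspace_hits", 0))
    misses = int(stats.get("keyspace_misses", 0))
    total = hits + misses
    if total == 0:
        return None
    return hits / total


def meets_target(stats: dict, target: float = 0.80) -> bool:
    """True when the hit rate is above the target; False if below or unknown."""
    rate = cache_hit_rate(stats)
    return rate is not None and rate > target
```

These counters are server-wide and cumulative since the last restart or stats reset, so comparing two samples taken a few minutes apart gives a better picture of current behavior than a single reading.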
Data Flow v1.8.1 | Understanding the System