# Cost Service

Per-vehicle cost estimation service for capacity planning and cloud vs on-prem comparison.

## Overview

This service estimates the cost of running cloud services per VIN by:

1. Querying vehicle activity from ClickHouse (message counts)
2. Estimating resource usage based on activity level
3. Applying cost rates for cloud vs on-prem hosting
4. Storing aggregated cost data for reporting

## Cost Estimation Methodology

### Cost Model

The cost model separates fixed platform costs from variable per-VIN costs:

```
Total Cost = Platform Base Cost + (Per-VIN Cost × Number of VINs) + Managed Services
```

Whether you have 100 vehicles or 100,000, you still need Kafka, databases, and gateway services running. That is the platform base cost. Each additional vehicle then adds a small marginal cost on top.

### Platform Base Resources (Fixed)

What it takes to run the cloud services platform (from migration plan v1.1):

| Component | CPU (cores) | Memory (GB) | Notes |
|-----------|-------------|-------------|-------|
| VM 1-4: Core Services | 16 | 64 | Kafka, OTA, Valet, Auth/APIs |
| VM 5: Analytics Primary | 32 | 256 | ClickHouse, Ditto, Beacon, Jetfire |
| VM 6: Analytics Secondary | 32 | 256 | Optimus, Cargo, Vehicle Analytics |
| **Total Platform Base** | **80** | **544** | Based on migration plan |

Note: the per-VM memory figures above sum to 576 GB, while the service models 544 GB (the value shown in the example report).

### Per-VIN Resources (Marginal)

Incremental resources needed for each additional connected vehicle:

| Activity Level | Messages/15min | CPU (millicores) | Memory (MB) |
|---------------|----------------|------------------|-------------|
| Low | < 100 | 3 | 10 |
| Medium | 100-1000 | 5 | 15 |
| High | > 1000 | 6 | 20 |

**Derivation (from cec-prd-cluster-1 metrics, 2026-02-05):**

- Event Hubs: ~90M messages/hour, ~230GB/hour incoming
- Active VINs: ~3,760
- AKS cluster usage: ~43 cores / ~219GB RAM
- Estimated platform base: ~34 cores / ~175GB (ClickHouse, Kafka consumers, services)
- Per-VIN marginal: (43-34)/3760 ≈ 2.4mc, (219-175)/3760 ≈ 12MB
- Rounded to 3mc / 10MB for the Low tier (CPU rounded up for a conservative estimate; memory rounded down from ~12MB)

### Cost Rates

| Resource | Cloud (Azure) | On-Prem/Bare Metal |
|----------|---------------|-------------------|
| CPU/core-hour | $0.35 | $0.015 |
| Memory/GB-hour | $0.10 | $0.004 |
| Managed Services/15min | $17.00 | $1.50 |

#### Why On-Prem is ~90-95% Cheaper

- **Platform base**: Same workload, but the cloud compute rates above are ~23-25x the on-prem rates (CPU $0.35 vs $0.015, memory $0.10 vs $0.004)
- **Per-VIN compute**: Cloud VMs cost ~20x more than amortized bare metal
- **Managed services**: Event Hubs, CosmosDB, and Azure DB for PostgreSQL carry a significant markup vs self-hosted (~11x at the rates above: $17.00 vs $1.50 per 15 minutes)

### Savings Calculation

```
Platform Base Cost = (80 cores × CPU rate + 544 GB × memory rate) × hours
Per-VIN Cost       = (0.003 cores × CPU rate + 0.01 GB × memory rate) × hours × activity_multiplier
Total Cost         = Platform Base + (Per-VIN × VIN count) + Managed Services

Cloud Cost   = Total with cloud rates
On-Prem Cost = Total with on-prem rates
Savings      = Cloud Cost - On-Prem Cost
Savings %    = (Savings / Cloud Cost) × 100
```

Expected savings: **~90-95%** with on-prem/bare metal hosting.

### Projected Annual Costs

Based on ~$1M/year Azure spend (migration plan v1.1):

| Metric | Cloud | On-Prem |
|--------|-------|---------|
| Monthly Cost | ~$83,000 | ~$7,000 |
| Annual Cost | ~$1,000,000 | ~$84,000 |
| Annual Savings | ~$916,000 (92%) | - |

## API Endpoints

### GET /cost/vin/{vin}

Cost summary for a specific VIN.

### GET /cost/fleet

Fleet-wide cost summary with top cost VINs.

### GET /cost/summary?period=day|week|month

High-level cost summary for a time period.

### GET /cost/comparison

Cloud vs on-prem cost comparison with projected annual savings.

### GET /cost/report

Plain text report for terminal viewing.

## Accessing the Report

The service is deployed internally on cec-prd-cluster-1 (no public ingress).
To view the report:

```bash
# Quick one-liner: run a throwaway curl pod inside the cluster
kubectl --context cec-prd-cluster-1 run curl-test --image=curlimages/curl --rm -it --restart=Never -- \
  curl -s http://cost.default.svc.cluster.local:8077/cost/report

# Or port-forward and curl locally
kubectl --context cec-prd-cluster-1 port-forward svc/cost 8077:8077 &
sleep 2  # give the port-forward a moment to start listening
curl http://localhost:8077/cost/report
```

## Example Report Output

```
╔══════════════════════════════════════════════════════════════════╗
║                       COST SERVICE REPORT                        ║
╠══════════════════════════════════════════════════════════════════╣
║ Period: 2026-01-05 to 2026-02-05
╠══════════════════════════════════════════════════════════════════╣
║ FLEET OVERVIEW                                                   ║
║ ─────────────────────────────────────────────────────────────── ║
║ Active Vehicles:  2463
║ Cloud Cost:       $695.52
║ On-Prem Cost:     $62.55
║ Savings:          $632.97 (91.0%)
╠══════════════════════════════════════════════════════════════════╣
║ RESOURCE USAGE MODEL                                             ║
║ ─────────────────────────────────────────────────────────────── ║
║ Platform Base:    80 cores / 544 GB RAM (fixed)
║ Per-VIN Marginal: 3 millicores / 10 MB RAM
║ Total Fleet:      87.4 cores / 568.6 GB RAM
╠══════════════════════════════════════════════════════════════════╣
║ COST FORMULA                                                     ║
║ ─────────────────────────────────────────────────────────────── ║
║ (Platform Base) + (Per-VIN × 2463 VINs) + Managed Services
╠══════════════════════════════════════════════════════════════════╣
║ COST RATES                                                       ║
║ ─────────────────────────────────────────────────────────────── ║
║ Cloud:      CPU $0.35/core-hr   Memory $0.100/GB-hr
║ On-Prem:    CPU $0.01/core-hr   Memory $0.004/GB-hr
║ Base Infra: Cloud $17.00/15min  On-Prem $1.50/15min
╠══════════════════════════════════════════════════════════════════╣
║ ANNUAL PROJECTION (based on current usage)                       ║
║ ─────────────────────────────────────────────────────────────── ║
║ Cloud Annual:     $8346.25
║ On-Prem Annual:   $750.61
║ Annual Savings:   $7595.65
╚══════════════════════════════════════════════════════════════════╝

TOP COST VEHICLES:
VIN                 CPU (mc) RAM (MB)    Cloud $  On-Prem $ Savings %
─────────────────── ──────── ──────── ────────── ────────── ────────
VCF1UBU21RG013084         94      149       1.20       0.12    90.4%
VCF1EBU21RG012448         94      149       1.20       0.12    90.4%
VCF1EBU22PG011385         94      149       1.20       0.12    90.4%
VCF1EBU24PG011467         94      149       1.20       0.12    90.4%
VCF1EBU29PG007298         94      149       1.20       0.12    90.4%
...
```

*Note: Report generated 2026-02-05. Costs accumulate over time as the collector runs every 15 minutes.*

## Configuration

| Env Var | Description | Default |
|---------|-------------|---------|
| CLICKHOUSE_HOST | Local ClickHouse host for storing cost data | localhost |
| REMOTE_CLICKHOUSE_HOST | Dev cluster ClickHouse host, queried for VIN activity | - |
| COLLECTOR_INTERVAL_MINUTES | How often to collect metrics, in minutes | 15 |

## Limitations

- Resource estimates are approximations, not actual measurements
- Cost rates are simplified and don't reflect all real-world factors
- On-prem costs exclude significant operational overhead
- Designed for business case illustration, not precise billing
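
## Worked Example

The formulas under Savings Calculation and the per-VIN derivation can be sketched in a few lines of Python. This is an illustrative sketch, not the service's actual code: the names (`Rates`, `total_cost`, `savings`, `per_vin_marginal`) are hypothetical, and it assumes the managed-services rate is charged once per 15-minute interval (four times per hour) and that a single `activity_multiplier` applies to the whole fleet, defaulting to 1.0 because this README does not list the multiplier values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Cost rates taken from the Cost Rates table."""
    cpu_core_hour: float      # $ per core-hour
    mem_gb_hour: float        # $ per GB-hour
    managed_per_15min: float  # $ per 15-minute interval


CLOUD = Rates(cpu_core_hour=0.35, mem_gb_hour=0.10, managed_per_15min=17.00)
ON_PREM = Rates(cpu_core_hour=0.015, mem_gb_hour=0.004, managed_per_15min=1.50)

# Fixed platform base and Low-tier per-VIN marginal resources from this README.
PLATFORM_CORES = 80
PLATFORM_MEM_GB = 544
PER_VIN_CORES = 0.003   # 3 millicores
PER_VIN_MEM_GB = 0.01   # 10 MB


def total_cost(rates: Rates, vin_count: int, hours: float,
               activity_multiplier: float = 1.0) -> float:
    """Platform base + per-VIN cost x VIN count + managed services."""
    platform = (PLATFORM_CORES * rates.cpu_core_hour
                + PLATFORM_MEM_GB * rates.mem_gb_hour) * hours
    per_vin = (PER_VIN_CORES * rates.cpu_core_hour
               + PER_VIN_MEM_GB * rates.mem_gb_hour) * hours * activity_multiplier
    # Assumption: the managed-services rate applies to each 15-minute interval.
    managed = rates.managed_per_15min * 4 * hours
    return platform + per_vin * vin_count + managed


def savings(vin_count: int, hours: float, activity_multiplier: float = 1.0):
    """Return (cloud cost, on-prem cost, savings, savings %)."""
    cloud = total_cost(CLOUD, vin_count, hours, activity_multiplier)
    on_prem = total_cost(ON_PREM, vin_count, hours, activity_multiplier)
    saved = cloud - on_prem
    return cloud, on_prem, saved, saved / cloud * 100


def per_vin_marginal(cluster_usage: float, platform_base: float,
                     active_vins: int) -> float:
    """Marginal resource per VIN: (cluster usage - platform base) / active VINs."""
    return (cluster_usage - platform_base) / active_vins


if __name__ == "__main__":
    cloud, on_prem, saved, pct = savings(vin_count=2463, hours=1.0)
    print(f"cloud=${cloud:.2f} on_prem=${on_prem:.2f} "
          f"savings=${saved:.2f} ({pct:.1f}%)")
    # CPU derivation from the cluster metrics: (43 - 34) cores over 3,760 VINs.
    print(f"per-VIN CPU: {per_vin_marginal(43, 34, 3760) * 1000:.1f} mc")
```

With the 2,463 active vehicles from the example report and a one-hour window, this sketch gives roughly $155 cloud vs $9.59 on-prem, about 94% savings, which falls inside the expected 90-95% range. The deployed service may weight activity levels differently, so its reported figures will not match exactly.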