Cost Service
Per-vehicle cost estimation service for capacity planning and cloud vs on-prem comparison.
Overview
This service estimates the cost of running cloud services per VIN by:
- Querying vehicle activity from ClickHouse (message counts)
- Estimating resource usage based on activity level
- Applying cost rates for cloud vs on-prem hosting
- Storing aggregated cost data for reporting
Cost Estimation Methodology
Cost Model
The cost model separates fixed platform costs from variable per-VIN costs:
Total Cost = Platform Base Cost + (Per-VIN Cost × Number of VINs) + Managed Services
Whether you have 100 vehicles or 100,000, you still need Kafka, databases, and gateway services running. That's your platform base cost. Then each additional vehicle adds a small marginal cost on top.
Platform Base Resources (Fixed)
What it takes to run the cloud services platform (from migration plan v1.1):
| Component | CPU (cores) | Memory (GB) | Notes |
|---|---|---|---|
| VM 1-4: Core Services | 16 | 64 | Kafka, OTA, Valet, Auth/APIs |
| VM 5: Analytics Primary | 32 | 256 | ClickHouse, Ditto, Beacon, Jetfire |
| VM 6: Analytics Secondary | 32 | 256 | Optimus, Cargo, Vehicle Analytics |
| Total Platform Base | 80 | 544 | Based on migration plan |
Per-VIN Resources (Marginal)
Incremental resources needed for each additional connected vehicle:
| Activity Level | Messages/15min | CPU (millicores) | Memory (MB) |
|---|---|---|---|
| Low | < 100 | 50 | 80 |
| Medium | 100-1000 | 75 | 120 |
| High | > 1000 | 100 | 160 |
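The table above can be read as a simple lookup from message count to estimated resources. The following is a minimal Python sketch of that lookup, not the service's actual code: the names `VinResources` and `estimate_resources` are hypothetical, and treating exactly 100 and exactly 1000 messages as Medium is an assumption, since the table does not say which side the boundaries fall on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VinResources:
    """Estimated marginal resources for one VIN (hypothetical type)."""
    level: str
    cpu_millicores: int
    memory_mb: int


def estimate_resources(messages_per_15min: int) -> VinResources:
    """Map a VIN's message count in one 15-minute window to the table values."""
    if messages_per_15min < 0:
        raise ValueError("message count cannot be negative")
    if messages_per_15min < 100:
        return VinResources("low", 50, 80)
    # Assumption: the boundary values 100 and 1000 both count as medium.
    if messages_per_15min <= 1000:
        return VinResources("medium", 75, 120)
    return VinResources("high", 100, 160)


if __name__ == "__main__":
    for count in (12, 450, 5000):
        print(count, estimate_resources(count))
```

Medium and High are 1.5x and 2x the Low values for both CPU and memory, which is what makes a single activity multiplier sufficient in the cost formulas.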
Cost Rates
| Resource | Cloud (Azure) | On-Prem/Bare Metal |
|---|---|---|
| CPU/core-hour | $0.35 | $0.015 |
| Memory/GB-hour | $0.10 | $0.004 |
| Managed Services/15min | $17.00 | $1.50 |
Why On-Prem is ~90-95% Cheaper
- Platform base: Same workload, but at the rates above cloud CPU and memory cost ~23-25x more than on-prem
- Per-VIN compute: Cloud VMs cost ~23-25x more than amortized bare metal
- Managed services: Event Hubs, CosmosDB, and Azure DB for PostgreSQL carry a significant markup over self-hosted equivalents (~11x at the rates above)
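The multipliers behind these points follow directly from the Cost Rates table. As a quick check, the sketch below divides each cloud rate by its on-prem counterpart; the dictionary keys and the function name are illustrative, not identifiers from the service.

```python
# (cloud, on-prem) rates copied from the Cost Rates table.
RATES = {
    "cpu_core_hour": (0.35, 0.015),
    "memory_gb_hour": (0.10, 0.004),
    "managed_services_15min": (17.00, 1.50),
}


def cloud_to_onprem_ratios(rates):
    """Return how many times more expensive cloud is for each resource."""
    return {name: cloud / onprem for name, (cloud, onprem) in rates.items()}


for name, ratio in cloud_to_onprem_ratios(RATES).items():
    print(f"{name}: {ratio:.1f}x")
# cpu_core_hour: 23.3x
# memory_gb_hour: 25.0x
# managed_services_15min: 11.3x
```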
Savings Calculation
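Platform Base Cost = (80 cores × cpu_rate + 544 GB × mem_rate) × hours
Per-VIN Cost = (0.05 cores × cpu_rate + 0.08 GB × mem_rate) × hours × activity_multiplier
activity_multiplier = 1.0 (low), 1.5 (medium), 2.0 (high), matching the Per-VIN Resources table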
Total Cost = Platform Base + (Per-VIN × VIN count) + Managed Services
Cloud Cost = Total with cloud rates
On-Prem Cost = Total with on-prem rates
Savings = Cloud Cost - On-Prem Cost
Savings % = (Savings / Cloud Cost) × 100
Expected savings: ~90-95% with on-prem/bare metal hosting.
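The formulas above translate into a few lines of Python. This is a sketch under stated assumptions rather than the service's implementation: the function and constant names are hypothetical, the activity multipliers (1.0 / 1.5 / 2.0) are derived from the Per-VIN Resources table, and the managed-services rate is assumed to be charged once per 15-minute interval, i.e. four times per hour.

```python
from dataclasses import dataclass

INTERVALS_PER_HOUR = 4  # assumption: managed services are billed per 15-minute interval


@dataclass(frozen=True)
class Rates:
    cpu_core_hour: float
    memory_gb_hour: float
    managed_per_15min: float


# Values from the Cost Rates table.
CLOUD = Rates(0.35, 0.10, 17.00)
ON_PREM = Rates(0.015, 0.004, 1.50)

PLATFORM_CORES = 80
PLATFORM_MEMORY_GB = 544
BASE_VIN_CORES = 0.05      # 50 millicores (low activity)
BASE_VIN_MEMORY_GB = 0.08  # 80 MB (low activity)
ACTIVITY_MULTIPLIER = {"low": 1.0, "medium": 1.5, "high": 2.0}


def total_cost(rates, hours, vins_by_level):
    """Platform base + per-VIN marginal cost + managed services for one rate set."""
    platform = (PLATFORM_CORES * rates.cpu_core_hour
                + PLATFORM_MEMORY_GB * rates.memory_gb_hour) * hours
    per_vin_base = (BASE_VIN_CORES * rates.cpu_core_hour
                    + BASE_VIN_MEMORY_GB * rates.memory_gb_hour) * hours
    per_vin = sum(per_vin_base * ACTIVITY_MULTIPLIER[level] * count
                  for level, count in vins_by_level.items())
    managed = rates.managed_per_15min * INTERVALS_PER_HOUR * hours
    return platform + per_vin + managed


def savings(hours, vins_by_level):
    """Return (cloud cost, on-prem cost, savings, savings percent)."""
    cloud = total_cost(CLOUD, hours, vins_by_level)
    on_prem = total_cost(ON_PREM, hours, vins_by_level)
    saved = cloud - on_prem
    return cloud, on_prem, saved, saved / cloud * 100


if __name__ == "__main__":
    # Hypothetical fleet mix, for illustration only.
    fleet = {"low": 2000, "medium": 1000, "high": 229}
    cloud, on_prem, saved, pct = savings(hours=1, vins_by_level=fleet)
    print(f"cloud ${cloud:.2f}/hr, on-prem ${on_prem:.2f}/hr, savings {pct:.1f}%")
```

With no vehicles at all, one hour costs $150.40 at cloud rates and about $9.38 at on-prem rates under these assumptions, a saving of roughly 93.8%, which sits inside the expected 90-95% range.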
Projected Annual Costs
Based on ~$1M/year Azure spend (migration plan v1.1):
| Metric | Cloud | On-Prem |
|---|---|---|
| Monthly Cost | ~$83,000 | ~$7,000 |
| Annual Cost | ~$1,000,000 | ~$84,000 |
| Annual Savings | ~$916,000 (92%) | - |
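The figures in this table are rounded, and the arithmetic behind them is short enough to check by hand. The sketch below reproduces it from the two annual totals; `annual_savings` is an illustrative helper, not part of the service.

```python
def annual_savings(cloud_annual, on_prem_annual):
    """Return (absolute savings, savings as a percentage of cloud cost)."""
    saved = cloud_annual - on_prem_annual
    return saved, saved / cloud_annual * 100


cloud_annual = 1_000_000  # ~$1M/year Azure spend
on_prem_annual = 84_000

saved, pct = annual_savings(cloud_annual, on_prem_annual)
print(f"Annual savings: ${saved:,} ({pct:.1f}%)")        # Annual savings: $916,000 (91.6%)
print(f"Monthly cloud: ${cloud_annual / 12:,.0f}")        # Monthly cloud: $83,333
print(f"Monthly on-prem: ${on_prem_annual / 12:,.0f}")    # Monthly on-prem: $7,000
```

The table rounds 91.6% up to 92% and $83,333 down to ~$83,000.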
API Endpoints
GET /cost/vin/{vin}
Cost summary for a specific VIN.
GET /cost/fleet
Fleet-wide cost summary with top cost VINs.
GET /cost/summary?period=day|week|month
High-level cost summary for a time period.
GET /cost/comparison
Cloud vs on-prem cost comparison with projected annual savings.
GET /cost/report
Plain text report for terminal viewing.
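A small client for these endpoints might look like the sketch below. It assumes the service is reachable at `http://localhost:8077` (for example through the port-forward described under Accessing the Report). The response formats are not documented here, so `fetch_text` returns the raw body without parsing it; all helper names are hypothetical.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8077"  # assumption: local port-forward to the service
VALID_PERIODS = ("day", "week", "month")


def vin_url(vin, base=BASE_URL):
    """URL for GET /cost/vin/{vin}."""
    return f"{base}/cost/vin/{quote(vin, safe='')}"


def summary_url(period, base=BASE_URL):
    """URL for GET /cost/summary?period=day|week|month."""
    if period not in VALID_PERIODS:
        raise ValueError(f"period must be one of {VALID_PERIODS}")
    return f"{base}/cost/summary?{urlencode({'period': period})}"


def fetch_text(url, timeout=10.0):
    """Fetch a URL and return the response body as text, unparsed."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Requires a reachable service, so this only runs when executed directly.
    print(fetch_text(f"{BASE_URL}/cost/report"))
```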
Accessing the Report
The service is deployed internally on cec-prd-cluster-1 (no public ingress). To view the report:
# Quick one-liner
kubectl --context cec-prd-cluster-1 run curl-test --image=curlimages/curl --rm -it --restart=Never -- curl -s http://cost.default.svc.cluster.local:8077/cost/report
# Or port-forward and curl locally
kubectl --context cec-prd-cluster-1 port-forward svc/cost 8077:8077 &
curl http://localhost:8077/cost/report
Example Report Output
╔══════════════════════════════════════════════════════════════════╗
║ COST SERVICE REPORT ║
╠══════════════════════════════════════════════════════════════════╣
║ Period: 2026-01-05 to 2026-02-05
╠══════════════════════════════════════════════════════════════════╣
║ FLEET OVERVIEW ║
║ ─────────────────────────────────────────────────────────────── ║
║ Active Vehicles: 3229
║ Cloud Cost: $9761.61
║ On-Prem Cost: $677.88
║ Savings: $9083.73 (93.1%)
╠══════════════════════════════════════════════════════════════════╣
║ RESOURCE USAGE MODEL ║
║ ─────────────────────────────────────────────────────────────── ║
║ Platform Base: 176 cores / 896 GB RAM (fixed)
║ Per-VIN Marginal: 50 millicores / 82 MB RAM
║ Total Fleet: 337.5 cores / 1154.3 GB RAM
╠══════════════════════════════════════════════════════════════════╣
║ COST FORMULA ║
║ ─────────────────────────────────────────────────────────────── ║
║ (Platform Base) + (Per-VIN × 3229 VINs) + Managed Services
╠══════════════════════════════════════════════════════════════════╣
║ COST RATES ║
║ ─────────────────────────────────────────────────────────────── ║
║ Cloud: CPU $0.30/core-hr Memory $0.080/GB-hr
║ On-Prem: CPU $0.02/core-hr Memory $0.005/GB-hr
║ Base Infra: Cloud $10.00/15min On-Prem $2.50/15min
╠══════════════════════════════════════════════════════════════════╣
║ ANNUAL PROJECTION (based on current usage) ║
║ ─────────────────────────────────────────────────────────────── ║
║ Cloud Annual: $117139.28
║ On-Prem Annual: $8134.50
║ Annual Savings: $109004.77
╚══════════════════════════════════════════════════════════════════╝
TOP COST VEHICLES:
VIN CPU (mc) RAM (MB) Cloud $ On-Prem $ Savings %
─────────────────── ──────── ──────── ────────── ────────── ────────
VCF1UBU21PG008884 100 164 14.10 1.04 92.6%
VCF1EBU24PG007242 100 164 14.10 1.04 92.6%
VCF1ZBU29PG006267 100 164 14.10 1.04 92.6%
VCF1EBU26PG007307 75 123 13.41 0.99 92.6%
VCF1EBU22PG011967 50 82 12.77 0.93 92.8%
...
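Note: Report generated 2026-02-05. Costs accumulate over time as the collector runs every 15 minutes. The platform base and rates shown in this sample (176 cores / 896 GB RAM, $0.30/core-hr cloud CPU) differ from the values in the tables above, so read it as an illustration of the report layout rather than of current figures.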
Configuration
| Env Var | Description | Default |
|---|---|---|
| CLICKHOUSE_HOST | Local ClickHouse host for storing cost data | localhost |
| REMOTE_CLICKHOUSE_HOST | Dev cluster ClickHouse host queried for VIN activity | - |
| COLLECTOR_INTERVAL_MINUTES | How often to collect metrics | 15 |
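For illustration, the variables above could be read with their documented defaults as follows. This README does not state the service's implementation language, so this Python sketch is an assumption; the `Config` type and `load_config` function are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Config:
    clickhouse_host: str
    remote_clickhouse_host: Optional[str]  # no default documented; None when unset
    collector_interval_minutes: int


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Build a Config from environment variables, applying the table's defaults."""
    interval = int(env.get("COLLECTOR_INTERVAL_MINUTES", "15"))
    if interval <= 0:
        raise ValueError("COLLECTOR_INTERVAL_MINUTES must be positive")
    return Config(
        clickhouse_host=env.get("CLICKHOUSE_HOST", "localhost"),
        remote_clickhouse_host=env.get("REMOTE_CLICKHOUSE_HOST") or None,
        collector_interval_minutes=interval,
    )


if __name__ == "__main__":
    print(load_config())
```

Passing the environment in as a mapping keeps the function easy to exercise without touching the real process environment.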
Limitations
- Resource estimates are approximations, not actual measurements
- Cost rates are simplified and don't reflect all real-world factors
- On-prem costs exclude significant operational overhead
- Designed for business case illustration, not precise billing