# Cost Service

Per-vehicle cost estimation service for capacity planning and cloud vs on-prem comparison.

## Overview

This service estimates the cost of running cloud services per VIN by:

1. Querying vehicle activity from ClickHouse (message counts)
2. Estimating resource usage based on activity level
3. Applying cost rates for cloud vs on-prem hosting
4. Storing aggregated cost data for reporting

## Cost Estimation Methodology

### Cost Model

The cost model separates fixed platform costs from variable per-VIN costs:

```
Total Cost = Platform Base Cost + (Per-VIN Cost × Number of VINs) + Managed Services
```

Whether you have 100 vehicles or 100,000, you still need Kafka, databases, and gateway services running. That's your platform base cost. Then each additional vehicle adds a small marginal cost on top.

### Platform Base Resources (Fixed)

What it takes to run the cloud services platform:

| Component | CPU (cores) | Memory (GB) | Notes |
|-----------|-------------|-------------|-------|
| Kafka brokers | 32 | 128 | 3-node cluster |
| ClickHouse | 64 | 256 | 3 shards for HA |
| MongoDB | 16 | 128 | Replica set |
| Redis | 16 | 128 | Cluster mode |
| PostgreSQL | 32 | 128 | Primary + replicas |
| Gateway services | 8 | 64 | API gateway, auth |
| Monitoring/logging | 8 | 64 | Prometheus, Grafana, Loki |
| **Total Platform Base** | **176** | **896** | |

### Per-VIN Resources (Marginal)

Incremental resources needed for each additional connected vehicle:

| Activity Level | Messages/15min | CPU (millicores) | Memory (MB) |
|---------------|----------------|------------------|-------------|
| Low | < 100 | 50 | 80 |
| Medium | 100-1000 | 75 | 120 |
| High | > 1000 | 100 | 160 |

The report prints these memory figures as 82, 123, and 164 MB, which is consistent with converting 0.08, 0.12, and 0.16 GB at 1,024 MB per GB.

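The tiers above can be sketched as a small lookup. This is a minimal illustration rather than the service's actual code: the names `activity_level` and `fleet_resources` are hypothetical, a count of exactly 1000 is assumed to fall in Medium, and MB are converted to GB by dividing by 1000.

```python
PLATFORM_BASE_CORES = 176
PLATFORM_BASE_GB = 896

# (CPU millicores, memory MB) per activity level, taken from the table above.
PER_VIN = {"low": (50, 80), "medium": (75, 120), "high": (100, 160)}


def activity_level(messages_per_15min: int) -> str:
    """Map a 15-minute message count to an activity tier (boundaries assumed)."""
    if messages_per_15min < 100:
        return "low"
    if messages_per_15min <= 1000:
        return "medium"
    return "high"


def fleet_resources(message_counts: list[int]) -> tuple[float, float]:
    """Return (cores, GB) for the platform base plus every VIN's marginal share."""
    millicores = 0
    megabytes = 0
    for count in message_counts:
        cpu_mc, mem_mb = PER_VIN[activity_level(count)]
        millicores += cpu_mc
        megabytes += mem_mb
    return (PLATFORM_BASE_CORES + millicores / 1000,
            PLATFORM_BASE_GB + megabytes / 1000)
```

For 3,229 low-activity VINs, `fleet_resources([10] * 3229)` gives about 337.45 cores and 1,154.32 GB, in line with the Total Fleet line in the example report.
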
### Cost Rates

| Resource | Cloud (Azure) | On-Prem/Bare Metal |
|----------|---------------|-------------------|
| CPU/core-hour | $0.30 | $0.02 |
| Memory/GB-hour | $0.08 | $0.005 |
| Managed Services/15min | $10.00 | $2.50 |

#### Why On-Prem is ~90% Cheaper

- **Platform base**: Same resource footprint, but cloud CPU and memory rates are ~15x higher
- **Per-VIN compute**: Cloud VMs cost ~15x more than amortized bare metal
- **Managed services**: Event Hubs, Cosmos DB, etc. carry a significant markup over self-hosted equivalents (4x in the rates above)

### Savings Calculation

```
Platform Base Cost = (176 cores × cpu_rate + 896 GB × mem_rate) × hours
Per-VIN Cost = (0.05 cores × cpu_rate + 0.08 GB × mem_rate) × hours × activity_multiplier
Total Cost = Platform Base + (Per-VIN × VIN count) + Managed Services

Cloud Cost = Total with cloud rates
On-Prem Cost = Total with on-prem rates
Savings = Cloud Cost - On-Prem Cost
Savings % = (Savings / Cloud Cost) × 100
```

Here `cpu_rate` and `mem_rate` are the per-core-hour and per-GB-hour rates from the Cost Rates table. Given the per-VIN resource table, `activity_multiplier` works out to 1.0 (Low), 1.5 (Medium), or 2.0 (High).

Expected savings: **~90%** with on-prem/bare metal hosting.

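The formulas above can be turned into a short calculation. The following is a sketch under stated assumptions, not the service's implementation: `Rates`, `total_cost`, and `savings` are hypothetical names, the managed-services charge is assumed to apply once per 15-minute interval (four times per hour), and a single activity multiplier is applied to the whole fleet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    cpu_core_hour: float      # $ per core-hour
    mem_gb_hour: float        # $ per GB-hour
    managed_per_15min: float  # $ per 15-minute interval


# Values from the Cost Rates table.
CLOUD = Rates(0.30, 0.08, 10.00)
ON_PREM = Rates(0.02, 0.005, 2.50)

PLATFORM_CORES, PLATFORM_GB = 176, 896
VIN_CORES, VIN_GB = 0.05, 0.08  # low-activity marginal resources


def total_cost(rates: Rates, vins: int, hours: float,
               activity_multiplier: float = 1.0) -> float:
    """Platform base + per-VIN cost + managed services for the given hours."""
    platform = (PLATFORM_CORES * rates.cpu_core_hour
                + PLATFORM_GB * rates.mem_gb_hour) * hours
    per_vin = (VIN_CORES * rates.cpu_core_hour
               + VIN_GB * rates.mem_gb_hour) * hours * activity_multiplier
    managed = rates.managed_per_15min * 4 * hours  # assumed: 4 intervals/hour
    return platform + per_vin * vins + managed


def savings(vins: int, hours: float) -> tuple[float, float]:
    """Return (absolute savings in $, savings as a percentage of cloud cost)."""
    cloud = total_cost(CLOUD, vins, hours)
    on_prem = total_cost(ON_PREM, vins, hours)
    saved = cloud - on_prem
    return saved, saved / cloud * 100
```

Under these assumptions, 1,000 low-activity VINs for one hour come to about $185.88 on cloud rates versus $19.40 on-prem, a saving of roughly 89.6%.
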
### Projected Annual Costs (5,000 vehicles)

Based on ~$100k/month cloud spend:

| Metric | Cloud | On-Prem |
|--------|-------|---------|
| Monthly Cost | ~$100,000 | ~$9,500 |
| Annual Cost | ~$1,200,000 | ~$114,000 |
| Per Vehicle/Month | ~$20.00 | ~$1.90 |
| Annual Savings | ~$1,086,000 (90%) | - |

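The arithmetic behind this table is simple enough to check by hand. As a quick sketch (the helper name `project` is hypothetical), the annual and per-vehicle figures follow directly from the monthly totals:

```python
def project(monthly_cost: float, vehicles: int) -> dict[str, float]:
    """Derive annual and per-vehicle monthly cost from a monthly total."""
    return {
        "annual": monthly_cost * 12,
        "per_vehicle_month": monthly_cost / vehicles,
    }


cloud = project(100_000, 5000)
on_prem = project(9_500, 5000)

annual_savings = cloud["annual"] - on_prem["annual"]
savings_pct = annual_savings / cloud["annual"] * 100
```

This reproduces the $1,200,000 and $114,000 annual costs, the $20.00 and $1.90 per-vehicle figures, and annual savings of $1,086,000 (90.5%).
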
## API Endpoints

### GET /cost/vin/{vin}

Cost summary for a specific VIN.

### GET /cost/fleet

Fleet-wide cost summary with top cost VINs.

### GET /cost/summary?period=day|week|month

High-level cost summary for a time period.

### GET /cost/comparison

Cloud vs on-prem cost comparison with projected annual savings.

### GET /cost/report

Plain text report for terminal viewing.

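A minimal way to call these endpoints from Python, using only the standard library. This is a sketch: `endpoint_url` and `fetch_text` are hypothetical helpers, the base URL assumes a local port-forward to port 8077 as described under Accessing the Report, and response bodies are returned as raw text because their format is not documented here.

```python
import urllib.request
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8077"  # assumed: local port-forward to the service


def endpoint_url(path: str, base_url: str = BASE_URL, **params: str) -> str:
    """Build a full URL for an endpoint path, with optional query parameters."""
    url = base_url.rstrip("/") + path
    return f"{url}?{urlencode(params)}" if params else url


def vin_url(vin: str, base_url: str = BASE_URL) -> str:
    """URL for GET /cost/vin/{vin}, with the VIN percent-encoded."""
    return endpoint_url("/cost/vin/" + quote(vin, safe=""), base_url)


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """GET a URL and return the response body decoded as UTF-8 text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For example, `print(fetch_text(endpoint_url("/cost/report")))` prints the plain text report, and `endpoint_url("/cost/summary", period="week")` builds the weekly summary URL.
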
## Accessing the Report

The service is deployed internally on cec-prd-cluster-1 (no public ingress). To view the report:

```bash
# Quick one-liner
kubectl --context cec-prd-cluster-1 run curl-test --image=curlimages/curl --rm -it --restart=Never -- curl -s http://cost.default.svc.cluster.local:8077/cost/report

# Or port-forward and curl locally
kubectl --context cec-prd-cluster-1 port-forward svc/cost 8077:8077 &
curl http://localhost:8077/cost/report
```

## Example Report Output

```
╔══════════════════════════════════════════════════════════════════╗
║ COST SERVICE REPORT ║
╠══════════════════════════════════════════════════════════════════╣
║ Period: 2026-01-05 to 2026-02-05
╠══════════════════════════════════════════════════════════════════╣
║ FLEET OVERVIEW ║
║ ─────────────────────────────────────────────────────────────── ║
║ Active Vehicles: 3229
║ Cloud Cost: $9761.61
║ On-Prem Cost: $677.88
║ Savings: $9083.73 (93.1%)
╠══════════════════════════════════════════════════════════════════╣
║ RESOURCE USAGE MODEL ║
║ ─────────────────────────────────────────────────────────────── ║
║ Platform Base: 176 cores / 896 GB RAM (fixed)
║ Per-VIN Marginal: 50 millicores / 82 MB RAM
║ Total Fleet: 337.5 cores / 1154.3 GB RAM
╠══════════════════════════════════════════════════════════════════╣
║ COST FORMULA ║
║ ─────────────────────────────────────────────────────────────── ║
║ (Platform Base) + (Per-VIN × 3229 VINs) + Managed Services
╠══════════════════════════════════════════════════════════════════╣
║ COST RATES ║
║ ─────────────────────────────────────────────────────────────── ║
║ Cloud: CPU $0.30/core-hr Memory $0.080/GB-hr
║ On-Prem: CPU $0.02/core-hr Memory $0.005/GB-hr
║ Base Infra: Cloud $10.00/15min On-Prem $2.50/15min
╠══════════════════════════════════════════════════════════════════╣
║ ANNUAL PROJECTION (based on current usage) ║
║ ─────────────────────────────────────────────────────────────── ║
║ Cloud Annual: $117139.28
║ On-Prem Annual: $8134.50
║ Annual Savings: $109004.77
╚══════════════════════════════════════════════════════════════════╝

TOP COST VEHICLES:
VIN                 CPU (mc) RAM (MB)    Cloud $  On-Prem $ Savings %
─────────────────── ──────── ──────── ────────── ────────── ────────
VCF1UBU21PG008884        100      164      14.10       1.04     92.6%
VCF1EBU24PG007242        100      164      14.10       1.04     92.6%
VCF1ZBU29PG006267        100      164      14.10       1.04     92.6%
VCF1EBU26PG007307         75      123      13.41       0.99     92.6%
VCF1EBU22PG011967         50       82      12.77       0.93     92.8%
...
```

*Note: Report generated 2026-02-05. Costs accumulate over time as the collector runs every 15 minutes.*

## Configuration

| Env Var | Description | Default |
|---------|-------------|---------|
| CLICKHOUSE_HOST | Local ClickHouse instance for storing cost data | localhost |
| REMOTE_CLICKHOUSE_HOST | Dev cluster ClickHouse instance queried for VIN activity | - |
| COLLECTOR_INTERVAL_MINUTES | How often to collect metrics, in minutes | 15 |

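One way these variables might be read at startup is sketched below. The `Config` class and `load_config` function are hypothetical and not taken from the service's source; the sketch treats an unset `REMOTE_CLICKHOUSE_HOST` as `None`, matching the `-` default in the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Config:
    clickhouse_host: str
    remote_clickhouse_host: Optional[str]
    collector_interval_minutes: int


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Read settings from the environment, falling back to the documented defaults."""
    return Config(
        clickhouse_host=env.get("CLICKHOUSE_HOST", "localhost"),
        remote_clickhouse_host=env.get("REMOTE_CLICKHOUSE_HOST"),  # no default
        collector_interval_minutes=int(env.get("COLLECTOR_INTERVAL_MINUTES", "15")),
    )
```

Passing a plain dict instead of `os.environ` makes the function easy to exercise in tests, for example `load_config({"COLLECTOR_INTERVAL_MINUTES": "5"})`.
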
## Limitations

- Resource estimates are approximations, not actual measurements
- Cost rates are simplified and don't reflect all real-world factors
- On-prem costs exclude significant operational overhead
- Designed for business case illustration, not precise billing