Documentation Index
Fetch the complete documentation index at: https://maps.solvice.io/llms.txt
Use this file to discover all available pages before exploring further.
Solvice Maps: Infrastructure and Deployment Guide
Infrastructure Overview
Solvice Maps runs on Google Cloud Platform (GCP) using a modern cloud-native architecture designed for high availability, scalability, and operational excellence. The infrastructure supports both real-time routing services and batch processing workloads with automatic scaling and comprehensive monitoring.
Cloud Architecture
Project Structure:
- Primary Project: solver-285414
- Primary Region: europe-west1 (Belgium)
- Availability Zone: europe-west1-b
- Secondary Regions: Available for multi-region deployment
Key GCP Services Used:
- Compute: Google Compute Engine + Google Kubernetes Engine
- Storage: Cloud Storage for large results and OSRM map data
- Database: Cloud SQL (PostgreSQL) for request metadata
- Messaging: Cloud Pub/Sub for event-driven processing
- Networking: Global Load Balancer with Cloud CDN
- Monitoring: Cloud Monitoring + Cloud Logging
- Security: Cloud IAM + Secret Manager
Service Deployment Architecture
1. MapR Gateway Service (Primary API)
Deployment Platform: Google Cloud Run
- Runtime: Java 17 with Quarkus, compiled to a GraalVM native image
- Container: Distroless base image for security
- Scaling: 0-100 instances with request-based auto-scaling
- Cold Start: < 100ms with native compilation
Resource Configuration:
resources:
  limits:
    cpu: "2"
    memory: "4Gi"
  requests:
    cpu: "0.5"
    memory: "1Gi"
concurrency: 100
timeout: 300s
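Two of these numbers determine the service's capacity ceiling: Cloud Run's concurrency is the maximum number of requests a single instance serves at once, so the instance cap and concurrency together bound the in-flight load. A quick sketch of that arithmetic:

```typescript
// Upper bound on simultaneous in-flight requests for a Cloud Run service:
// max instances × per-instance concurrency.
function maxInFlightRequests(maxInstances: number, concurrency: number): number {
  return maxInstances * concurrency;
}

// With the configuration above (100 instances, concurrency 100):
console.log(maxInFlightRequests(100, 100)); // 10000
```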
Environment Variables:
# Database Connection
DATABASE_URL=postgresql://user:pass@host:5432/mapr_gateway
DB_MAX_POOL_SIZE=20
# External Service Endpoints
OSRM_SERVICE_URL=https://osrm-europe.solvice.io
TOMTOM_API_KEY=${TOMTOM_API_KEY}
GOOGLE_MAPS_API_KEY=${GOOGLE_MAPS_API_KEY}
# Pub/Sub Configuration
PUBSUB_PROJECT_ID=solver-285414
PUBSUB_TABLE_TOPIC=mapr-table-requests
PUBSUB_RESPONSE_TOPIC=mapr-table-responses
# Storage Configuration
STORAGE_BUCKET=mapr-gateway-results
STORAGE_SIGNED_URL_DURATION=3600
# Authentication
JWT_SECRET=${JWT_SECRET}
JWT_ISSUER=solvice-maps
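A startup guard along these lines can fail fast when a required setting is missing. This is a hypothetical sketch (the variable list mirrors the table above; it is not the actual gateway code):

```typescript
// Collect the required environment variables and report any that are
// missing or empty, so the service refuses to start misconfigured.
const REQUIRED_VARS = [
  "DATABASE_URL",
  "OSRM_SERVICE_URL",
  "PUBSUB_PROJECT_ID",
  "STORAGE_BUCKET",
  "JWT_SECRET",
];

function missingVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_VARS.filter((name) => !env[name]);
}

// Example: only two of the five required variables are set.
console.log(missingVars({
  DATABASE_URL: "postgresql://user:pass@host:5432/mapr_gateway",
  PUBSUB_PROJECT_ID: "solver-285414",
}));
// ["OSRM_SERVICE_URL", "STORAGE_BUCKET", "JWT_SECRET"]
```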
2. OSRM Service (Routing Engine)
Deployment Platform: Google Kubernetes Engine (GKE)
- Cluster: osrm-cluster (3 nodes, n1-highmem-2)
- Node Pool: Container-Optimized OS with SSD persistent disks
- Scaling: Horizontal Pod Autoscaler with custom metrics
Kubernetes Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-mapr-europe-car
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nodejs-mapr-europe-car
  template:
    metadata:
      labels:
        app: nodejs-mapr-europe-car
    spec:
      initContainers:
        - name: map-downloader
          image: gcr.io/solver-285414/map-downloader:latest
          volumeMounts:
            - name: osrm-maps
              mountPath: /maps
          env:
            - name: MAPS_BUCKET
              value: "osrm-maps-europe"
            - name: MAP_REGION
              value: "europe"
      containers:
        - name: osrm-service
          image: gcr.io/solver-285414/nodejs-mapr:latest
          ports:
            - containerPort: 3000
          env:
            - name: OSRM_MAPS
              value: |
                [{
                  "map": "europe",
                  "vehicle": "car",
                  "path": "/maps/europe-{{slice}}.osrm",
                  "slices": [0,1,2,3,4,5,6,7,8,9,10,11,12],
                  "mmap": true
                }]
            - name: PUBSUB_TABLE_SUBSCRIPTIONS
              value: |
                [{
                  "id": "europe-car-subscription",
                  "weight": 10,
                  "maxMessages": 2
                }]
          volumeMounts:
            - name: osrm-maps
              mountPath: /maps
              readOnly: true
          resources:
            requests:
              memory: "8Gi"
              cpu: "2"
            limits:
              memory: "12Gi"
              cpu: "4"
          livenessProbe:
            httpGet:
              path: /v1/health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /v1/health/ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
      volumes:
        - name: osrm-maps
          persistentVolumeClaim:
            claimName: osrm-maps-pvc
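The OSRM_MAPS value is a JSON array of map definitions, and the {{slice}} placeholder in each path is expanded once per entry in slices. A sketch of that expansion, as assumed behaviour for illustration rather than the actual nodejs-mapr source:

```typescript
// Shape of one OSRM_MAPS entry as shown in the deployment above.
interface MapConfig {
  map: string;
  vehicle: string;
  path: string;      // template containing "{{slice}}"
  slices: number[];
  mmap: boolean;
}

// Expand the "{{slice}}" placeholder into one concrete file path per slice.
function expandSlicePaths(config: MapConfig): string[] {
  return config.slices.map((s) => config.path.replace("{{slice}}", String(s)));
}

const europeCar: MapConfig = {
  map: "europe",
  vehicle: "car",
  path: "/maps/europe-{{slice}}.osrm",
  slices: [0, 1, 2],
  mmap: true,
};
console.log(expandSlicePaths(europeCar));
// ["/maps/europe-0.osrm", "/maps/europe-1.osrm", "/maps/europe-2.osrm"]
```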
Auto-Scaling Configuration:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: osrm-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nodejs-mapr-europe-car
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
    - type: External
      external:
        metric:
          name: pubsub.googleapis.com/subscription/num_undelivered_messages
          selector:
            matchLabels:
              resource.labels.subscription_id: "europe-car-subscription"
        target:
          type: AverageValue
          averageValue: "30"
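For the external metric, the HPA sizes the deployment so that each pod carries roughly the target average backlog: desired replicas ≈ ceil(undelivered messages / 30), clamped to the min/max bounds. This mirrors the standard Kubernetes HPA algorithm for an AverageValue target:

```typescript
// Replica count implied by an External/AverageValue HPA metric:
// ceil(total metric value / target average per pod), clamped to bounds.
function desiredReplicas(
  undeliveredMessages: number,
  targetAveragePerPod: number,
  minReplicas: number,
  maxReplicas: number
): number {
  const raw = Math.ceil(undeliveredMessages / targetAveragePerPod);
  return Math.min(maxReplicas, Math.max(minReplicas, raw));
}

console.log(desiredReplicas(150, 30, 2, 10)); // 5
console.log(desiredReplicas(10, 30, 2, 10));  // 2  (floored at minReplicas)
console.log(desiredReplicas(900, 30, 2, 10)); // 10 (capped at maxReplicas)
```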
3. Infrastructure as Code (Terraform)
Terraform Module Structure:
terraform/
├── backend.tf # Remote state configuration
├── provider.tf # GCP provider configuration
├── variables.tf # Input variables
├── container.tf # GCE + Container configuration
├── instance_template.tf # VM instance template
├── instance_group.tf # Managed instance group
├── autoscaler.tf # Auto-scaling configuration
├── loadbalancer.tf # Global load balancer
├── health_check.tf # Health check configuration
├── bucket.tf # Cloud Storage buckets
└── pubsub.tf # Pub/Sub topics and subscriptions
Instance Template:
resource "google_compute_instance_template" "osrm_template" {
  name_prefix  = "osrm-template-"
  description  = "Template for OSRM service instances"
  machine_type = var.machine_type # n1-highmem-2

  disk {
    source_image = "cos-cloud/cos-stable"
    disk_type    = "pd-ssd"
    disk_size_gb = 280
    auto_delete  = true
    boot         = true
  }

  disk {
    source_image = var.data_disk_image # Custom image with OSRM data
    disk_type    = "pd-ssd"
    disk_size_gb = 1400
    auto_delete  = false
    boot         = false
  }

  network_interface {
    network = "default"
    access_config {
      nat_ip = null # Ephemeral IP
    }
  }

  metadata = {
    "gce-container-declaration" = module.gce-container.metadata_value
    "google-logging-enabled"    = "true"
    "enable-guest-attributes"   = "TRUE"
  }

  service_account {
    email  = var.service_account_email
    scopes = ["https://www.googleapis.com/auth/cloud-platform"]
  }

  tags = ["http-server", "https-server"]

  lifecycle {
    create_before_destroy = true
  }
}
Global Load Balancer:
resource "google_compute_global_forwarding_rule" "default" {
  name       = "osrm-global-forwarding-rule"
  target     = google_compute_target_http_proxy.default.id
  port_range = "80"
  ip_address = google_compute_global_address.default.address
}

resource "google_compute_target_http_proxy" "default" {
  name    = "osrm-target-proxy"
  url_map = google_compute_url_map.default.id
}

resource "google_compute_url_map" "default" {
  name            = "osrm-url-map"
  default_service = google_compute_backend_service.default.id
}

resource "google_compute_backend_service" "default" {
  name                  = "osrm-backend-service"
  protocol              = "HTTP"
  timeout_sec           = 30
  enable_cdn            = true
  load_balancing_scheme = "EXTERNAL"

  backend {
    group           = google_compute_instance_group_manager.default.instance_group
    balancing_mode  = "UTILIZATION"
    max_utilization = 0.8
  }

  health_checks = [google_compute_health_check.default.id]
}
Pub/Sub Configuration:
# Dynamic topic creation from JSON configuration
locals {
  pubsub_config = jsondecode(var.pubsub_subscriptions_json)
}

resource "google_pubsub_topic" "table_topics" {
  for_each = { for sub in local.pubsub_config : sub.id => sub }

  name                       = "mapr-table-${each.value.id}"
  message_retention_duration = "604800s" # 7 days
}

resource "google_pubsub_topic" "dead_letter_topics" {
  for_each = { for sub in local.pubsub_config : sub.id => sub }

  name = "mapr-table-${each.value.id}-dead-letter"
}

resource "google_pubsub_subscription" "table_subscriptions" {
  for_each = { for sub in local.pubsub_config : sub.id => sub }

  name                       = "mapr-table-${each.value.id}-subscription"
  topic                      = google_pubsub_topic.table_topics[each.key].name
  ack_deadline_seconds       = 600
  message_retention_duration = "604800s"
  retain_acked_messages      = false

  retry_policy {
    minimum_backoff = "10s"
    maximum_backoff = "600s"
  }

  dead_letter_policy {
    dead_letter_topic     = google_pubsub_topic.dead_letter_topics[each.key].id
    max_delivery_attempts = 5
  }
}
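The retry policy above doubles the redelivery backoff between the configured bounds, and after five failed deliveries the message lands on the dead-letter topic. A sketch of that schedule (Pub/Sub also applies jitter, which is not modelled here):

```typescript
// Nth-attempt redelivery backoff for a Pub/Sub retry policy that doubles
// from minimum_backoff and is capped at maximum_backoff.
function backoffSeconds(attempt: number, minimum = 10, maximum = 600): number {
  return Math.min(maximum, minimum * 2 ** (attempt - 1));
}

// With max_delivery_attempts = 5, these are the waits before dead-lettering:
console.log([1, 2, 3, 4, 5].map((a) => backoffSeconds(a))); // [10, 20, 40, 80, 160]
```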
Deployment Processes
1. CI/CD Pipeline (GitLab CI)
Pipeline Structure:
stages:
  - validate
  - test
  - build
  - deploy-staging
  - integration-test
  - deploy-production

variables:
  DOCKER_DRIVER: overlay2
  DOCKER_TLS_CERTDIR: "/certs"

# Terraform Validation
validate:
  stage: validate
  image: hashicorp/terraform:1.9
  script:
    - cd terraform
    - terraform init -backend=false
    - terraform validate
    - terraform fmt -check

# Application Testing
test:
  stage: test
  image: node:22
  script:
    - cd osrm-service
    - npm ci
    - npm run test:unit
    - npm run test:integration

# Container Build
build:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  script:
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

# Staging Deployment
deploy-staging:
  stage: deploy-staging
  image: google/cloud-sdk:alpine
  script:
    - gcloud auth activate-service-account --key-file $GOOGLE_APPLICATION_CREDENTIALS
    - gcloud config set project $GCP_PROJECT_ID
    - cd terraform
    - terraform init
    - terraform workspace select staging
    - terraform plan -var="image_tag=$CI_COMMIT_SHA"
    - terraform apply -auto-approve -var="image_tag=$CI_COMMIT_SHA"
  environment:
    name: staging
    url: https://staging-api.solvice.io

# Production Deployment (Manual)
deploy-production:
  stage: deploy-production
  image: google/cloud-sdk:alpine
  script:
    - gcloud auth activate-service-account --key-file $GOOGLE_APPLICATION_CREDENTIALS
    - gcloud config set project $GCP_PROJECT_ID
    - cd terraform
    - terraform init
    - terraform workspace select production
    - terraform plan -var="image_tag=$CI_COMMIT_SHA"
    - terraform apply -auto-approve -var="image_tag=$CI_COMMIT_SHA"
  environment:
    name: production
    url: https://routing.solvice.io
  when: manual
  only:
    - main
2. Zero-Downtime Deployment Strategy
Rolling Update Process:
- Health Check: Ensure all current instances are healthy
- New Instance Launch: Launch new instances with updated configuration
- Health Validation: Wait for new instances to pass health checks
- Traffic Migration: Gradually shift traffic to new instances
- Old Instance Termination: Terminate old instances after validation
- Rollback Plan: Automated rollback if health checks fail
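The traffic-migration step shifts load in fixed increments, validating health between steps. The increment size here is illustrative, not a documented Solvice policy:

```typescript
// Percent-of-traffic checkpoints for a gradual migration, ending at 100%.
// Health is validated after each step before proceeding to the next.
function trafficSteps(increment: number): number[] {
  const steps: number[] = [];
  for (let pct = increment; pct < 100; pct += increment) steps.push(pct);
  steps.push(100);
  return steps;
}

console.log(trafficSteps(25)); // [25, 50, 75, 100]
```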
Blue-Green Deployment for Critical Updates:
#!/bin/bash
# Blue-Green deployment script
set -euo pipefail

# Deploy to blue environment
terraform workspace select blue
terraform apply -auto-approve -var="image_tag=$NEW_VERSION"

# Run health checks before taking traffic
./scripts/health-check.sh blue

# Switch traffic to blue
gcloud compute url-maps set-default-service $URL_MAP \
  --default-service=$BLUE_BACKEND_SERVICE

# Monitor for 10 minutes, then verify health again
sleep 600
if ./scripts/health-check.sh blue; then
  # If successful, cleanup green environment
  terraform workspace select green
  terraform destroy -auto-approve
  echo "Deployment successful"
else
  # Rollback to green
  gcloud compute url-maps set-default-service $URL_MAP \
    --default-service=$GREEN_BACKEND_SERVICE
  echo "Deployment failed, rolled back"
  exit 1
fi
3. Map Data Deployment
OSRM Map Update Process:
#!/bin/bash
# Map data update script
set -euo pipefail

# Build new map data
./build-osrm-maps.sh $REGION $VERSION

# Create disk image
gcloud compute images create osrm-$REGION-$VERSION \
  --source-disk=osrm-build-disk \
  --source-disk-zone=europe-west1-b

# Update Terraform variable
export TF_VAR_data_disk_image="osrm-$REGION-$VERSION"

# Deploy with rolling update
terraform plan -var="data_disk_image=$TF_VAR_data_disk_image"
terraform apply -auto-approve
Monitoring and Alerting
1. Infrastructure Monitoring
Cloud Monitoring Metrics:
# Custom metric for OSRM request latency
- name: "osrm/request_duration_seconds"
  description: "OSRM request processing time"
  type: "histogram"
  labels: ["method", "status", "region"]

# Custom metric for queue depth
- name: "pubsub/queue_depth"
  description: "Number of undelivered messages"
  type: "gauge"
  labels: ["subscription", "topic"]

# Infrastructure metrics
- name: "compute/cpu_utilization"
- name: "compute/memory_utilization"
- name: "compute/disk_utilization"
Alerting Policies:
alertPolicy:
  displayName: "High Response Time"
  conditions:
    - displayName: "Response time > 100ms"
      conditionThreshold:
        threshold: 0.1
        comparison: COMPARISON_GT
        metric: "osrm/request_duration_seconds"
        aggregations:
          - alignmentPeriod: "300s"
            perSeriesAligner: ALIGN_RATE
  notificationChannels:
    - "projects/solver-285414/notificationChannels/slack-alerts"
    - "projects/solver-285414/notificationChannels/pager-duty"
2. Application-Level Monitoring
Health Check Endpoints:
// Comprehensive health checks: the overall status is derived from the
// individual dependency checks rather than hardcoded.
@Get('/health')
async getHealth(): Promise<HealthStatus> {
  const services = {
    database: await this.checkDatabase(),
    osrm: await this.checkOSRM(),
    pubsub: await this.checkPubSub(),
    storage: await this.checkStorage()
  };
  // Assumes each check result exposes a boolean `healthy` flag.
  const allHealthy = Object.values(services).every((s) => s.healthy);
  return {
    status: allHealthy ? 'healthy' : 'degraded',
    timestamp: new Date().toISOString(),
    services,
    metrics: {
      activeConnections: this.getActiveConnections(),
      queueDepth: await this.getQueueDepth(),
      memoryUsage: process.memoryUsage()
    }
  };
}
Performance Metrics:
// Custom metrics collection
@Histogram('request_duration_seconds', ['method', 'status'])
private requestDuration: Histogram;
@Counter('requests_total', ['method', 'status'])
private requestsTotal: Counter;
@Gauge('active_requests', [])
private activeRequests: Gauge;
Security Configuration
1. Network Security
VPC Configuration:
resource "google_compute_network" "solvice_vpc" {
  name                    = "solvice-maps-vpc"
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "private_subnet" {
  name                     = "private-subnet"
  ip_cidr_range            = "10.0.1.0/24"
  region                   = "europe-west1"
  network                  = google_compute_network.solvice_vpc.id
  private_ip_google_access = true
}

resource "google_compute_firewall" "allow_internal" {
  name    = "allow-internal"
  network = google_compute_network.solvice_vpc.name

  allow {
    protocol = "tcp"
    ports    = ["80", "443", "3000"]
  }

  source_ranges = ["10.0.0.0/8"]
}
SSL/TLS Configuration:
resource "google_compute_managed_ssl_certificate" "default" {
  name = "solvice-maps-ssl-cert"

  managed {
    domains = [
      "routing.solvice.io",
      "api.solvice.io"
    ]
  }
}

resource "google_compute_target_https_proxy" "default" {
  name             = "solvice-https-proxy"
  url_map          = google_compute_url_map.default.id
  ssl_certificates = [google_compute_managed_ssl_certificate.default.id]
}
2. IAM and Access Control
Service Account Configuration:
resource "google_service_account" "osrm_service_account" {
  account_id   = "osrm-service"
  display_name = "OSRM Service Account"
  description  = "Service account for OSRM compute instances"
}

resource "google_project_iam_member" "osrm_storage_access" {
  project = var.project_id
  role    = "roles/storage.objectViewer"
  member  = "serviceAccount:${google_service_account.osrm_service_account.email}"
}

resource "google_project_iam_member" "osrm_pubsub_access" {
  project = var.project_id
  role    = "roles/pubsub.subscriber"
  member  = "serviceAccount:${google_service_account.osrm_service_account.email}"
}
Secret Management:
resource "google_secret_manager_secret" "api_keys" {
  secret_id = "external-api-keys"

  replication {
    user_managed {
      replicas {
        location = "europe-west1"
      }
    }
  }
}

resource "google_secret_manager_secret_version" "api_keys_version" {
  secret = google_secret_manager_secret.api_keys.id
  secret_data = jsonencode({
    tomtom_api_key      = var.tomtom_api_key
    google_maps_api_key = var.google_maps_api_key
  })
}
Disaster Recovery and Backup
1. Data Backup Strategy
Database Backups:
# Automated PostgreSQL backups
gcloud sql backups create \
  --instance=mapr-gateway-db \
  --description="Daily automated backup $(date +%Y-%m-%d)"

# Point-in-time recovery enabled
gcloud sql instances patch mapr-gateway-db \
  --backup-start-time=02:00 \
  --enable-bin-log
Configuration Backups:
# Terraform state backup
gsutil cp gs://terraform-state-bucket/terraform.tfstate \
  gs://disaster-recovery-bucket/terraform-$(date +%Y%m%d).tfstate

# Container images backup
gcloud container images list-tags gcr.io/solver-285414/nodejs-mapr \
  --limit=10 --format='get(digest)' | \
  xargs -I {} gcloud container images add-tag \
    gcr.io/solver-285414/nodejs-mapr@{} \
    gcr.io/backup-project/nodejs-mapr:backup-$(date +%Y%m%d)
2. Multi-Region Deployment
Regional Failover Configuration:
# Primary region: europe-west1
# Secondary region: us-central1
resource "google_compute_instance_group_manager" "osrm_primary" {
  name = "osrm-primary"
  zone = "europe-west1-b"
  # ... primary configuration
}

resource "google_compute_instance_group_manager" "osrm_secondary" {
  name = "osrm-secondary"
  zone = "us-central1-b"
  # ... secondary configuration (standby)
}

resource "google_compute_health_check" "regional_failover" {
  name = "regional-failover-check"

  http_health_check {
    port         = 80
    request_path = "/health"
  }

  check_interval_sec  = 10
  timeout_sec         = 5
  healthy_threshold   = 2
  unhealthy_threshold = 3
}
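The healthy/unhealthy thresholds add hysteresis: an instance changes state only after that many consecutive probes agree, so a single dropped probe does not trigger failover. A sketch of the assumed semantics of the GCE parameters above:

```typescript
// Fold a sequence of probe results (true = success) into a final state.
// The state flips only after `healthyThreshold` consecutive successes or
// `unhealthyThreshold` consecutive failures.
function finalState(
  probes: boolean[],
  healthyThreshold: number,   // 2 in the config above
  unhealthyThreshold: number  // 3 in the config above
): "HEALTHY" | "UNHEALTHY" {
  let state: "HEALTHY" | "UNHEALTHY" = "HEALTHY";
  let streak = 0;
  let last: boolean | null = null;
  for (const ok of probes) {
    streak = ok === last ? streak + 1 : 1;
    last = ok;
    if (ok && streak >= healthyThreshold) state = "HEALTHY";
    if (!ok && streak >= unhealthyThreshold) state = "UNHEALTHY";
  }
  return state;
}

console.log(finalState([false, false], 2, 3));        // "HEALTHY" (only 2 failures)
console.log(finalState([false, false, false], 2, 3)); // "UNHEALTHY"
```

With a 10-second check interval, failover therefore triggers roughly 30 seconds after an instance goes dark.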
Cost Optimization
1. Resource Optimization
Preemptible Instances:
resource "google_compute_instance_template" "preemptible_template" {
  name = "osrm-preemptible-template"

  scheduling {
    preemptible         = true
    automatic_restart   = false
    on_host_maintenance = "TERMINATE"
  }

  # Use preemptible instances for batch processing
  machine_type = "n1-highmem-2"
}
Auto-Scaling Configuration:
resource "google_compute_autoscaler" "osrm_autoscaler" {
  name   = "osrm-autoscaler"
  target = google_compute_instance_group_manager.default.id

  autoscaling_policy {
    max_replicas    = 10
    min_replicas    = 1 # Baseline minimum; schedules can only raise this floor
    cooldown_period = 300

    cpu_utilization {
      target = 0.7
    }

    scaling_schedules {
      name                  = "scale-down-nights"
      description           = "No extra capacity required during off-hours"
      schedule              = "0 22 * * *" # 10 PM
      time_zone             = "Europe/Brussels"
      min_required_replicas = 0
      duration_sec          = 28800 # 8 hours
    }
  }
}
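One caveat worth making explicit: GCE scaling schedules impose an additional minimum during their window, so they can only raise the effective floor, never push it below min_replicas. The interaction, sketched under that assumed semantics:

```typescript
// Effective autoscaler floor = the largest of min_replicas and every
// active scaling schedule's min_required_replicas.
function effectiveMinReplicas(policyMin: number, activeScheduleMins: number[]): number {
  return Math.max(policyMin, ...activeScheduleMins, 0);
}

console.log(effectiveMinReplicas(1, [0])); // 1 — true scale-to-zero would need min_replicas = 0
console.log(effectiveMinReplicas(1, [3])); // 3 — a schedule can raise the floor for peak hours
```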
This infrastructure provides a robust, scalable, and cost-effective foundation for the Solvice Maps platform, with comprehensive monitoring, security, and disaster recovery capabilities.