Skip to content

Zensical Sandbox Platform

A secure, production-grade sandboxed code execution platform for portfolio demonstrations. This project showcases DevOps, SRE, and Platform Engineering skills through real-world infrastructure automation, containerization, and security practices.

🎯 Project Overview

This platform enables portfolio visitors to execute code examples in isolated, secure environments with: - Docker-based containerization for strong isolation - Resource limits (CPU, memory, execution time) - Network isolation (no external access from sandboxes) - Rate limiting to prevent abuse - Infrastructure as Code with Terraform - Production-ready architecture

πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                       Internet                              β”‚
β”‚         *.christosm.dev (Cloudflare Proxied)                β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                       β”‚ HTTPS (port 443)
                       β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    Contabo VPS                              β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚  β”‚  Traefik (Reverse Proxy + Let's Encrypt TLS)          β”‚  β”‚
β”‚  β”‚  - Cloudflare DNS Challenge for certificates          β”‚  β”‚
β”‚  β”‚  - HTTP β†’ HTTPS redirect                              β”‚  β”‚
β”‚  β”‚  - Basic auth on admin services                       β”‚  β”‚
β”‚  β””β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β”‚     β”‚          β”‚          β”‚          β”‚          β”‚            β”‚
β”‚     β–Ό          β–Ό          β–Ό          β–Ό          β–Ό            β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”‚
β”‚  β”‚Sandboxβ”‚ β”‚Grafana β”‚ β”‚Prometh.β”‚ β”‚Portain.β”‚ β”‚Traefik  β”‚    β”‚
β”‚  β”‚  API  β”‚ β”‚ :3000  β”‚ β”‚ :9090  β”‚ β”‚ :9000  β”‚ β”‚Dashboardβ”‚    β”‚
β”‚  β””β”€β”€β”¬β”€β”€β”€β”˜ β””β”€β”€β”€β”¬β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β”‚
β”‚     β”‚         β”‚                                             β”‚
β”‚     β”‚         β”‚ queries                                     β”‚
β”‚     β”‚         β–Ό                                             β”‚
β”‚     β”‚    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                       β”‚
β”‚     β”‚    β”‚  Loki   │◄───│ Promtail β”‚ (container + host     β”‚
β”‚     β”‚    β”‚ (logs)  β”‚    β”‚ (agent)  β”‚  log collection)       β”‚
β”‚     β”‚    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                       β”‚
β”‚     β”‚                                                       β”‚
β”‚     β”‚  Creates sandboxes                                    β”‚
β”‚     β–Ό                                                       β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                 β”‚
β”‚  β”‚  Ephemeral Containers                 β”‚                 β”‚
β”‚  β”‚  - Python 3.11  - Node.js 18         β”‚                 β”‚
β”‚  β”‚  - Bash 5.2                          β”‚                 β”‚
β”‚  β”‚  Security: No network, read-only FS, β”‚                 β”‚
β”‚  β”‚  resource limits, dropped caps        β”‚                 β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

🌐 Live Services

Service URL Authentication
Sandbox API https://api.christosm.dev None (public)
Grafana https://grafana.christosm.dev Grafana login
Prometheus https://prometheus.christosm.dev Basic auth
Traefik Dashboard https://traefik.christosm.dev Basic auth
Portainer https://portainer.christosm.dev Portainer login

πŸ” Security Features

Container Isolation

  • Network Isolation: network_mode: none - containers have no network access
  • Read-only Root: Filesystem is read-only except for /tmp (50MB limit)
  • Capability Dropping: All Linux capabilities dropped with cap_drop: ALL
  • No New Privileges: Prevents privilege escalation
  • Process Limits: Maximum 50 processes per container

Resource Constraints

  • CPU: Limited to 50% of one core
  • Memory: Hard limit of 256MB (no swap)
  • Execution Time: 30-second timeout (configurable)
  • Output Size: 10KB maximum output

Application Security

  • Rate Limiting: 10 requests per 60 seconds per IP address
  • Input Validation: Code length limits, dangerous pattern detection
  • Automatic Cleanup: Containers removed after execution
  • Logging: All executions logged with metadata

πŸš€ Quick Start

Prerequisites

  • Docker and Docker Compose installed
  • Python 3.8+ (for test client)
  • Terraform 1.5+ (for infrastructure provisioning)
  • SSH access to your VPS (for deployment)

Local Development

  1. Clone the repository:

    git clone <your-repo-url>
    cd zensical-sandbox-platform
    

  2. Start the services:

    docker-compose up --build
    

  3. Verify the API is running:

    curl http://localhost:8000/health
    

  4. Run test examples:

    python3 test_client.py
    

Interactive Testing

python3 test_client.py --interactive

Then try commands like:

>>> python print("Hello, World!")
>>> node console.log("Hello from Node!")
>>> bash echo "Hello from Bash!"
>>> health
>>> stats

πŸ“¦ Deployment to VPS

Step 1: Configure Terraform Variables

cd terraform
cp terraform.tfvars.example terraform.tfvars

Edit terraform.tfvars with your VPS details:

vps_host             = "YOUR_VPS_IP"
vps_user             = "root"
ssh_private_key_path = "~/.ssh/id_rsa"

Step 2: Initialize Terraform

terraform init

Step 3: Review the Plan

terraform plan

This will show you: - Docker installation steps - Firewall configuration - Application deployment - Systemd service setup

Step 4: Deploy

terraform apply

Type yes when prompted. Terraform will: 1. Install Docker and dependencies 2. Configure UFW firewall 3. Copy application files 4. Pull Docker images 5. Deploy the application 6. Configure systemd for auto-restart

Step 5: Verify Deployment

# Check health endpoint
curl http://YOUR_VPS_IP:8000/health

# Or use the test client
python3 test_client.py

πŸ”§ API Reference

Base URL

Local: http://localhost:8000 Production: https://api.christosm.dev

Endpoints

GET /health

Health check endpoint

Response:

{
  "status": "healthy",
  "docker_available": true,
  "timestamp": "2024-02-04T10:30:00Z",
  "active_containers": 0
}

GET /environments

List supported execution environments

Response:

{
  "environments": ["python", "node", "bash"],
  "details": {
    "python": {
      "image": "python:3.11-slim",
      "file_extension": "py"
    },
    "node": {
      "image": "node:18-alpine",
      "file_extension": "js"
    },
    "bash": {
      "image": "bash:5.2-alpine3.19",
      "file_extension": "sh"
    }
  }
}

POST /execute

Execute code in sandbox environment

Request:

{
  "code": "print('Hello, World!')",
  "environment": "python",
  "timeout": 30
}

Response (Success):

{
  "execution_id": "uuid-here",
  "status": "success",
  "output": "Hello, World!\n",
  "error": null,
  "execution_time": 0.523,
  "environment": "python",
  "timestamp": "2024-02-04T10:30:00Z"
}

Response (Error):

{
  "execution_id": "uuid-here",
  "status": "error",
  "output": null,
  "error": "Error message here",
  "execution_time": 0.123,
  "environment": "python",
  "timestamp": "2024-02-04T10:30:00Z"
}

Response (Timeout):

{
  "execution_id": "uuid-here",
  "status": "timeout",
  "output": "Partial output...",
  "error": null,
  "execution_time": 30.0,
  "environment": "python",
  "timestamp": "2024-02-04T10:30:00Z"
}

GET /stats

Platform statistics

Response:

{
  "active_executions": 2,
  "total_containers": 5,
  "supported_environments": ["python", "node", "bash"],
  "rate_limit": {
    "window_seconds": 60,
    "max_requests": 10
  },
  "resource_limits": {
    "memory": "256m",
    "cpu_percent": 50.0,
    "timeout_seconds": 30
  },
  "timestamp": "2024-02-04T10:30:00Z"
}

🎨 Frontend Integration

JavaScript/React Example

const API_BASE_URL = 'https://api.christosm.dev';

async function executeSandbox(code, environment) {
  try {
    const response = await fetch(`${API_BASE_URL}/execute`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        code: code,
        environment: environment,
        timeout: 30
      })
    });

    if (!response.ok) {
      throw new Error(`API error: ${response.status}`);
    }

    const result = await response.json();
    return result;
  } catch (error) {
    console.error('Sandbox execution failed:', error);
    throw error;
  }
}

// Usage
executeSandbox('print("Hello!")', 'python')
  .then(result => {
    if (result.status === 'success') {
      console.log('Output:', result.output);
    } else {
      console.error('Error:', result.error);
    }
  });

HTML/Vanilla JS Example

See frontend-integration/example.html for a complete working example.

πŸ“Š Monitoring and Observability

The platform runs a full observability stack accessible via Grafana:

Component Role Retention
Prometheus Metrics collection (Traefik, Sandbox API, Loki) 15 days
Loki Log aggregation (container + host logs) 15 days (360h)
Promtail Log collection agent (Docker SD + host syslog/auth) N/A
Grafana Visualization and dashboards Persistent

Grafana Queries

# Container logs (all services)
{job="docker-sd"}

# Filter by service name
{service="sandbox-api"}

# Host syslog
{job="syslog"}

# SSH and authentication logs
{job="authlog"}

Prometheus Scrape Targets

  • prometheus:9090 - Self-monitoring
  • traefik:8080 - Reverse proxy metrics
  • sandbox-api:8000 - Application metrics
  • loki:3100 - Log aggregation metrics

Maintenance Commands

# View service logs
docker compose logs -f sandbox-api

# Check all service status
docker compose ps

# Restart a specific service
docker compose restart grafana

# Clean up stopped containers and unused images
docker container prune -f
docker image prune -a -f

# Check resource usage
docker stats

πŸ› οΈ Configuration

Adjusting Resource Limits

Edit docker-compose.yml:

services:
  sandbox-api:
    environment:
      - MEMORY_LIMIT=512m        # Increase memory
      - CPU_QUOTA=100000         # Increase CPU (100%)
      - EXECUTION_TIMEOUT=60     # Increase timeout

Adding New Environments

Edit backend/main.py:

SUPPORTED_ENVIRONMENTS = {
    "python": {...},
    "node": {...},
    "bash": {...},
    "ruby": {  # New environment
        "image": "ruby:3.2-alpine",
        "command_template": "ruby -e '{code}'",
        "file_extension": "rb"
    }
}

SSL/TLS Configuration

TLS is handled by Traefik with Let's Encrypt certificates via Cloudflare DNS challenge:

  • Certificate Resolver: Let's Encrypt (ACME) with Cloudflare DNS-01 challenge
  • Cloudflare Proxy: All subdomains are proxied through Cloudflare (orange cloud)
  • HTTP Redirect: All HTTP traffic is automatically redirected to HTTPS
  • Admin Auth: Traefik dashboard and Prometheus are protected with basic auth middleware

Required environment variables (see .env.example):

CF_DNS_API_TOKEN=<cloudflare-api-token-with-dns-edit>
TRAEFIK_BASIC_AUTH=<htpasswd-generated-user:hash>

πŸŽ“ Learning Outcomes

This project demonstrates:

DevOps Skills

  • βœ… Container orchestration with Docker Compose
  • βœ… Infrastructure as Code with Terraform
  • βœ… CI/CD pipeline concepts (automated deployment)
  • βœ… Security hardening and best practices

SRE Skills

  • βœ… Service reliability patterns (health checks, timeouts)
  • βœ… Resource management and limits
  • βœ… Monitoring and observability (logging, metrics)
  • βœ… Incident prevention (rate limiting, input validation)

Platform Engineering Skills

  • βœ… Building developer-facing APIs
  • βœ… Multi-tenancy and isolation
  • βœ… Self-service infrastructure
  • βœ… Production-grade architecture

Cloud Engineering Skills

  • βœ… VPS/cloud server management
  • βœ… Network security configuration
  • βœ… Automated provisioning
  • βœ… Scalable system design

πŸ”œ Future Enhancements (Phase 2+)

Kubernetes Migration

  • Namespace-based isolation
  • ResourceQuotas and LimitRanges
  • NetworkPolicies for enhanced security
  • Horizontal Pod Autoscaling

Advanced Security

  • gVisor runtime for enhanced isolation
  • Seccomp and AppArmor profiles
  • Admission webhooks for policy enforcement
  • Secret management with Vault

Monitoring & Observability

  • Prometheus metrics βœ… Implemented
  • Grafana dashboards βœ… Implemented
  • Log aggregation βœ… Implemented (Loki + Promtail)
  • Distributed tracing with Jaeger
  • Alertmanager for alert routing

Developer Experience

  • WebSocket for real-time output
  • Multi-file execution support
  • Package installation support
  • Collaborative features

πŸ“ Interview Talking Points

For Aerospace/Defense Roles

  • "Built a secure, sandboxed execution environment with defense-in-depth security"
  • "Implemented network isolation, capability dropping, and resource constraints"
  • "Demonstrated security-conscious development practices suitable for cleared environments"

For Platform Engineering Roles

  • "Created a multi-tenant developer platform with API-first design"
  • "Implemented self-service infrastructure for code execution"
  • "Built production-grade isolation with Docker and resource management"

For DevOps/SRE Roles

  • "Automated infrastructure provisioning with Terraform"
  • "Implemented comprehensive monitoring, logging, and health checks"
  • "Designed for reliability with timeouts, rate limiting, and graceful degradation"

For Cloud Engineering Roles

  • "Deployed containerized applications to cloud infrastructure"
  • "Implemented scalable architecture suitable for cloud-native deployment"
  • "Used IaC for reproducible, version-controlled infrastructure"

🀝 Contributing

This is a portfolio project, but feedback and suggestions are welcome! Open an issue or submit a pull request.

πŸ“„ License

MIT License - See LICENSE file for details

πŸ“ž Contact


Built by Christos | Demonstrating DevOps, SRE, and Platform Engineering Skills