
OpenClaw on Azure — Zero-Trust AI Agent Deployment

Project Overview

A production-grade Zero-Trust infrastructure deployment for running OpenClaw, an autonomous AI coding assistant, on Microsoft Azure. The project provisions a secure, cost-optimized, and fully automated cloud environment where the AI agent runs 24/7 with no web ports exposed to the internet; the only public inbound rule is an emergency SSH path restricted to a single IP.

The entire infrastructure is defined in Terraform and uses cloud-init for automated VM bootstrapping. Secrets are managed entirely through Azure Key Vault with Managed Identity authentication — no credentials are ever stored in code, environment variables, or on disk.

Architecture

The infrastructure follows a Zero Trust network model:

  1. No public inbound access — traditional web ports (80, 443) are completely closed.
  2. Tailscale mesh VPN provides authenticated, encrypted access to the AI agent’s web UI and SSH.
  3. Emergency SSH (port 22) is restricted to a single allowed IP for debugging if Tailscale fails.
  4. Managed Identity grants the VM read-only access to Key Vault — no static credentials anywhere.

Deployment Flow

After terraform apply, the following orchestration occurs automatically:

  1. Terraform provisions the Resource Group, VNet, NSG, Key Vault, Managed Identity, and VM.
  2. Cloud-init runs on first boot and installs Docker, Chrome, Tailscale, and the Azure CLI.
  3. The VM authenticates via Managed Identity (az login --identity) and fetches secrets from Key Vault.
  4. Tailscale connects the VM to the private mesh network, making the AI agent accessible securely.
  5. OpenClaw is installed and configured with the retrieved AI provider API keys.
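
Step 3 above uses the Azure CLI (az login --identity). As a rough illustration of what that exchange looks like at the HTTP level, the sketch below builds (but does not send) the two requests involved: a token request to the Azure Instance Metadata Service, and a secret read against the Key Vault REST API. This is not the project's bootstrap script; the function names, the vault name, and the secret name are hypothetical, chosen only for illustration.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Link-local metadata endpoint, reachable only from inside an Azure VM.
IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_token_request(client_id: Optional[str] = None) -> Request:
    """Build the IMDS request for a Key Vault access token (not sent here)."""
    params = {
        "api-version": "2018-02-01",
        "resource": "https://vault.azure.net",
    }
    if client_id:
        # A user-assigned identity is selected by its client ID.
        params["client_id"] = client_id
    url = f"{IMDS_TOKEN_URL}?{urlencode(params)}"
    # IMDS rejects requests that lack the "Metadata: true" header.
    return Request(url, headers={"Metadata": "true"})


def build_secret_request(vault_name: str, secret_name: str, token: str) -> Request:
    """Build the Key Vault REST request that reads one secret (not sent here)."""
    url = f"https://{vault_name}.vault.azure.net/secrets/{secret_name}?api-version=7.4"
    return Request(url, headers={"Authorization": f"Bearer {token}"})
```

On a real VM, each request would be sent with urllib.request.urlopen; the token response is JSON with an access_token field, and the secret response carries the secret in its value field. No credential appears in the code itself, which is the point of the Managed Identity design.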

Tech Stack

| Layer | Technology | Purpose |
| --- | --- | --- |
| Compute | Azure VM (Standard_B2s) | Burstable Ubuntu 24.04 LTS host for the AI agent |
| Networking | VNet + NSG | Private network with deny-all inbound rules |
| Ingress | Tailscale Mesh VPN | Zero-Trust authenticated access (no public ports) |
| Secrets | Azure Key Vault | Stores AI API keys and Tailscale auth keys securely |
| Identity | User-Assigned Managed Identity | Credential-free authentication to Azure services |
| Bootstrapping | Cloud-init | Automated VM setup: Docker, Chrome, Tailscale, Azure CLI |
| IaC | Terraform (azurerm ~> 3.90) | Full infrastructure automation with modular config |
| Cost Control | Azure Consumption Budgets | Alerting at 80% of ~$30/month budget |
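
The cost-control entry comes down to simple arithmetic: an alert at 80% of a roughly $30 monthly budget fires at about $24 of spend. A minimal sketch of that check follows. The function names are hypothetical, and the "greater than or equal" comparison is an assumption; the project's actual budget resource may use a different operator.

```python
def alert_threshold(budget_usd: float, percent: float) -> float:
    """Spend level (USD) at which a percentage-based budget alert triggers."""
    return budget_usd * percent / 100


def should_alert(spend_usd: float, budget_usd: float = 30.0, percent: float = 80.0) -> bool:
    """True once month-to-date spend reaches the alert threshold (assumed >=)."""
    return spend_usd >= alert_threshold(budget_usd, percent)


print(alert_threshold(30.0, 80.0))  # 24.0
```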

Security Highlights

  • Zero Open Web Ports: Normal ingress is exclusively via Tailscale. No web ports are exposed to the internet, and the only public inbound rule is the IP-restricted emergency SSH path.
  • Managed Identity: The VM authenticates to Azure services without any stored credentials. The identity has read-only permissions scoped to the specific Key Vault.
  • Secret Separation: Infrastructure (Terraform) is separated from configuration (Key Vault secrets). API keys can be rotated in Key Vault without redeploying the VM.
  • Network Isolation: The NSG blocks all inbound traffic except SSH from a single whitelisted IP. The AI agent’s web UI (port 18789) is accessible only through the Tailscale tunnel.
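
The Network Isolation rule set can be modelled in a few lines. NSG rules are evaluated in priority order (lowest number first) and the first match decides the outcome. The sketch below mimics that behaviour for inbound traffic. The priorities, the 203.0.113.7 admin address (a documentation-range IP), and the Rule type are illustrative assumptions, not values taken from the project's Terraform.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rule:
    priority: int          # lower number is evaluated first
    source: str            # source range in CIDR notation
    port: Optional[int]    # destination port; None matches any port
    allow: bool


# Hypothetical inbound rule set: emergency SSH from one IP, then deny everything.
RULES = (
    Rule(priority=100, source="203.0.113.7/32", port=22, allow=True),
    Rule(priority=4096, source="0.0.0.0/0", port=None, allow=False),
)


def is_allowed(src_ip: str, dst_port: int, rules: Sequence[Rule] = RULES) -> bool:
    """Return the decision of the first matching rule, in priority order."""
    for rule in sorted(rules, key=lambda r: r.priority):
        source_matches = ip_address(src_ip) in ip_network(rule.source)
        port_matches = rule.port in (None, dst_port)
        if source_matches and port_matches:
            return rule.allow
    return False  # nothing matched: deny by default


print(is_allowed("203.0.113.7", 22))      # True: emergency SSH from the allowed IP
print(is_allowed("198.51.100.9", 22))     # False: SSH from any other address
print(is_allowed("203.0.113.7", 18789))   # False: the web UI is never public
```

The model contains no inbound rule for port 18789. That matches the description above, where the UI is reached through the Tailscale tunnel rather than through the VM's public address.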

Challenges

  • Managed Identity Boot-Race: The VM’s Managed Identity can take a while to propagate after provisioning, so early token requests may fail. The cloud-init script handles this eventual consistency in Azure’s identity service with a retry loop (30 attempts × 5 s, roughly 2.5 minutes at most).
  • Secret Timing: The Key Vault is created empty by Terraform, but the cloud-init script expects secrets on first boot. Designed a self-healing flow where the VM retries or can be restarted after manual secret upload.
  • Compute Sizing: Balancing cost vs. capability for an AI agent that needs Docker, Chrome (for web browsing), and the agent runtime. Standard_B2s (2 vCPUs, 4GB RAM) provides burstable performance at ~$30/month.
  • VPN vs. Public Access: Evaluated Azure VPN Gateway, Application Gateway, and direct public IP. Chose Tailscale for zero infrastructure cost, simpler setup, and stronger zero-trust guarantees.
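
The retry behaviour described under Managed Identity Boot-Race and Secret Timing can be sketched as a bounded retry helper. The project implements this inside its cloud-init script; the Python version below only illustrates the pattern. The name fetch_with_retry and the injectable sleep parameter are hypothetical, and the defaults mirror the 30 attempts × 5 s quoted above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], T],
    attempts: int = 30,
    delay_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it succeeds or the attempt budget is exhausted."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as exc:  # identity not propagated yet, secret missing, ...
            last_error = exc
            if attempt < attempts:
                sleep(delay_s)  # wait before the next attempt, not after the last
    raise TimeoutError(f"gave up after {attempts} attempts") from last_error
```

Because this sketch sleeps only between attempts, the defaults wait at most 29 × 5 s = 145 s before giving up. Passing sleep as a parameter keeps the helper testable without real delays. After a final failure, the VM would rely on the manual path described above: upload the secrets, then restart.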

Key Learnings

  • Designing a Zero-Trust network architecture for an AI agent — shifting security from perimeter to identity
  • Implementing credential-free authentication using Azure Managed Identity and Key Vault
  • Using cloud-init for declarative, idempotent VM bootstrapping without shell scripts on disk
  • Evaluating compute trade-offs (VMs vs. containers vs. serverless) for stateful, long-running workloads
  • Configuring Tailscale as a lightweight, cost-free alternative to Azure VPN Gateway
  • Managing infrastructure-secret separation — Terraform defines the “shape,” Key Vault provides the “values”
  • Implementing budget alerts for cost governance on personal cloud projects
