Nebula Daemon

Central control plane server for managing the Nebula platform. It orchestrates GPU agents, handles model deployments, and provides an OpenAI-compatible API gateway.

Features

Core Daemon (cmd/nebulad/main.go)

  • CLI based on urfave/cli with flag and environment variable support
  • Structured logging (slog) with JSON and text format support
  • Graceful shutdown handling (see the sketch after this list)
  • gRPC interceptors for logging and panic recovery
  • gRPC reflection for debugging
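
Since graceful shutdown interacts with the gRPC server lifecycle, a minimal sketch of the wiring may be useful. This is illustrative only (interceptor and reflection registration are elided), not the daemon's actual main.go:

package main

import (
    "context"
    "log/slog"
    "net"
    "os/signal"
    "syscall"

    "google.golang.org/grpc"
)

func main() {
    // Cancel this context on SIGINT/SIGTERM so components can wind down.
    ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
    defer stop()

    lis, err := net.Listen("tcp", ":9090")
    if err != nil {
        slog.Error("listen failed", "error", err)
        return
    }

    srv := grpc.NewServer() // logging/recovery interceptors and reflection would be registered here

    go func() {
        <-ctx.Done() // signal received
        slog.Info("shutting down gRPC server")
        srv.GracefulStop() // drain in-flight RPCs, then Serve returns
    }()

    if err := srv.Serve(lis); err != nil {
        slog.Error("serve failed", "error", err)
    }
}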

gRPC Server (Port 9090)

  • Agent registration and deregistration
  • Heartbeat processing with offline detection
  • Command queuing for agents
  • Bidirectional communication with nebula-agent instances

HTTP Gateway (Port 8080)

  • OpenAI-compatible REST API
  • Chat completions with streaming support (SSE)
  • Text completions and embeddings
  • Model listing and health checks
  • CORS support for web clients

Registry Service

  • In-memory agent state tracking
  • Automatic offline detection (90s timeout)
  • Pending commands queue for agents
  • Caddy and Route53 integration for dynamic routing

Deployment Service

  • Full deployment lifecycle management
  • Automatic node scheduling
  • Runtime adapter support (vLLM, TGI, Ollama)
  • Container logs and events streaming

Storage

  • SQLite database for persistent state
  • Node and deployment tracking
  • Event history

Usage

Running the Daemon

# Basic usage with defaults
./nebulad

# With custom ports
./nebulad --grpc-port 9090 --http-port 8080

# With Caddy integration
./nebulad --caddy-admin-url http://localhost:2019 --gateway-domain gateway.example.com

# With Route53 DNS management
./nebulad \
  --route53-hosted-zone-id Z1234567890ABC \
  --route53-access-key-id AKIA... \
  --route53-secret-access-key ... \
  --route53-region us-east-1 \
  --gateway-ip 10.0.0.1

# With debug logging
./nebulad --log-level debug --log-format json

CLI Options

Option                        Description                                 Default
--data-dir                    Data directory for SQLite database          /var/lib/nebulad
--grpc-port                   gRPC server port for agent communication    9090
--http-port                   HTTP API server port                        8080
--api-prefix                  API prefix for HTTP endpoints               /v1
--caddy-admin-url             Caddy Admin API URL                         http://localhost:2019
--gateway-domain              Base domain for agent gateways              gateway.nebulactl.com
--heartbeat-timeout           Seconds before marking agent offline        90
--default-heartbeat-interval  Default heartbeat interval (seconds)        30
--log-level                   Log level: debug, info, warn, error         info
--log-format                  Log format: json, text                      text
--route53-hosted-zone-id      AWS Route 53 hosted zone ID                 -
--route53-access-key-id       AWS access key ID                           -
--route53-secret-access-key   AWS secret access key                       -
--route53-region              AWS region                                  us-east-1
--route53-ttl                 DNS record TTL in seconds                   300
--gateway-ip                  Gateway IP for DNS A records                -

Configuration

Environment Variables

All CLI options can be set via environment variables with the NEBULAD_ prefix:

Variable                            Description
NEBULAD_DATA_DIR                    Data directory path
NEBULAD_GRPC_PORT                   gRPC server port
NEBULAD_HTTP_PORT                   HTTP API port
NEBULAD_API_PREFIX                  API prefix
NEBULAD_CADDY_ADMIN_URL             Caddy Admin API URL
NEBULAD_GATEWAY_DOMAIN              Gateway base domain
NEBULAD_HEARTBEAT_TIMEOUT           Offline timeout
NEBULAD_DEFAULT_HEARTBEAT_INTERVAL  Heartbeat interval
NEBULAD_LOG_LEVEL                   Log level
NEBULAD_LOG_FORMAT                  Log format
NEBULAD_ROUTE53_HOSTED_ZONE_ID      Route 53 zone ID
NEBULAD_ROUTE53_ACCESS_KEY_ID       AWS access key
NEBULAD_ROUTE53_SECRET_ACCESS_KEY   AWS secret key
NEBULAD_ROUTE53_REGION              AWS region
NEBULAD_ROUTE53_TTL                 DNS TTL
NEBULAD_GATEWAY_IP                  Gateway IP
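
As a sketch of how a flag and its NEBULAD_ environment variable are typically paired with urfave/cli (assuming the v2 API; the actual declarations in cmd/nebulad/main.go may differ):

import "github.com/urfave/cli/v2"

// Illustrative flag declaration showing the env-var pairing.
var dataDirFlag = &cli.StringFlag{
    Name:    "data-dir",
    Usage:   "Data directory for SQLite database",
    Value:   "/var/lib/nebulad",
    EnvVars: []string{"NEBULAD_DATA_DIR"},
}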

Example: Production Configuration

export NEBULAD_DATA_DIR=/var/lib/nebulad
export NEBULAD_GRPC_PORT=9090
export NEBULAD_HTTP_PORT=8080
export NEBULAD_CADDY_ADMIN_URL=http://localhost:2019
export NEBULAD_GATEWAY_DOMAIN=gateway.mycompany.com
export NEBULAD_LOG_LEVEL=info
export NEBULAD_LOG_FORMAT=json

# Optional: Route53 DNS management
export NEBULAD_ROUTE53_HOSTED_ZONE_ID=Z1234567890ABC
export NEBULAD_ROUTE53_ACCESS_KEY_ID=AKIA...
export NEBULAD_ROUTE53_SECRET_ACCESS_KEY=...
export NEBULAD_ROUTE53_REGION=us-east-1
export NEBULAD_GATEWAY_IP=10.0.0.1

./nebulad

gRPC API Reference

The daemon exposes a PlatformService for agent communication on the gRPC port (default 9090).

Service Definition

service PlatformService {
  rpc Register(RegisterRequest) returns (RegisterResponse);
  rpc Heartbeat(HeartbeatRequest) returns (HeartbeatResponse);
  rpc Deregister(DeregisterRequest) returns (DeregisterResponse);
}
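
Because the daemon enables gRPC reflection, the service can also be discovered and inspected directly:

# List services and describe the platform service via reflection
grpcurl -plaintext localhost:9090 list
grpcurl -plaintext localhost:9090 describe proto.PlatformService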

Register RPC

Registers a new agent or re-registers an existing one.

Request:

message RegisterRequest {
  string node_id = 1;                       // UUID (auto-generated if empty)
  string node_name = 2;                     // Human-readable node name
  string version = 3;                       // Agent version
  string agent_address = 4;                 // host:port for callbacks
  NodeCapabilities capabilities = 5;
  repeated DeploymentInfo deployments = 6;
}

message NodeCapabilities {
  repeated GPUInfo gpus = 1;
  int64 total_memory_bytes = 2;
  int64 available_memory_bytes = 3;
  string os = 4;
  string arch = 5;
  int32 cpu_cores = 6;
  string cuda_version = 7;
}

Response:

message RegisterResponse {
  string node_id = 1;             // Assigned node ID
  int32 heartbeat_interval = 2;   // Heartbeat interval in seconds
  string gateway_subdomain = 3;   // Assigned subdomain (e.g., abc123.gateway.nebulactl.com)
}

Example with grpcurl:

grpcurl -plaintext -d '{
  "node_name": "gpu-node-1",
  "version": "0.1.0",
  "agent_address": "192.168.1.100:9091",
  "capabilities": {
    "gpus": [
      {"uuid": "GPU-abc123", "name": "NVIDIA A100", "memory_bytes": 42949672960}
    ],
    "total_memory_bytes": 137438953472,
    "os": "linux",
    "arch": "amd64",
    "cpu_cores": 32,
    "cuda_version": "12.1"
  }
}' localhost:9090 proto.PlatformService/Register

Heartbeat RPC

Receives periodic status updates from agents and returns any pending commands in the response.

Request:

message HeartbeatRequest {
  string node_id = 1;
  NodeStats stats = 2;
  HealthStatus health = 3;
  repeated DeploymentInfo deployments = 4;
  int64 timestamp = 5;
}

message NodeStats {
  float cpu_usage_percent = 1;
  int64 memory_used_bytes = 2;
  int64 memory_total_bytes = 3;
  repeated GPUStats gpu_stats = 4;
}

message GPUStats {
  string uuid = 1;
  float utilization_percent = 2;
  int64 memory_used_bytes = 3;
  int64 memory_total_bytes = 4;
  float temperature_celsius = 5;
  float power_watts = 6;
}

message HealthStatus {
  string status = 1;            // "healthy", "degraded", "unhealthy"
  repeated string issues = 2;
  int64 uptime_seconds = 3;
}

Response:

message HeartbeatResponse {
  bool acknowledged = 1;
  NodeConfig config = 2;
  repeated Command commands = 3;   // Pending commands for agent
}

message Command {
  string id = 1;
  string type = 2;                 // "start", "stop", "restart", "delete"
  string target_id = 3;            // Deployment ID
  map<string, string> params = 4;
}

Example with grpcurl:

grpcurl -plaintext -d '{
  "node_id": "550e8400-e29b-41d4-a716-446655440000",
  "stats": {
    "cpu_usage_percent": 45.5,
    "memory_used_bytes": 68719476736,
    "memory_total_bytes": 137438953472,
    "gpu_stats": [
      {
        "uuid": "GPU-abc123",
        "utilization_percent": 80.0,
        "memory_used_bytes": 34359738368,
        "memory_total_bytes": 42949672960,
        "temperature_celsius": 72.0,
        "power_watts": 250.0
      }
    ]
  },
  "health": {
    "status": "healthy",
    "uptime_seconds": 86400
  },
  "timestamp": 1702400000
}' localhost:9090 proto.PlatformService/Heartbeat
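
On the agent side, this exchange is conceptually a ticker loop that reports stats and applies whatever commands come back. A hypothetical sketch, assuming generated stubs in a pb package (the real loop lives in nebula-agent and will differ):

import (
    "context"
    "log/slog"
    "time"
)

// pb is assumed to be the Go package generated from platform.proto.
func heartbeatLoop(ctx context.Context, client pb.PlatformServiceClient, nodeID string, interval time.Duration) {
    ticker := time.NewTicker(interval)
    defer ticker.Stop()
    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            resp, err := client.Heartbeat(ctx, &pb.HeartbeatRequest{
                NodeId:    nodeID,
                Timestamp: time.Now().Unix(),
            })
            if err != nil {
                slog.Warn("heartbeat failed", "error", err)
                continue
            }
            // Pending commands piggyback on the heartbeat response.
            for _, cmd := range resp.Commands {
                slog.Info("command received", "id", cmd.Id, "type", cmd.Type, "target", cmd.TargetId)
            }
        }
    }
}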

Deregister RPC

Gracefully removes an agent from the platform.

Request:

message DeregisterRequest {
  string node_id = 1;
  string reason = 2;   // Optional reason for deregistration
}

Response:

message DeregisterResponse {
  bool success = 1;
  string message = 2;
}

Example with grpcurl:

grpcurl -plaintext -d '{
  "node_id": "550e8400-e29b-41d4-a716-446655440000",
  "reason": "graceful shutdown"
}' localhost:9090 proto.PlatformService/Deregister

HTTP API Reference (OpenAI-compatible)

The daemon provides an OpenAI-compatible HTTP API on the HTTP port (default 8080).

POST /v1/chat/completions

Generate chat completions using deployed models.

Request:

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <token>" \
  -d '{
    "model": "llama-3.1-8b",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello, how are you?"}
    ],
    "temperature": 0.7,
    "max_tokens": 2048,
    "stream": false
  }'

Request Body:

Field              Type    Required  Description
model              string  Yes       Model name (deployment name)
messages           array   Yes       Array of message objects
temperature        float   No        Sampling temperature (0-2)
top_p              float   No        Nucleus sampling parameter
max_tokens         int     No        Maximum tokens to generate
stream             bool    No        Enable streaming response
stop               array   No        Stop sequences
presence_penalty   float   No        Presence penalty (-2 to 2)
frequency_penalty  float   No        Frequency penalty (-2 to 2)

Response:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1702400000,
  "model": "llama-3.1-8b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing well, thank you for asking. How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 18,
    "total_tokens": 43
  }
}

Streaming Response:

When stream: true, the response uses Server-Sent Events (SSE):

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <token>" \
  -d '{"model": "llama-3.1-8b", "messages": [{"role": "user", "content": "Hello"}], "stream": true}'

Response chunks:

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1702400000,"model":"llama-3.1-8b","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1702400000,"model":"llama-3.1-8b","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1702400000,"model":"llama-3.1-8b","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1702400000,"model":"llama-3.1-8b","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
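
A minimal Go client that consumes this stream could look like the following sketch (endpoint and payload mirror the curl example above; production code would also check the response status and scanner.Err):

package main

import (
    "bufio"
    "fmt"
    "log"
    "net/http"
    "strings"
)

func main() {
    body := `{"model": "llama-3.1-8b", "messages": [{"role": "user", "content": "Hello"}], "stream": true}`
    resp, err := http.Post("http://localhost:8080/v1/chat/completions", "application/json", strings.NewReader(body))
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()

    scanner := bufio.NewScanner(resp.Body)
    for scanner.Scan() {
        line := scanner.Text()
        if !strings.HasPrefix(line, "data: ") {
            continue // skip the blank separator lines between events
        }
        payload := strings.TrimPrefix(line, "data: ")
        if payload == "[DONE]" {
            break // end-of-stream sentinel
        }
        fmt.Println(payload) // each payload is one chat.completion.chunk JSON object
    }
}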

POST /v1/completions

Generate text completions (legacy API).

Request:

curl -X POST http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <token>" \
  -d '{
    "model": "llama-3.1-8b",
    "prompt": "Once upon a time",
    "max_tokens": 100,
    "temperature": 0.7
  }'

Response:

{
  "id": "cmpl-abc123",
  "object": "text_completion",
  "created": 1702400000,
  "model": "llama-3.1-8b",
  "choices": [
    {
      "text": ", in a land far away, there lived a wise old wizard...",
      "index": 0,
      "finish_reason": "length"
    }
  ],
  "usage": {
    "prompt_tokens": 4,
    "completion_tokens": 100,
    "total_tokens": 104
  }
}

POST /v1/embeddings

Generate embeddings for text.

Request:

curl -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <token>" \
  -d '{
    "model": "text-embedding-model",
    "input": "The quick brown fox jumps over the lazy dog"
  }'

Response:

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0023, -0.0094, 0.0156, ...],
      "index": 0
    }
  ],
  "model": "text-embedding-model",
  "usage": {
    "prompt_tokens": 9,
    "total_tokens": 9
  }
}

GET /v1/models

List all available models (deployed and ready).

Request:

curl http://localhost:8080/v1/models \
  -H "Authorization: Bearer <token>"

Response:

{
  "object": "list",
  "data": [
    {
      "id": "llama-3.1-8b",
      "object": "model",
      "created": 1702400000,
      "owned_by": "nebula"
    },
    {
      "id": "mistral-7b",
      "object": "model",
      "created": 1702400000,
      "owned_by": "nebula"
    }
  ]
}

GET /health

Health check endpoint.

Request:

curl http://localhost:8080/health

Response:

{
  "status": "ok"
}

Services

Registry Service

Manages agent registration, heartbeats, and health monitoring.

Location: platform/service/registry/service.go

Key Operations:

Operation            Description
RegisterAgent        Creates/updates node, configures Caddy/Route53 routes
ProcessHeartbeat     Updates LastHeartbeat, updates status in DB
DeregisterAgent      Removes Caddy/Route53 routes, marks as offline
GetPendingCommands   Returns and clears pending commands for agent
QueueCommand         Adds command for delivery on next heartbeat
StartOfflineChecker  Background goroutine checking for offline agents

Agent State Structure:

type AgentState struct {
    NodeID        string
    LastHeartbeat time.Time
    Status        string // "online", "offline"
    AgentAddress  string
    Version       string
    Capabilities  *NodeCapabilities
}

Offline Detection:

  • Background checker runs every 30 seconds
  • Agents are marked offline if no heartbeat received for 90 seconds
  • Offline agents have their Caddy routes removed
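
A rough sketch of such a checker, assuming a mutex-guarded map keyed by node ID (field names follow the AgentState struct above; the actual code in platform/service/registry/service.go may differ):

import (
    "context"
    "log/slog"
    "sync"
    "time"
)

type Registry struct {
    mu     sync.Mutex
    agents map[string]*AgentState // AgentState as shown above
}

// StartOfflineChecker scans for stale heartbeats on a fixed interval.
func (r *Registry) StartOfflineChecker(ctx context.Context, timeout time.Duration) {
    ticker := time.NewTicker(30 * time.Second)
    defer ticker.Stop()
    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            r.mu.Lock()
            for id, a := range r.agents {
                if a.Status == "online" && time.Since(a.LastHeartbeat) > timeout {
                    a.Status = "offline"
                    slog.Warn("agent marked offline", "node_id", id)
                    // Caddy/Route53 route removal would be triggered here.
                }
            }
            r.mu.Unlock()
        }
    }
}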

Deployment Service

Manages the full lifecycle of model deployments.

Location: platform/service/deployment/service.go

Key Operations:

Method       Description
Deploy       End-to-end: select node, start runtime on agent, create deployment
List         Get all deployments from all online nodes
Get          Get specific deployment by ID
Start        Restart a stopped deployment
Stop         Stop a running deployment
Delete       Delete deployment and stop container
Restart      Restart deployment (stop + start)
GetLogs      Get container logs
GetEvents    Get deployment event history
ExecCommand  Execute command inside container

Deployment Flow:

1. Client sends Deploy request
2. Generate Deployment ID (UUID)
3. Auto-detect ModelSource (ollama:// or hf://; see the sketch after this list)
4. Select node via scheduler (if not specified)
5. Connect to agent via gRPC
6. Generate access token (if not public)
7. Send StartRuntimeRequest to agent
8. Return Deployment with endpoint info
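
The source detection in step 3 reduces to a prefix check; a minimal sketch (the source names and the fallback are assumptions):

import "strings"

func detectModelSource(model string) string {
    switch {
    case strings.HasPrefix(model, "ollama://"):
        return "ollama"
    case strings.HasPrefix(model, "hf://"):
        return "huggingface"
    default:
        return "ollama" // fallback choice is an assumption
    }
}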

Scheduler Logic:

  • GPU deployments: Select node with most available GPUs
  • CPU deployments: Select node with most available CPU cores
  • Validates GPU memory and type constraints
  • Enforces homogeneous GPU requirements for multi-GPU
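
As a sketch of the "most available GPUs" rule described above (the Node fields are illustrative stand-ins for the daemon's domain model, and the memory/type constraint checks are omitted):

type Node struct {
    ID            string
    AvailableGPUs int
    Online        bool
}

// selectGPUNode returns the online node with the most free GPUs, if any.
func selectGPUNode(nodes []Node) (Node, bool) {
    var best Node
    found := false
    for _, n := range nodes {
        if !n.Online || n.AvailableGPUs == 0 {
            continue
        }
        if !found || n.AvailableGPUs > best.AvailableGPUs {
            best, found = n, true
        }
    }
    return best, found
}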

Gateway Router

Routes incoming API requests to the appropriate model deployments.

Location: platform/gateway/router.go

Route Structure:

type Route struct {
    DeploymentID string // Deployment ID
    Endpoint     string // http://host:port
    Token        string // Access token
    Runtime      string // ollama, vllm, tgi
    IsPublic     bool   // If true, no auth token required
    Model        string // Original model name
}

Routing Process:

1. Client sends POST /v1/chat/completions with model field
2. Router resolves model -> Deployment -> Node
3. Select appropriate Runtime adapter (Ollama/vLLM/TGI)
4. Adapter forwards request to endpoint with token
5. Result returned in OpenAI format
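
A hypothetical shape for this resolution step, reusing the Route struct above (the real interface in platform/gateway/adapter/adapter.go may differ):

import (
    "context"
    "fmt"
)

// Adapter abstracts runtime-specific request handling.
type Adapter interface {
    ChatCompletion(ctx context.Context, route Route, body []byte) ([]byte, error)
}

type Router struct {
    routes      map[string]Route // keyed by model name
    ollama      Adapter          // custom format translation
    passthrough Adapter          // vLLM/TGI/Triton speak OpenAI natively
}

func (r *Router) resolve(model string) (Route, Adapter, error) {
    route, ok := r.routes[model]
    if !ok {
        return Route{}, nil, fmt.Errorf("unknown model %q", model)
    }
    if route.Runtime == "ollama" {
        return route, r.ollama, nil
    }
    return route, r.passthrough, nil
}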

Runtime Adapters:

Runtime  Adapter             Notes
Ollama   OllamaAdapter       Custom format translation
vLLM     PassthroughAdapter  OpenAI-compatible
TGI      PassthroughAdapter  OpenAI-compatible
Triton   PassthroughAdapter  OpenAI-compatible

Integrations

Caddy Integration

Provides automatic HTTPS and reverse proxy for agent gateways.

Location: platform/service/caddy/client.go

Features:

  • Dynamic route creation via Caddy Admin API
  • Automatic TLS certificate management
  • Reverse proxy from subdomain to agent

How it works:

  1. When an agent registers, nebulad creates a unique subdomain
  2. Caddy is configured to proxy https://{subdomain}.gateway.nebulactl.com to the agent's auth proxy (port 8888)
  3. All traffic is automatically encrypted with TLS

Example Route:

https://abc123.gateway.nebulactl.com -> http://agent-ip:8888

Requirements:

  • Caddy server running with Admin API enabled (port 2019)
  • Wildcard DNS record pointing to Caddy server
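
For illustration, such a route can be added by POSTing to the Admin API's routes array; the server name "srv0" and the exact route shape depend on the Caddy configuration and are assumptions here:

import (
    "bytes"
    "encoding/json"
    "net/http"
)

// addCaddyRoute appends a reverse-proxy route via the Caddy Admin API.
func addCaddyRoute(host, agentAddr string) error {
    route := map[string]any{
        "match":  []map[string]any{{"host": []string{host}}},
        "handle": []map[string]any{{"handler": "reverse_proxy", "upstreams": []map[string]string{{"dial": agentAddr}}}},
    }
    body, err := json.Marshal(route)
    if err != nil {
        return err
    }
    // POSTing to a JSON array path appends the element.
    resp, err := http.Post("http://localhost:2019/config/apps/http/servers/srv0/routes",
        "application/json", bytes.NewReader(body))
    if err != nil {
        return err
    }
    return resp.Body.Close()
}

For the example route above, this would be called as addCaddyRoute("abc123.gateway.example.com", "agent-ip:8888").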

Route53 Integration

Optional AWS Route 53 integration for DNS management.

Location: platform/service/route53/client.go

Features:

  • Automatic A record creation for agent subdomains
  • Configurable TTL
  • Regional support

Configuration:

./nebulad \
  --route53-hosted-zone-id Z1234567890ABC \
  --route53-access-key-id AKIA... \
  --route53-secret-access-key ... \
  --route53-region us-east-1 \
  --route53-ttl 300 \
  --gateway-ip 10.0.0.1

How it works:

  1. When an agent registers, nebulad creates an A record
  2. Record points subdomain to the gateway IP
  3. Combined with Caddy, enables full HTTPS access to agents
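
A sketch of that A-record upsert using aws-sdk-go-v2 (values mirror the CLI flags above; the daemon's actual client in platform/service/route53/client.go may differ):

import (
    "context"

    "github.com/aws/aws-sdk-go-v2/aws"
    "github.com/aws/aws-sdk-go-v2/service/route53"
    "github.com/aws/aws-sdk-go-v2/service/route53/types"
)

// upsertARecord creates or updates the A record for an agent subdomain.
func upsertARecord(ctx context.Context, client *route53.Client, zoneID, fqdn, ip string, ttl int64) error {
    _, err := client.ChangeResourceRecordSets(ctx, &route53.ChangeResourceRecordSetsInput{
        HostedZoneId: aws.String(zoneID),
        ChangeBatch: &types.ChangeBatch{
            Changes: []types.Change{{
                Action: types.ChangeActionUpsert, // create or update in one call
                ResourceRecordSet: &types.ResourceRecordSet{
                    Name:            aws.String(fqdn),
                    Type:            types.RRTypeA,
                    TTL:             aws.Int64(ttl),
                    ResourceRecords: []types.ResourceRecord{{Value: aws.String(ip)}},
                },
            }},
        },
    })
    return err
}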

Architecture

┌──────────────────────────────────────────────────────────────┐
│                           nebulad                            │
│                                                              │
│  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐  │
│  │  gRPC Server   │  │  HTTP Gateway  │  │  SQLite Store  │  │
│  │     :9090      │  │     :8080      │  │                │  │
│  │                │  │                │  │  /var/lib/     │  │
│  │  - Register    │  │  - /v1/chat    │  │  nebulad/      │  │
│  │  - Heartbeat   │  │  - /v1/models  │  │  nebula.db     │  │
│  │  - Deregister  │  │  - /health     │  │                │  │
│  └───────┬────────┘  └───────┬────────┘  └───────┬────────┘  │
│          │                   │                   │           │
│  ┌───────┴───────────────────┴───────────────────┴────────┐  │
│  │                        Services                        │  │
│  │                                                        │  │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │  │
│  │  │ Registry     │  │ Deployment   │  │ Gateway      │  │  │
│  │  │ Service      │  │ Service      │  │ Router       │  │  │
│  │  │              │  │              │  │              │  │  │
│  │  │ - Agent      │  │ - Deploy     │  │ - Model      │  │  │
│  │  │   tracking   │  │ - Start/Stop │  │   routing    │  │  │
│  │  │ - Offline    │  │ - Scheduling │  │ - Runtime    │  │  │
│  │  │   detection  │  │ - Logs/Events│  │   adapters   │  │  │
│  │  └──────────────┘  └──────────────┘  └──────────────┘  │  │
│  └────────────────────────────────────────────────────────┘  │
│                                                              │
│  ┌────────────────────────────────────────────────────────┐  │
│  │                 External Integrations                  │  │
│  │                                                        │  │
│  │  ┌──────────────────┐  ┌────────────────────────────┐  │  │
│  │  │   Caddy Client   │  │       Route53 Client       │  │  │
│  │  │                  │  │                            │  │  │
│  │  │ - SSL termination│  │ - DNS A record management  │  │  │
│  │  │ - Reverse proxy  │  │ - Subdomain creation       │  │  │
│  │  │ - Dynamic routes │  │ - (Optional)               │  │  │
│  │  └──────────────────┘  └────────────────────────────┘  │  │
│  └────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────┘

                               │ gRPC (Agent Communication)

┌──────────────────────────────────────────────────────────────┐
│                   GPU Nodes (nebula-agent)                   │
│                                                              │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐        │
│  │    Node 1    │  │    Node 2    │  │    Node 3    │  ...   │
│  │   4x A100    │  │   2x H100    │  │   8x A10G    │        │
│  └──────────────┘  └──────────────┘  └──────────────┘        │
└──────────────────────────────────────────────────────────────┘

Directory Structure

platform/
├── api/
│   └── grpc/
│       ├── platform.proto              # Platform service definition
│       ├── agent.proto                 # Agent service definition
│       ├── proto/                      # Generated protobuf files
│       └── server/
│           ├── platform_server.go      # gRPC server for agents
│           └── agent_server.go
├── gateway/
│   ├── server.go                       # HTTP Gateway server
│   ├── router.go                       # Model routing logic
│   ├── api/
│   │   └── types.go                    # OpenAI API types
│   └── adapter/
│       ├── adapter.go                  # Runtime adapter interface
│       ├── ollama.go                   # Ollama adapter
│       └── passthrough.go              # Passthrough adapter
├── service/
│   ├── registry/
│   │   └── service.go                  # Agent registration service
│   ├── deployment/
│   │   ├── service.go                  # Deployment management
│   │   └── scheduler.go                # Node scheduling
│   ├── caddy/
│   │   └── client.go                   # Caddy Admin API client
│   └── route53/
│       ├── client.go                   # AWS Route 53 client
│       └── short_uuid.go               # Short UUID generation
├── storage/
│   ├── store.go                        # Storage interface
│   └── sqlite/
│       └── store.go                    # SQLite implementation
├── domain/
│   ├── node.go                         # Node domain model
│   ├── deployment.go                   # Deployment domain model
│   └── errors.go                       # Domain errors
└── client/
    └── grpc_client.go                  # Client for agent communication

Deployment Scenarios

Development (Single Node)

# Terminal 1: Start nebulad
./nebulad --data-dir ./data --log-level debug

# Terminal 2: Start nebula-agent
./nebula-agent --control-plane localhost:9090

# Terminal 3: Deploy a model
./nebulactl deploy --model llama3.2:1b --runtime ollama

Production (with Caddy)

  1. Set up Caddy with a wildcard certificate:

{
  admin :2019
}

*.gateway.example.com {
  # Routes added dynamically by nebulad
}

  2. Configure DNS: Create a wildcard A record *.gateway.example.com pointing to the Caddy server

  3. Start nebulad:

./nebulad \
  --data-dir /var/lib/nebulad \
  --caddy-admin-url http://localhost:2019 \
  --gateway-domain gateway.example.com \
  --log-level info \
  --log-format json

Production (with Route53)

./nebulad \
  --data-dir /var/lib/nebulad \
  --caddy-admin-url http://localhost:2019 \
  --gateway-domain gateway.example.com \
  --route53-hosted-zone-id Z1234567890ABC \
  --route53-access-key-id AKIA... \
  --route53-secret-access-key ... \
  --route53-region us-east-1 \
  --gateway-ip 10.0.0.1 \
  --log-level info \
  --log-format json

Logging

The daemon uses Go's structured logging package (slog) throughout.

Log Levels

Level  Description
debug  Detailed debugging information
info   General operational information
warn   Warning messages
error  Error conditions

Example Output (text format)

level=INFO msg="Starting Nebula Daemon" version=0.1.0
level=INFO msg="SQLite store initialized" path=/var/lib/nebulad/nebula.db
level=INFO msg="Caddy client initialized" url=http://localhost:2019
level=INFO msg="Registry service started"
level=INFO msg="Deployment service started"
level=INFO msg="HTTP Gateway starting" port=8080 prefix=/v1
level=INFO msg="gRPC server starting" port=9090
level=INFO msg="Nebula Daemon is ready"
level=INFO msg="Agent registered" node_id=550e8400-e29b-41d4-a716-446655440000 name=gpu-node-1
level=INFO msg="Heartbeat received" node_id=550e8400-e29b-41d4-a716-446655440000

Example Output (JSON format)

{"time":"2024-12-12T10:00:00Z","level":"INFO","msg":"Starting Nebula Daemon","version":"0.1.0"}
{"time":"2024-12-12T10:00:00Z","level":"INFO","msg":"Agent registered","node_id":"550e8400-e29b-41d4-a716-446655440000","name":"gpu-node-1"}

Building

Build for Current Platform

go build -o nebulad ./cmd/nebulad

Cross-compile for Linux

# Linux AMD64
GOOS=linux GOARCH=amd64 go build -o nebulad-linux-amd64 ./cmd/nebulad

# Linux ARM64
GOOS=linux GOARCH=arm64 go build -o nebulad-linux-arm64 ./cmd/nebulad

Build with Version Information

VERSION=$(git describe --tags --always)
BUILD_TIME=$(date -u +"%Y-%m-%dT%H:%M:%SZ")

go build -ldflags "-X main.Version=$VERSION -X main.BuildTime=$BUILD_TIME" \
  -o nebulad ./cmd/nebulad
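
For the -X flags to take effect, main must declare matching package-level variables; a typical pattern (assumed here, not verified against cmd/nebulad/main.go):

package main

// Overridden at build time via:
//   -ldflags "-X main.Version=... -X main.BuildTime=..."
var (
    Version   = "dev"
    BuildTime = "unknown"
)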

System Requirements

  • Go 1.24+
  • SQLite (embedded, no external dependency)
  • Network access to GPU nodes
  • (Optional) Caddy server for HTTPS gateway
  • (Optional) AWS credentials for Route53 DNS management