# Quick Start Guide
Get Nebula up and running in minutes.
## Installation

```bash
# Build from source
go build -o nebulactl ./cmd/nebulactl
go build -o nebula-agent ./cmd/nebula-agent

# Add to PATH
export PATH=$PATH:$(pwd)
```
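If the build succeeded, a quick sanity check is to print the CLI help and confirm the binary is reachable on your PATH:

```bash
# Should print the top-level command list if nebulactl is on your PATH
nebulactl --help
```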
## Deploy Your First Model

### Option 1: Using CLI Arguments

```bash
# Deploy GPT-2 using vLLM
nebulactl deploy openai-community/gpt2 --runtime vllm --device cpu

# Deploy Mistral 7B on GPU
nebulactl deploy mistralai/Mistral-7B-Instruct --runtime vllm --gpus 1
```
### Option 2: Using a YAML Spec (Recommended)
Create a deployment spec file:
```yaml
# deployment.yaml
name: my-mistral
model: mistralai/Mistral-7B-Instruct-v0.2
runtime: vllm
device: gpu
resources:
  gpus: 1
```
Deploy it:
```bash
nebulactl deploy -f deployment.yaml
```
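Once the spec is submitted, you can confirm the deployment was created; the subcommands used here are covered in more detail under Managing Deployments below:

```bash
# List deployments and inspect the one we just created
nebulactl deployment list
nebulactl deployment get my-mistral
```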
## YAML Spec Examples

### Minimal Deployment

```yaml
name: gpt2
model: openai-community/gpt2
runtime: vllm
```
### Production Deployment with Configuration

```yaml
name: mistral-7b-prod
model: mistralai/Mistral-7B-Instruct-v0.2
runtime: vllm
device: gpu
resources:
  gpus: 2
  gpu_type: A100-80GB
node_selector:
  provider: aws
  region: us-east-1
config:
  max_model_len: "8192"
  tensor_parallel_size: "2"
```
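The `config` block carries runtime-specific settings. As an assumption about what these particular keys map to, vLLM itself exposes options with the same names as engine flags when serving a model directly (shown here outside Nebula, for reference only):

```bash
# For reference only: vLLM's OpenAI-compatible server accepts flags with these names;
# the config keys above presumably map onto them (assumption, not Nebula documentation)
python -m vllm.entrypoints.openai.api_server \
  --model mistralai/Mistral-7B-Instruct-v0.2 \
  --max-model-len 8192 \
  --tensor-parallel-size 2
```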
### Ollama on CPU

```yaml
name: phi-local
model: phi
runtime: ollama
device: cpu
```
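Any of these specs deploys the same way as before. Assuming the Ollama example above is saved as `phi-local.yaml` (the file name is illustrative):

```bash
# File name is illustrative; use whatever you saved the spec as
nebulactl deploy -f phi-local.yaml
```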
## Managing Nodes

### Add a Local Node

```bash
# Start agent on local machine
nebula-agent --port 9091

# Register node
nebulactl node add \
  --name local-gpu \
  --host localhost \
  --port 9091
```
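To verify the registration succeeded, list the known nodes; the `local-gpu` entry should appear:

```bash
# The newly added node should show up here
nebulactl node list
```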
### Add a Remote SSH Node

```bash
nebulactl node add \
  --name remote-gpu \
  --host 10.0.0.5 \
  --connection-type ssh \
  --ssh-user ubuntu \
  --ssh-key ~/.ssh/id_rsa
```
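Before registering a remote node, it can help to confirm that key-based SSH access to the host works from the machine running `nebulactl`. This is a plain SSH check, not a Nebula command:

```bash
# Sanity check: key-based login to the remote host should succeed non-interactively
ssh -i ~/.ssh/id_rsa ubuntu@10.0.0.5 'echo ssh-ok'
```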
## Managing Deployments

```bash
# List all deployments
nebulactl deployment list

# Get deployment details
nebulactl deployment get my-mistral

# View logs
nebulactl deployment logs my-mistral --tail 50

# Restart deployment
nebulactl deployment restart my-mistral

# Delete deployment
nebulactl deployment delete my-mistral
```
## Supported Runtimes

| Runtime | Command | Use Case |
|---|---|---|
| vLLM | `--runtime vllm` | High-performance inference with PagedAttention |
| TGI | `--runtime tgi` | Hugging Face Text Generation Inference |
| Ollama | `--runtime ollama` | Simple local model serving |
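Switching runtimes is just a matter of the `--runtime` flag (or the `runtime:` field in a spec). As a sketch, assuming the same flags shown earlier apply to every runtime:

```bash
# TGI on a GPU node (assumes the flags behave as in the vLLM examples above)
nebulactl deploy openai-community/gpt2 --runtime tgi --gpus 1

# Ollama on CPU
nebulactl deploy phi --runtime ollama --device cpu
```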
## Model Sources
Nebula supports multiple model sources:
```yaml
# HuggingFace Hub (default)
model: mistralai/Mistral-7B-Instruct

# S3
model: s3://my-bucket/models/mistral-7b

# Local filesystem
model: fs:///models/custom-model

# HTTP/HTTPS
model: https://example.com/model.bin
```
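Any of these URIs goes in the `model:` field of a spec. As a sketch, reusing the S3 path above (bucket and deployment name are illustrative), an S3-backed deployment looks the same as any other:

```bash
# Write a spec whose model is pulled from S3, then deploy it
cat > s3-deployment.yaml <<'EOF'
name: mistral-s3
model: s3://my-bucket/models/mistral-7b
runtime: vllm
device: gpu
resources:
  gpus: 1
EOF

nebulactl deploy -f s3-deployment.yaml
```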
## Tips & Best Practices

- Start with YAML specs - More maintainable and version-controllable
- Use meaningful names - `mistral-7b-prod` instead of `deployment-1`
- Pin GPU types for production - Ensures consistent performance
- Test on CPU first - Faster iteration for development
- Version control your specs - Commit YAML files to git
## Troubleshooting

### "No nodes available"

```bash
# Check if nodes are registered
nebulactl node list

# Add a node
nebulactl node add --name local --host localhost --port 9091
```
"No nodes with GPUs"
# Deploy on CPU instead
nebulactl deploy <model> --device cpu
"Runtime validation failed"
Make sure you're using a supported runtime: vllm, tgi, or ollama
## Getting Help

```bash
# CLI help
nebulactl --help
nebulactl deploy --help
nebulactl node --help
```