Quick Start Guide

Get Nebula up and running in minutes.

Installation

# Build from source
go build -o nebulactl ./cmd/nebulactl
go build -o nebula-agent ./cmd/nebula-agent

# Add to PATH
export PATH=$PATH:$(pwd)
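To confirm the binaries are reachable on your PATH, run the CLI's help command (covered again under Getting Help below):

# Verify the CLI is on your PATH; this prints the top-level usage
nebulactl --help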

Deploy Your First Model

Option 1: Using CLI Arguments

# Deploy GPT-2 using vLLM
nebulactl deploy openai-community/gpt2 --runtime vllm --device cpu

# Deploy Mistral 7B on GPU
nebulactl deploy mistralai/Mistral-7B-Instruct-v0.2 --runtime vllm --gpus 1
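Once a vLLM-backed deployment is running, vLLM itself serves an OpenAI-compatible HTTP API. As a sketch, assuming Nebula exposes that endpoint on localhost:8000 (the host and port here are hypothetical and depend on your node setup), a completion request might look like:

# Hypothetical endpoint: adjust host/port to wherever your deployment is exposed
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "openai-community/gpt2", "prompt": "Hello,", "max_tokens": 16}'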

Option 2: Using a YAML Spec

Create a deployment spec file:

# deployment.yaml
name: my-mistral
model: mistralai/Mistral-7B-Instruct-v0.2
runtime: vllm
device: gpu

resources:
  gpus: 1

Deploy it:

nebulactl deploy -f deployment.yaml
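Either way, you can confirm the deployment was created using the subcommands described under Managing Deployments below:

# Check the status of the new deployment
nebulactl deployment get my-mistral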

YAML Spec Examples

Minimal Deployment

name: gpt2
model: openai-community/gpt2
runtime: vllm

Production Deployment with Configuration

name: mistral-7b-prod
model: mistralai/Mistral-7B-Instruct-v0.2
runtime: vllm
device: gpu

resources:
  gpus: 2
  gpu_type: A100-80GB

node_selector:
  provider: aws
  region: us-east-1

config:
  max_model_len: "8192"
  tensor_parallel_size: "2"
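The config keys are handed to the runtime; max_model_len and tensor_parallel_size are standard vLLM engine arguments. For reference, assuming a plain pass-through, the equivalent standalone vLLM server invocation would be roughly:

# Approximately what the runtime receives (vLLM's OpenAI-compatible server)
python -m vllm.entrypoints.openai.api_server \
  --model mistralai/Mistral-7B-Instruct-v0.2 \
  --max-model-len 8192 \
  --tensor-parallel-size 2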

Ollama on CPU

name: phi-local
model: phi
runtime: ollama
device: cpu

Managing Nodes

Add a Local Node

# Start agent on local machine
nebula-agent --port 9091

# Register node
nebulactl node add \
  --name local-gpu \
  --host localhost \
  --port 9091
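Verify the node registered:

# The new node should appear in the list
nebulactl node list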

Add a Remote SSH Node

nebulactl node add \
  --name remote-gpu \
  --host 10.0.0.5 \
  --connection-type ssh \
  --ssh-user ubuntu \
  --ssh-key ~/.ssh/id_rsa
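Before registering an SSH node, it's worth confirming the key works outside of Nebula; this is plain SSH, nothing Nebula-specific:

# SSH keys must not be world-readable, and the host must be reachable
chmod 600 ~/.ssh/id_rsa
ssh -i ~/.ssh/id_rsa ubuntu@10.0.0.5 echo ok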

Managing Deployments

# List all deployments
nebulactl deployment list

# Get deployment details
nebulactl deployment get my-mistral

# View logs
nebulactl deployment logs my-mistral --tail 50

# Restart deployment
nebulactl deployment restart my-mistral

# Delete deployment
nebulactl deployment delete my-mistral
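For a live view while a model downloads and starts, standard shell tooling works fine around these commands; for example:

# Poll deployment status every 5 seconds (Ctrl-C to stop)
watch -n 5 nebulactl deployment list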

Supported Runtimes

Runtime   Command            Use Case
vLLM      --runtime vllm     High-performance inference with PagedAttention
TGI       --runtime tgi      Hugging Face Text Generation Inference
Ollama    --runtime ollama   Simple local model serving

Model Sources

Nebula supports multiple model sources:

# HuggingFace Hub (default)
model: mistralai/Mistral-7B-Instruct-v0.2

# S3
model: s3://my-bucket/models/mistral-7b

# Local filesystem
model: fs:///models/custom-model

# HTTP/HTTPS
model: https://example.com/model.bin
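These URIs go in the model field of a spec; assuming the CLI's positional model argument accepts the same URI syntax (not confirmed above, but consistent with it), deploying straight from S3 would look like:

# Deploy directly from an S3 source (same URI scheme as in a YAML spec)
nebulactl deploy s3://my-bucket/models/mistral-7b --runtime vllm --gpus 1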

Tips & Best Practices

  1. Start with YAML specs - More maintainable and version-controllable
  2. Use meaningful names - mistral-7b-prod instead of deployment-1
  3. Pin GPU types for production - Ensures consistent performance
  4. Test on CPU first - Faster iteration for development
  5. Version control your specs - Commit YAML files to git
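Expanding on the last tip, a minimal workflow is to keep specs next to your code and commit them like any other artifact:

git add deployment.yaml
git commit -m "Add my-mistral deployment spec"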

Troubleshooting

"No nodes available"

# Check if nodes are registered
nebulactl node list

# Add a node
nebulactl node add --name local --host localhost --port 9091

"No nodes with GPUs"

# Deploy on CPU instead
nebulactl deploy <model> --device cpu

"Runtime validation failed"

Make sure you're using a supported runtime: vllm, tgi, or ollama.
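For example, retry the deploy with one of the supported values:

# Re-run with a valid --runtime value
nebulactl deploy openai-community/gpt2 --runtime vllm --device cpu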

Getting Help

# CLI help
nebulactl --help
nebulactl deploy --help
nebulactl node --help