# Quick Start Guide
Get Nebula up and running in minutes.
## Installation

```bash
# Build from source
go build -o nebulactl ./cmd/nebulactl
go build -o nebula-agent ./cmd/nebula-agent

# Add to PATH
export PATH=$PATH:$(pwd)
```
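If the build succeeded, a quick sanity check is to print the CLI help and confirm the binary is reachable on your PATH:

```bash
# Should print the top-level command list if nebulactl is on your PATH
nebulactl --help
```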
## Deploy Your First Model

### Option 1: Using CLI Arguments

```bash
# Deploy GPT-2 using vLLM
nebulactl deploy openai-community/gpt2 --runtime vllm --device cpu

# Deploy Mistral 7B on GPU
nebulactl deploy mistralai/Mistral-7B-Instruct --runtime vllm --gpus 1
```
### Option 2: Using a YAML Spec (Recommended)
Create a deployment spec file:
```yaml
# deployment.yaml
name: my-mistral
model: mistralai/Mistral-7B-Instruct-v0.2
runtime: vllm
device: gpu
resources:
  gpus: 1
```
Deploy it:
```bash
nebulactl deploy -f deployment.yaml
```
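Once the spec is submitted, you can confirm the deployment was created; the subcommands used here are covered in more detail under Managing Deployments below:

```bash
# List deployments and inspect the one we just created
nebulactl deployment list
nebulactl deployment get my-mistral
```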
## YAML Spec Examples

### Minimal Deployment

```yaml
name: gpt2
model: openai-community/gpt2
runtime: vllm
```
### Production Deployment with Configuration

```yaml
name: mistral-7b-prod
model: mistralai/Mistral-7B-Instruct-v0.2
runtime: vllm
device: gpu
resources:
  gpus: 2
  gpu_type: A100-80GB
node_selector:
  provider: aws
  region: us-east-1
config:
  max_model_len: "8192"
  tensor_parallel_size: "2"
```
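The `config` block carries runtime-specific settings. As an assumption about what these particular keys map to, vLLM itself exposes options with the same names as engine flags when serving a model directly (shown here outside Nebula, for reference only):

```bash
# For reference only: vLLM's OpenAI-compatible server accepts flags with these names;
# the config keys above presumably map onto them (assumption, not Nebula documentation)
python -m vllm.entrypoints.openai.api_server \
  --model mistralai/Mistral-7B-Instruct-v0.2 \
  --max-model-len 8192 \
  --tensor-parallel-size 2
```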
### Ollama on CPU

```yaml
name: phi-local
model: phi
runtime: ollama
device: cpu
```
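Any of these specs deploys the same way as before. Assuming the Ollama example above is saved as `phi-local.yaml` (the file name is illustrative):

```bash
# File name is illustrative; use whatever you saved the spec as
nebulactl deploy -f phi-local.yaml
```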
## Managing Nodes

### Add a Local Node

```bash
# Start agent on local machine
nebula-agent --port 9091

# Register node
nebulactl node add \
  --name local-gpu \
  --host localhost \
  --port 9091
```
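To verify the registration succeeded, list the known nodes; the `local-gpu` entry should appear:

```bash
# The newly added node should show up here
nebulactl node list
```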
### Add a Remote SSH Node

```bash
nebulactl node add \
  --name remote-gpu \
  --host 10.0.0.5 \
  --connection-type ssh \
  --ssh-user ubuntu \
  --ssh-key ~/.ssh/id_rsa
```
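Before registering a remote node, it can help to confirm that key-based SSH access to the host works from the machine running `nebulactl`. This is a plain SSH check, not a Nebula command:

```bash
# Sanity check: key-based login to the remote host should succeed non-interactively
ssh -i ~/.ssh/id_rsa ubuntu@10.0.0.5 'echo ssh-ok'
```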
## Managing Deployments

```bash
# List all deployments
nebulactl deployment list

# Get deployment details
nebulactl deployment get my-mistral

# View logs
nebulactl deployment logs my-mistral --tail 50

# Restart deployment
nebulactl deployment restart my-mistral

# Delete deployment
nebulactl deployment delete my-mistral
```
## Supported Runtimes

| Runtime | Command | Use Case |
|---|---|---|
| vLLM | `--runtime vllm` | High-performance inference with PagedAttention |
| TGI | `--runtime tgi` | Hugging Face Text Generation Inference |
| Ollama | `--runtime ollama` | Simple local model serving |
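Switching runtimes is just a matter of the `--runtime` flag (or the `runtime:` field in a spec). As a sketch, assuming the same flags shown earlier apply to every runtime:

```bash
# TGI on a GPU node (assumes the flags behave as in the vLLM examples above)
nebulactl deploy openai-community/gpt2 --runtime tgi --gpus 1

# Ollama on CPU
nebulactl deploy phi --runtime ollama --device cpu
```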
## Model Sources
Nebula supports multiple model sources:
```yaml
# HuggingFace Hub (default)
model: mistralai/Mistral-7B-Instruct

# S3
model: s3://my-bucket/models/mistral-7b

# Local filesystem
model: fs:///models/custom-model

# HTTP/HTTPS
model: https://example.com/model.bin
```
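Any of these URIs goes in the `model:` field of a spec. As a sketch, reusing the S3 path above (bucket and deployment name are illustrative), an S3-backed deployment looks the same as any other:

```bash
# Write a spec whose model is pulled from S3, then deploy it
cat > s3-deployment.yaml <<'EOF'
name: mistral-s3
model: s3://my-bucket/models/mistral-7b
runtime: vllm
device: gpu
resources:
  gpus: 1
EOF

nebulactl deploy -f s3-deployment.yaml
```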
## Tips & Best Practices

- Start with YAML specs - More maintainable and version-controllable
- Use meaningful names - `mistral-7b-prod` instead of `deployment-1`
- Pin GPU types for production - Ensures consistent performance
- Test on CPU first - Faster iteration for development
- Version control your specs - Commit YAML files to git
## Troubleshooting

### "No nodes available"

```bash
# Check if nodes are registered
nebulactl node list

# Add a node
nebulactl node add --name local --host localhost --port 9091
```
"No nodes with GPUs"
# Deploy on CPU instead
nebulactl deploy <model> --device cpu
"Runtime validation failed"
Make sure you're using a supported runtime: vllm, tgi, or ollama
## Getting Help

```bash
# CLI help
nebulactl --help
nebulactl deploy --help
nebulactl node --help
```