NixOS Multi-Host Deployment Guide¶
Overview¶
This repository uses a Just-based deployment system to manage its NixOS hosts, combining parallel operations, smart change detection, and binary caching to cut test and deploy times.
Quick Reference¶
Fastest Commands (Recommended)¶
just quick-test # Test all hosts in parallel (~1 min)
just quick-deploy HOST # Deploy only if changed (~30 sec)
just quick-all # Test all + deploy all (~3 min total)
All Available Deployment Commands¶
Smart Deployment (Recommended)¶
just quick-deploy p620 # Deploy P620 only if configuration changed
just quick-deploy razer # Deploy Razer only if configuration changed
just quick-deploy p510 # Deploy P510 only if configuration changed
Standard Optimized Deployment¶
just p620 # Deploy to P620 workstation (AMD/ROCm)
just razer # Deploy to Razer laptop (Intel/NVIDIA)
just p510 # Deploy to P510 media server (Intel Xeon/NVIDIA)
Advanced Deployment Options¶
just deploy-fast HOST # Fast deployment with minimal builds
just deploy-local-build HOST # Build locally, deploy remotely
just deploy-cached HOST # Deploy with binary cache optimization
just emergency-deploy HOST # Emergency deployment (skip tests)
Bulk Operations¶
just deploy-all # Deploy to all hosts sequentially (~8 min)
just deploy-all-parallel # Deploy to all hosts in parallel (~2 min)
just test-all-parallel # Test all hosts in parallel (~1 min)
just build-all-parallel # Build all configs in parallel (no deploy)
Host Configuration¶
Active Hosts¶
- P620: AMD workstation with ROCm acceleration
- Razer: Intel/NVIDIA laptop with Optimus graphics
- P510: Intel Xeon/NVIDIA media server with CUDA
- DEX5550: Intel SFF server with integrated graphics
Network Configuration¶
- Internal Network: 192.168.1.0/24
- DNS Server: DEX5550 (192.168.1.222)
- Binary Cache: P620 (http://p620:5000)
- Monitoring Server: DEX5550 (Grafana: port 3001, Prometheus: port 9090)
Performance Optimizations Applied¶
1. Nixos-rebuild Flags¶
- --fast: Skip unnecessary rebuild steps
- --keep-going: Continue on non-critical failures
- --accept-flake-config: Accept flake configuration automatically
- Removed --impure: No longer needed with pure flakes
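The flags above could combine into a single invocation along the lines of the sketch below. It only prints the command instead of running it. Two things are assumptions rather than details taken from the Justfile: that each flake output is named after its host, and that the host name is reachable over SSH. `rebuild_cmd` is a hypothetical helper name.

```shell
# Hypothetical helper: print the nixos-rebuild command for one host.
# Assumes the flake output name equals the host name (not confirmed by the Justfile).
rebuild_cmd() {
    host="$1"
    printf '%s\n' "nixos-rebuild switch --flake .#${host} --target-host ${host} --fast --keep-going --accept-flake-config"
}

rebuild_cmd p620
```

Printing the command first makes it easy to review the flag set before wiring it into a recipe.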
2. Parallel Operations¶
- Testing: All hosts tested simultaneously
- Building: All configurations built in parallel
- Deployment: All hosts deployed concurrently
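The parallel recipes themselves are not reproduced in this guide, but the general pattern can be sketched in plain shell: start one background job per host, then wait on each and collect the failures. `run_parallel` and the `RUNNER` variable are hypothetical names used for illustration; the default runner uses the `just test-host` recipe mentioned under Troubleshooting.

```shell
# Command to run for each host; overridable, defaults to the test-host recipe.
RUNNER="${RUNNER:-just test-host}"

# Run $RUNNER once per host concurrently and report which hosts failed.
run_parallel() {
    pids=""
    for host in "$@"; do
        $RUNNER "$host" &              # start each host in the background
        pids="$pids $!:$host"          # remember pid and host together
    done
    failed=""
    for entry in $pids; do
        pid="${entry%%:*}"
        host="${entry#*:}"
        wait "$pid" || failed="$failed $host"   # collect non-zero exits
    done
    [ -z "$failed" ] || { echo "failed:$failed" >&2; return 1; }
}
```

A call such as `run_parallel p620 razer p510` would then take roughly as long as the slowest host rather than the sum of all three.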
3. Smart Change Detection¶
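The `quick-deploy` recipe is described as deploying only when the configuration changed. One common way to implement such a check, offered here as an assumption since the recipe itself is not shown, is to compare the store path of the freshly built system closure with the path the target is currently running. `needs_deploy` is a hypothetical helper.

```shell
# Hypothetical helper: decide whether a deploy is needed by comparing store paths.
#   $1 = store path of the newly built system closure
#   $2 = store path the host currently runs (target of /run/current-system)
needs_deploy() {
    [ "$1" != "$2" ]
}

# Sketch of how the two paths could be obtained (not run here):
#   new=$(nix build --no-link --print-out-paths ".#nixosConfigurations.$host.config.system.build.toplevel")
#   cur=$(ssh "$host" readlink -f /run/current-system)
#   if needs_deploy "$new" "$cur"; then just "$host"; fi
```

Because identical configurations evaluate to the same store path, an unchanged host can be skipped without touching it beyond one SSH call.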
4. Binary Cache Integration¶
5. Network Optimizations¶
- Build host = target host (reduce network transfer)
- Local builds for slow networks
- SSH connection reuse
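As a rough illustration of the first and last points, the sketch below prints a command that builds on the target itself and passes SSH multiplexing options through `NIX_SSHOPTS`, which `nixos-rebuild` forwards to its SSH calls. The control-socket path, the 60-second persistence, and the helper names are assumptions, as is the flake output being named after the host.

```shell
# Assumed SSH multiplexing options: keep a master connection open for 60 seconds
# so repeated SSH calls to the same host can share it.
ssh_reuse_opts() {
    printf '%s' "-o ControlMaster=auto -o ControlPath=${HOME}/.ssh/cm-%r@%h:%p -o ControlPersist=60"
}

# Print a command that builds on the target host, so built outputs
# do not have to be copied between machines afterwards.
remote_build_cmd() {
    host="$1"
    printf '%s\n' "NIX_SSHOPTS='$(ssh_reuse_opts)' nixos-rebuild switch --flake .#${host} --target-host ${host} --build-host ${host}"
}

remote_build_cmd p510
```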
Performance Benchmarks¶
Testing Performance¶
| Operation | Traditional | Optimized | Improvement |
|---|---|---|---|
| Test single host | 45 seconds | 45 seconds | Same |
| Test all hosts | 4 minutes | 1 minute | 75% faster |
Deployment Performance¶
| Operation | Traditional | Optimized | Improvement |
|---|---|---|---|
| Deploy single host | 2 minutes | 45 seconds | 62% faster |
| Deploy all hosts | 8 minutes | 2 minutes | 75% faster |
| Smart deploy (no changes) | 2 minutes | 5 seconds | 95% faster |
Complete Workflows¶
| Workflow | Traditional | Optimized | Improvement |
|---|---|---|---|
| Test + Deploy single | 3 minutes | 30 seconds* | 83% faster |
| Test + Deploy all | 12 minutes | 3 minutes | 75% faster |
*When using smart deployment with no changes
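The Improvement column in these tables is consistent with (traditional - optimized) / traditional, truncated to a whole percent. The small helper below, a hypothetical name, reproduces a few of the rows with integer arithmetic, which is why 62.5% appears as 62% and 95.8% as 95%.

```shell
# Percent improvement, truncated to a whole number: (old - new) * 100 / old.
# Both arguments are durations in seconds.
improvement() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

improvement 120 45    # deploy single host: 2 min -> 45 s
improvement 480 120   # deploy all hosts: 8 min -> 2 min
improvement 120 5     # smart deploy, no changes: 2 min -> 5 s
```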
Deployment Strategies by Scenario¶
Development Iteration¶
Scenario: Making frequent configuration changes during development
Production Deployment¶
Scenario: Deploying tested changes to production
Emergency Fixes¶
Scenario: Critical security patches or urgent fixes
Network Issues¶
Scenario: Deploying over slow or unreliable networks
Initial Setup¶
Scenario: First-time deployment to new hosts
CI/CD Pipeline¶
Scenario: Automated testing and deployment
# Parallel testing for speed
just test-all-parallel
# Smart deployment to skip unchanged hosts
for host in p620 razer p510; do
just quick-deploy "$host"
done
Troubleshooting¶
Common Issues¶
Deployment Hangs or Times Out¶
# Check host connectivity
just ping-hosts
# Try fast deployment
just deploy-fast HOST
# Use local build for network issues
just deploy-local-build HOST
Build Failures¶
# Test configuration first
just test-host HOST
# Check syntax
just check-syntax
# See what would change
just diff HOST
Slow Performance¶
# Use parallel operations
just test-all-parallel # Instead of just test-all
just deploy-all-parallel # Instead of just deploy-all
# Use smart deployment
just quick-deploy HOST # Only if changed
# Check binary cache
just deploy-cached HOST # Use P620 cache
No Changes but Still Deploying¶
# Verify with smart deployment
just quick-deploy HOST # Should skip if no changes
# Check configuration diff
just diff HOST # Shows actual differences
Performance Debugging¶
Check Build Times¶
# Benchmark specific host
just bench-host HOST 3 # Run 3 builds and average
# Check parallel efficiency
just perf-parallel # Test parallel build performance
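How `bench-host` measures its average is not shown here. A minimal sketch of the idea, assuming whole-second wall-clock timing is precise enough, runs a command a given number of times and prints the mean duration. `bench` is a hypothetical helper, not the recipe's implementation.

```shell
# Run a command N times and print the mean wall-clock time in whole seconds.
#   $1 = number of runs, remaining arguments = command to time
bench() {
    runs="$1"; shift
    total=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        start=$(date +%s)
        "$@" >/dev/null 2>&1           # discard output, only timing matters
        end=$(date +%s)
        total=$(( total + end - start ))
        i=$(( i + 1 ))
    done
    echo $(( total / runs ))
}
```

For example, `bench 3 just test-host p620` would time three test builds of one host. Later runs may be faster than the first because build results are already in the Nix store.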
Monitor Resource Usage¶
Advanced Features¶
Binary Cache Server¶
P620 runs a nix-serve binary cache to speed up builds across all hosts:
- URL: http://p620:5000
- Usage: Automatic in deploy-cached commands
- Benefits: Shared build artifacts reduce rebuild times
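To check the cache by hand, or to use it for a one-off Nix command, something like the following may help. The URL is the one given above; the helper names are hypothetical, and `/nix-cache-info` is the standard endpoint a Nix binary cache serves.

```shell
# Cache URL from this guide; override CACHE_URL to point elsewhere.
CACHE_URL="${CACHE_URL:-http://p620:5000}"

# Succeeds if the cache answers on /nix-cache-info within 5 seconds.
cache_reachable() {
    curl --silent --fail --max-time 5 "${CACHE_URL}/nix-cache-info" >/dev/null
}

# Print the Nix options that add the cache as an extra substituter for one command.
cache_opts() {
    printf '%s\n' "--option extra-substituters ${CACHE_URL}"
}
```

A build could then be run as `nix build $(cache_opts) ...` after `cache_reachable` succeeds. Unless signature checking is disabled, the client must also trust the cache's signing key.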
Tailscale VPN Integration¶
All hosts are connected via Tailscale mesh VPN:
- Exit Node: DEX5550 provides internet access
- Subnet Routing: Access to 192.168.1.0/24 network
- Security: All SSH access secured via Tailscale
Best Practices¶
1. Use Smart Deployment¶
Always prefer just quick-deploy HOST over just HOST for development.
2. Test Before Deploy¶
Run just quick-test before bulk deployments.
3. Leverage Parallelism¶
Use parallel commands whenever possible:
- just test-all-parallel
- just deploy-all-parallel
- just build-all-parallel
4. Monitor Performance¶
Check deployment metrics in Grafana to identify bottlenecks.
5. Use Appropriate Strategy¶
Match deployment strategy to scenario:
- Development: just quick-deploy
- Production: just quick-all
- Emergency: just emergency-deploy
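For scripts that need to pick a recipe programmatically, the mapping above can be written as a small lookup. `strategy_for` is a hypothetical helper; the recipe names are the ones listed in this guide.

```shell
# Map a scenario name to the recipe this guide recommends for it.
strategy_for() {
    case "$1" in
        development) echo "quick-deploy" ;;
        production)  echo "quick-all" ;;
        emergency)   echo "emergency-deploy" ;;
        *)           echo "unknown scenario: $1" >&2; return 1 ;;
    esac
}
```

For example, `just "$(strategy_for development)" p620` expands to `just quick-deploy p620`. Note that `quick-all` takes no HOST argument.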
Integration with Other Tools¶
Git Hooks¶
Consider adding pre-commit hooks:
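The original hook is not included here. A minimal sketch, assuming the `check-syntax` recipe from the Troubleshooting section is the check to run, could skip the work entirely when no `.nix` files are staged. The function names are hypothetical.

```shell
# Sketch of a pre-commit hook body. To use it, save it as .git/hooks/pre-commit
# with a "#!/bin/sh" first line, make it executable, and call pre_commit at the end.

# Reads file names on stdin; succeeds if at least one ends in .nix.
has_nix_changes() {
    grep -q '\.nix$'
}

# Run the syntax check only when staged changes include .nix files.
pre_commit() {
    if git diff --cached --name-only --diff-filter=ACM | has_nix_changes; then
        just check-syntax
    fi
}
```

A non-zero exit from `just check-syntax` becomes the hook's exit status, which makes Git abort the commit.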
CI/CD Integration¶
Example GitHub Actions workflow:
- name: Test NixOS configurations
  run: just test-all-parallel
- name: Deploy if tests pass
  run: just deploy-all-parallel
Monitoring Alerts¶
Grafana alerts are configured for:
- Deployment failures
- Performance degradation
- Host connectivity issues
Future Optimizations¶
Planned Improvements¶
- Remote Build Caching: Distributed builds across multiple hosts
- Incremental Deployments: Deploy only changed services
- Health Checks: Automated post-deployment validation
- Rollback Automation: Automatic rollback on failure detection
Performance Targets¶
- Single host deploy: < 30 seconds
- All hosts deploy: < 90 seconds
- Smart deploy (no changes): < 5 seconds