Last Updated: 3/5/2026

Server Mode

While `pie run` works well for quick tests, production use cases need a persistent server that clients can connect to.

Start the Server

Launch Pie in server mode:


pie serve

Output:


╭─ Pie Engine (server) ────────────────────────╮
│ Host           127.0.0.1:8080                │
│ Model          meta-llama/Llama-3.2-1B-Instruct │
│ Device         cuda:0                        │
╰──────────────────────────────────────────────╯

✓ Backend started on cuda:0
✓ Engine listening on ws://127.0.0.1:8080

The server is now ready to accept client connections.
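When the server is started from a script, it can help to wait until the port is actually reachable before launching clients. The following is a minimal sketch using only the Python standard library; `wait_for_port` is a hypothetical helper, not part of Pie, and the default timeout and polling interval are assumptions. It only checks that a TCP listener is present and does not perform the WebSocket handshake.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0, interval: float = 0.2) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect only shows that something is listening;
            # it does not speak WebSocket or authenticate.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet (or the attempt timed out); retry.
            time.sleep(interval)
    return False
```

With the defaults shown in the startup output, the call would be `wait_for_port("127.0.0.1", 8080)`.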

Interactive Mode

For development and testing, use interactive mode:


pie serve -i

This gives you a shell to run inferlets directly:


Type 'help' for commands, ↑/↓ for history

pie> run text-completion --prompt "Hello world"
Hello world! How are you today?

pie> help
Available commands:
  run <inferlet> [args]  - Run an inferlet
  list                   - List running instances
  exit                   - Shutdown and exit

Monitor Mode

For real-time performance monitoring:


pie serve -m

This launches a TUI dashboard showing:

Active requests
Throughput (tokens/sec)
Memory usage
Batch statistics

Command-Line Options

Option	Description
`--config`, `-c`	Path to config file
`--host`	Override host address
`--port`	Override port
`--no-auth`	Disable authentication
`--verbose`, `-v`	Enable verbose logging
`--interactive`, `-i`	Interactive shell mode
`--monitor`, `-m`	TUI monitor mode

Examples:


# Custom port, no auth
pie serve --port 9000 --no-auth
 
# Verbose logging
pie serve -v
 
# Custom config file
pie serve -c /path/to/config.toml
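If the server is launched from Python (for example, from a test harness), the options above can be assembled into an argument list. This is a sketch, not part of Pie: `build_serve_command` is a hypothetical helper, it uses only the flags listed in the table, and it assumes `--config`, `--host`, and `--port` each take one value. It does not check whether particular flags may be combined.

```python
import shlex
from typing import List, Optional


def build_serve_command(
    config: Optional[str] = None,
    host: Optional[str] = None,
    port: Optional[int] = None,
    no_auth: bool = False,
    verbose: bool = False,
    interactive: bool = False,
    monitor: bool = False,
) -> List[str]:
    """Assemble an argv list for `pie serve` from the documented options."""
    argv = ["pie", "serve"]
    # Options that carry a value are added only when one is given.
    if config is not None:
        argv += ["--config", config]
    if host is not None:
        argv += ["--host", host]
    if port is not None:
        argv += ["--port", str(port)]
    # Boolean switches are appended only when enabled.
    for enabled, flag in (
        (no_auth, "--no-auth"),
        (verbose, "--verbose"),
        (interactive, "--interactive"),
        (monitor, "--monitor"),
    ):
        if enabled:
            argv.append(flag)
    return argv


# Mirrors the "Custom port, no auth" example.
print(shlex.join(build_serve_command(port=9000, no_auth=True)))
# pie serve --port 9000 --no-auth
```

The returned list can be passed directly to `subprocess.Popen`, which avoids shell quoting issues with paths.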

Connecting Clients

Once the server is running, connect with a client:


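import asyncio
from pie import PieClient

async def main():
    # Address matches the "Engine listening on" line printed at startup
    async with PieClient("ws://127.0.0.1:8080") as client:
        await client.authenticate("username")
        # ... use the client

asyncio.run(main())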

See Client Basics for more.

Graceful Shutdown

Press Ctrl+C to shut down the server (in interactive mode, the `exit` command also shuts it down):


^C
Shutting down...
✓ Shutdown complete

Pie will:

Stop accepting new connections
Wait for running inferlets to complete
Terminate backends
Clean up resources
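When the server runs as a child process of a script, the same shutdown can be requested by sending SIGINT, which is the signal Ctrl+C delivers on POSIX systems. The sketch below uses only the standard library; `stop_gracefully` is a hypothetical helper, and the 30-second default and the fallback to a forced kill are assumptions. Because Pie waits for running inferlets to complete, the timeout may need to be longer in practice.

```python
import signal
import subprocess


def stop_gracefully(proc: subprocess.Popen, timeout: float = 30.0) -> int:
    """Send SIGINT to a child process and wait for it to exit (POSIX)."""
    if proc.poll() is None:
        # Same signal the terminal sends on Ctrl+C.
        proc.send_signal(signal.SIGINT)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Assumption: forcing termination after the timeout is acceptable.
        # This skips any cleanup the process had not finished.
        proc.kill()
        return proc.wait()
```

A typical use would be `proc = subprocess.Popen(["pie", "serve"])` at startup and `stop_gracefully(proc)` at teardown.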

Next Steps

Learn to connect from code
Explore the CLI Reference