Aura Workshop v1.30.1 Documentation

Welcome to the complete reference for Aura Workshop, a model-agnostic AI agent orchestration platform built for individuals and teams who want full control over their AI workflows. This documentation covers every feature, from first install through advanced multi-agent teams, 31-platform automation, distributed GPU inference clusters, visual DAG workflows, remote agent deployment, the REST API (approximately 140 endpoints), 50+ agent tools, 100+ skills, and the command-line interface. It applies whether you are running a quick question-and-answer session or orchestrating a fleet of agents across multiple machines.

Installation

System Requirements

Before installing, confirm your system meets the minimum requirements for your platform.

Platform	Minimum Requirements
macOS	macOS 12 (Monterey) or later. Xcode Command Line Tools are recommended for full functionality. Apple Silicon (M-series) provides Metal GPU acceleration for local inference.
Windows	Windows 10 64-bit or later. An NVIDIA GPU with current drivers is recommended for local inference with CUDA support.
Linux	A modern 64-bit distribution with webkit2gtk 4.1 and libayatana-appindicator (or their equivalents). NVIDIA drivers and CUDA toolkit are recommended for GPU inference.

Install from Package

macOS

Download the .dmg installer from the Downloads page. Open the DMG file and drag the Aura Workshop icon into your Applications folder. Because the app is distributed outside the Mac App Store, macOS may quarantine it on first launch. If you see a security warning, open Terminal and run the following command to clear the quarantine flag:

xattr -cr /Applications/Aura\ Workshop.app

After running this command, launch the app normally from Applications or Spotlight.

Windows

Download the NSIS installer .exe from the Downloads page. Run the installer and follow the standard installation wizard prompts. Choose your installation directory (the default is C:\Program Files\Aura Workshop), click Install, and wait for the process to complete. Launch Aura Workshop from the Start Menu or desktop shortcut.

Linux

Two package formats are available:

.deb package (Debian, Ubuntu, and derivatives): Install using dpkg:
```
sudo dpkg -i aura-workshop_1.30.1_amd64.deb
```
If dependency errors occur, follow up with sudo apt-get install -f to resolve them.
AppImage (any distribution): Mark the file as executable and run it directly:
```
chmod +x Aura_Workshop-1.30.1.AppImage
./Aura_Workshop-1.30.1.AppImage
```
No installation is required. The AppImage is self-contained and portable.

Docker

For headless server deployments, pull the pre-built Docker image from Docker Hub:

docker pull coolkoo/aura-workshop:daemon-latest

See the Docker Daemon section for full setup instructions including environment variables, volume mounts, and GPU passthrough.

Build from Source

Building from source requires Node.js 18 or later and a Rust toolchain installed via rustup. You also need the platform-specific Tauri dependencies documented in the Tauri v2 prerequisites guide.

# Clone the repository
git clone https://github.com/coolkoo/aura-workshop.git
cd aura-workshop

# Install JavaScript dependencies
npm install

# Build the desktop application
npm run tauri build

You can target a specific package format by passing the --bundles flag:

npm run tauri build -- --bundles dmg        # macOS DMG disk image
npm run tauri build -- --bundles nsis       # Windows NSIS installer
npm run tauri build -- --bundles deb        # Linux .deb package
npm run tauri build -- --bundles appimage   # Linux AppImage

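The generated installers and packages appear under src-tauri/target/release/bundle/, in one subdirectory per bundle format. The bare executable is written to src-tauri/target/release/.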

First Launch & Setup Wizard

The very first time you open Aura Workshop, a five-step setup wizard guides you through initial configuration. Each step can be skipped and revisited later from the Settings page.

Step 1: Welcome

An introductory screen that explains the core capabilities of the platform. Click Next to proceed.

Step 2: License Key Entry (Optional)

If you have purchased a license, enter the key here to unlock Professional, Business, or Enterprise features. If you do not have a key, click Skip to continue in Community mode. Community mode provides full access to single-agent tasks, basic automation, local inference, and core tools. You can enter or upgrade your license at any time from Settings.

Step 3: Provider Configuration

Select your preferred AI provider from a list that includes Anthropic, OpenAI, Google, and many more. Enter your API key for the chosen provider. The wizard includes a Test Connection button that verifies the key is valid and the provider is reachable before saving. You can configure additional providers later from the Models page.

Step 4: Local Model Download (Optional)

If you want to run models locally without any cloud dependency, this step offers a curated list of GGUF models you can download directly. The models range from compact 5 GB options to large 42 GB models for maximum quality. Downloads show real-time progress, speed, and estimated time remaining. Skip this step if you plan to use only cloud providers.

Step 5: Ready

Setup is complete. You are taken directly to the main Dashboard where you can begin your first task.

Behind the Scenes on First Launch

In addition to the wizard, the application performs several initialization steps automatically:

Creates a local SQLite database in WAL (Write-Ahead Logging) mode with 35+ tables covering settings, conversations, tasks, teams, workflows, billing, credentials, and more.
Detects whether Docker is installed and available. If Docker is not found, the application enables native mode automatically so that agent commands run directly on your operating system.
Installs all bundled skills across document, Anthropic, superpowers, desktop apps, design, platform, and media categories.
Initializes the credential encryption system with an AES-256-GCM encryption key stored securely in the operating system keychain (macOS Keychain, Windows Credential Manager, or Linux secret service).
Starts the embedded Web UI server on port 18800.
Auto-starts any listeners that were previously enabled in a prior session.
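
To illustrate the first initialization step, here is a minimal sketch of opening a SQLite database in WAL mode with Python's standard sqlite3 module. It is not the application's own code: the helper name open_wal_database and the temporary demo file are assumptions made for the example, and no real Aura Workshop tables are created.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_wal_database(path: Path) -> sqlite3.Connection:
    """Open (or create) a SQLite database and switch it to WAL journaling."""
    conn = sqlite3.connect(path)
    # The PRAGMA returns the journal mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"could not enable WAL, got {mode!r}")
    return conn


# Demo on a throwaway file; WAL needs a file-backed database, not :memory:.
demo_path = Path(tempfile.mkdtemp()) / "demo.db"
conn = open_wal_database(demo_path)
journal_mode = conn.execute("PRAGMA journal_mode;").fetchone()[0]
conn.close()
```

WAL mode lets readers continue while a writer is active, which suits an application that streams task events while the UI queries history.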

Database Locations

Platform	Path
macOS	`~/Library/Application Support/aura-workshop/aura-workshop.db`
Windows	`%APPDATA%\aura-workshop\aura-workshop.db`
Linux	`~/.local/share/aura-workshop/aura-workshop.db`
Docker	`/data/aura-workshop.db` (via volume mount)
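
If you script backups or inspections, the table above can be expressed as a small lookup. The following is a sketch only: the function name default_db_path and its platform labels are hypothetical, and the home and APPDATA values are passed in explicitly so the example does not depend on the machine it runs on.

```python
from pathlib import PurePosixPath, PureWindowsPath

DB_FILE = "aura-workshop.db"
APP_DIR = "aura-workshop"


def default_db_path(platform: str, home: str = "", appdata: str = "") -> str:
    """Return the documented database location for a platform label."""
    if platform == "macos":
        return str(PurePosixPath(home) / "Library" / "Application Support" / APP_DIR / DB_FILE)
    if platform == "windows":
        # appdata stands in for the expanded %APPDATA% directory.
        return str(PureWindowsPath(appdata) / APP_DIR / DB_FILE)
    if platform == "linux":
        return str(PurePosixPath(home) / ".local" / "share" / APP_DIR / DB_FILE)
    if platform == "docker":
        # Inside the container the database lives on the mounted /data volume.
        return str(PurePosixPath("/data") / DB_FILE)
    raise ValueError(f"unknown platform: {platform!r}")


linux_path = default_db_path("linux", home="/home/ana")
windows_path = default_db_path("windows", appdata=r"C:\Users\ana\AppData\Roaming")
```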

Application Layout

The Aura Workshop interface centers on a left sidebar with an icon rail for primary navigation, a scrollable middle section for project and task organization, and a main content area that fills the rest of the screen.

Left Sidebar Icon Rail

The icon rail runs vertically along the left edge and is always visible. The top section contains fixed navigation icons:

Icon	Label	View
Grid	Dashboard	Home screen with task history, quick actions, and notifications
Robot	Agents	Primary task workspace for all AI interactions
Headset	Listeners	Messaging platform listeners and automation triggers
Link	Webhooks	HTTP webhook endpoint management
Clock	Schedulers	Scheduled task definitions and controls

Scrollable Middle Section

Below the primary navigation icons, the sidebar has two collapsible sections:

Projects: A collapsible list of your projects. Each project shows a name and a badge with the count of associated tasks. Click the + button next to the Projects header to create a new project. Projects support drag-and-drop: drag a task from the task list onto a project name to associate them. Click the expand/collapse arrow to toggle the project list.
Tasks: A searchable, scrollable list of all tasks. Each task displays its title and a colored status dot indicating its current state (green for completed, blue for executing, red for failed, yellow for interrupted, gray for waiting, purple for planning). Use the search field above the task list to filter by title. Click any task to open it in the main content area.

Bottom Fixed Section

Three icons are pinned to the bottom of the sidebar and are always accessible regardless of scroll position:

Settings (gear icon): Opens the multi-tab Settings panel.
Report Bug: Opens a bug report dialog where you can describe issues and submit feedback.
Help / Docs: Opens documentation and help resources.

Sidebar Behavior

The sidebar is collapsible and resizable. Click the collapse button to hide it, or drag the resize handle at the sidebar's right edge to adjust its width. On mobile and narrow screens, the sidebar collapses into a hamburger menu accessible from the top-left corner. Tapping the hamburger icon reveals the full sidebar as a slide-over overlay.

Dashboard Overview

The Dashboard is the home screen that greets you each time you open Aura Workshop. It provides a high-level view of your activity and offers quick pathways to start new tasks.

Greeting

At the top of the Dashboard, a randomized greeting message appears (for example, "Good afternoon" or "Welcome back"). This text rotates on each visit to keep the experience fresh.

Dependencies Banner

If the system detects that required dependencies are missing (such as Docker, Node.js, or Python), a prominent banner appears at the top of the Dashboard with a description of what is missing and a direct link to Settings where you can resolve the issue. This banner disappears once all dependencies are satisfied.
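
A dependency check of this kind can be approximated by probing the PATH for each executable. The sketch below is an assumption about how such a check might look, not the application's actual logic: the executable names docker, node, and python3 and the function missing_dependencies are hypothetical, and the lookup function is injectable so the example can be exercised without those tools installed.

```python
import shutil
from typing import Callable, Dict, List, Optional

# Assumed executable names; the application may probe for different ones.
DEPENDENCY_BINARIES: Dict[str, str] = {
    "Docker": "docker",
    "Node.js": "node",
    "Python": "python3",
}


def missing_dependencies(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the labels of dependencies whose executable is not on PATH."""
    return [
        label
        for label, binary in DEPENDENCY_BINARIES.items()
        if which(binary) is None
    ]


def fake_which(binary: str) -> Optional[str]:
    """Stand-in for shutil.which on a machine without Docker."""
    installed = {"node": "/usr/bin/node", "python3": "/usr/bin/python3"}
    return installed.get(binary)


missing = missing_dependencies(fake_which)
show_banner = bool(missing)  # the banner is shown only while something is missing
```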

Quick Action Cards

Below the greeting, the Dashboard presents a grid of Quick Action Cards organized into eight categories. Each card contains a brief description and a one-click prompt that creates a new task pre-filled with a relevant starting instruction.

Category	Example Prompts
Software Dev	Build a REST API, set up a CI/CD pipeline, create a system design document, analyze a dataset
Marketing	Draft a content calendar, create an email campaign, run a competitive analysis
Finance	Build a financial model, create a P&L analysis, design a budget framework
Legal	Review a contract, draft an NDA, generate a compliance checklist
Research	Conduct a literature review, design an A/B test, analyze survey responses
Design	Create a design system, perform an accessibility audit, write user stories
Operations	Write a runbook, design an onboarding workflow, create an SLA framework
Education	Build a course curriculum, write a tutorial, design a grading rubric

Clicking any card immediately creates a new task and begins execution with the selected prompt.

Recent Tasks

The main body of the Dashboard displays a grid of recent tasks. Each task card shows:

Title: The first line or auto-generated summary of the task prompt.
Status badge: A color-coded label indicating the task's current state:
- EXECUTING (blue): The agent is actively working on the task.
- COMPLETED (green): The task finished successfully.
- FAILED (red): The task encountered an unrecoverable error.
- INTERRUPTED (yellow): The task was stopped by the user or by a crash and can be resumed.
- WAITING (gray): The task is queued and has not yet started execution.
- PLANNING (purple): The task is in planning mode, generating an execution plan before running.
Time ago: A relative timestamp showing when the task was created or last updated (for example, "2 hours ago").
Quick actions: Each task card has action buttons that appear on hover or tap:
- Run / Resume: Re-execute a completed task or resume an interrupted one.
- Delete: Remove the task and all its messages permanently (with confirmation).
- Fork: Create a duplicate of the task with the same context and conversation history for trying a different approach.
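
The six task states and their badge colors form a small lookup that recurs throughout the interface. As a sketch, assuming hypothetical names TaskStatus, STATUS_COLORS, and primary_action, they could be modeled as follows. How the Run / Resume button behaves for states other than completed and interrupted is not specified here, so the sketch returns None for them.

```python
from enum import Enum
from typing import Dict, Optional


class TaskStatus(Enum):
    EXECUTING = "EXECUTING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    INTERRUPTED = "INTERRUPTED"
    WAITING = "WAITING"
    PLANNING = "PLANNING"


# Badge colors as listed for the status badge.
STATUS_COLORS: Dict[TaskStatus, str] = {
    TaskStatus.EXECUTING: "blue",
    TaskStatus.COMPLETED: "green",
    TaskStatus.FAILED: "red",
    TaskStatus.INTERRUPTED: "yellow",
    TaskStatus.WAITING: "gray",
    TaskStatus.PLANNING: "purple",
}


def primary_action(status: TaskStatus) -> Optional[str]:
    """Label for the Run / Resume quick action, where one is documented."""
    if status is TaskStatus.COMPLETED:
        return "Run"      # re-execute a completed task
    if status is TaskStatus.INTERRUPTED:
        return "Resume"   # continue an interrupted task
    return None           # behavior for other states is not documented
```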

Completed Task Notifications

When a task finishes while you are viewing the Dashboard or another screen, a toast notification slides in from the corner. The notification shows the task title and its final status. These notifications auto-dismiss after 8 seconds, or you can click them to navigate directly to the completed task.

Task Composer

At the bottom of the Dashboard (and of every other screen), a persistent task composer bar is always available. This is the single entry point for all interactions with the AI. The composer includes the following controls:

Textarea: The main text input where you type your task description, question, or follow-up message. Press Enter to send or Shift+Enter for a new line.
Files button: Opens a multi-select file picker allowing you to attach documents, images, or other files as context for the task. Supported image formats (JPEG, PNG, GIF, WebP) are sent as vision inputs to multimodal models. Other file types are read as text.
Folder button: Opens a directory picker to mount a project folder. The selected folder becomes the agent's working directory, and file operations are scoped to that path.
Tools button: Opens the MCP tool picker, which displays all available tools grouped by their MCP server. Each tool shows its name and a brief description. Toggle individual tools on or off to control which capabilities the agent can use for this task.
Voice button: Activates the microphone for speech-to-text input. When recording, a pulsing red dot appears on the button. Speak your task description and the audio is transcribed and inserted into the textarea automatically.
Plan/Execute toggle: A clipboard icon that toggles between Plan mode and Execute mode. When Plan mode is active (the icon turns cyan), the agent generates a detailed execution plan before taking any action. When Plan mode is off, the agent proceeds directly to execution.
Role picker: A dropdown that lets you assign a specific role to the task. The picker includes a search field to filter roles, a category filter, and a "Default (no role)" option at the top. Selecting a role applies that role's system prompt and tool permissions to the task.
Project selector: A dropdown listing all your projects. Selecting a project associates the new task with that project and ensures project-scoped memory is available.
Thinking mode toggle: When the selected model supports extended reasoning, this control appears with four levels: Off, Low, Medium, and High. Higher thinking levels cause the model to reason more deeply before responding, producing more thorough but slower output.
Model selector: A dropdown listing all configured and available models, grouped by provider. Select a model to use for the current task. The selected model persists as the default for subsequent tasks.
Send button: A paper plane icon that submits the task. Alternatively, press Enter when the textarea is focused.

Starting a Task

There is no separate "Chat Mode" or "Agent Mode" in Aura Workshop. Everything flows through the single unified input described above. To start a task:

Type your request in the input textarea at the bottom of the screen.
Optionally click Files to attach one or more files for context.
Optionally click Folder to mount a project directory as the agent's working path.
Optionally click Tools to open the MCP tool picker and enable or disable specific tools. Tools are grouped by their MCP server name, and each shows the tool name plus a brief description.
Optionally click Voice to record a spoken prompt. A pulsing red dot indicates active recording.
Optionally toggle Plan/Execute (clipboard icon, cyan when Plan is on) to have the agent create an execution plan before acting.
Optionally select a Role from the role picker. The picker includes search, category filtering, and a "Default (no role)" option.
Optionally set Thinking mode (Off / Low / Medium / High) when the selected model supports it.
Optionally choose a different model from the model selector dropdown.
Press Enter or click the Send button (paper plane icon).

The system automatically classifies your input and routes it to the appropriate execution path. You never need to manually select a mode.

Task Header Bar

When you are inside an active or completed task, a header bar appears at the top of the content area. It contains the following elements:

Back button: An arrow-left icon with the label "Back" that returns you to the Dashboard or task list.
Task title: The title or first-line summary of the task prompt. For long titles, the text is truncated with an ellipsis.
Status badge: A color-coded label showing the current task state:
- COMPLETED (green)
- FAILED (red)
- INTERRUPTED (yellow)
- EXECUTING (blue)
- PLANNING (purple)
- WAITING (gray)
IM platform badge: For tasks that originated from a listener (Slack, Discord, Telegram, WhatsApp, or another platform), a badge appears showing the platform icon, platform name, and channel or conversation name.
Timer: While the task is executing, a running timer displays the elapsed time in minutes and seconds.
Action buttons (visible when the task is not actively running):
- Fork: Creates a copy of the task with the same context and full conversation history. Use this to explore alternative approaches without losing the original.
- Export: Opens the complete task conversation as a styled HTML page in a new window, from which you can save it as a PDF using the browser's print dialog.
- Show/Hide: Toggles the visibility of tool call details within the message thread. When hidden, only the tool name and status are shown without the full input/output.
- Context panel toggle (hamburger icon): Opens or closes the right sidebar Context Panel, which shows workflow progress, parallel agent status, thinking content, and usage statistics.

Message Thread

The main area of the task workspace is a scrollable message thread that displays the entire conversation between you and the agent.

User Messages

Your messages appear as right-aligned bubbles. Any file or folder attachments you included are displayed as chips below the message text. File chips show the file name with a small icon, and folder chips show the directory path. Both types have an X button to remove them from context for subsequent messages.

Assistant Messages

Agent responses appear as left-aligned content blocks. During streaming, text arrives progressively with an activity indicator showing the agent's current operation. Responses are rendered as rich Markdown with syntax-highlighted code blocks, LaTeX math (via KaTeX), Mermaid diagrams, tables, and all standard formatting.

Thinking Blocks

When the model uses extended reasoning (thinking mode), thinking blocks appear as collapsible sections prefixed with a thought-bubble indicator. Click to expand and view the agent's internal reasoning chain. By default, thinking blocks are collapsed to reduce visual noise.

Tool Execution Cards

Each tool invocation produces a collapsible card in the message thread. The card header shows:

The tool name (for example, "bash", "read_file", "web_search").
An icon representing the tool type.
A status indicator: a spinning animation during execution, "OK" (green) on success, or "ERR" (red) on failure.

Expanding the card reveals the full input parameters the agent sent to the tool and the complete output the tool returned. For long outputs, the content is scrollable within the card.

Role Dividers

In multi-agent team tasks, a horizontal divider line appears when execution passes from one role to the next. The divider shows the role name and includes a + Memory button that lets you save the agent's work from that role as a persistent memory entry.

Routing Info Badges

When smart routing is active, small badges appear on messages indicating the routing decision, such as "via claude-sonnet-4-20250514 tier" or "routed: Standard". This helps you understand which model was selected for each turn.

Error Messages

Errors from tool execution, API failures, or provider issues appear as red-text blocks within the thread. Error messages include the error type and a description to help you diagnose and resolve the issue.

Load Earlier Button

For tasks with very long histories that exceed the display window, a "Load Earlier" button appears at the top of the thread. Click it to fetch and display older messages from the database.

Activity Indicator

While a task is executing, a fixed bar appears just above the input area. This activity indicator shows:

A spinner animation confirming that work is in progress.
Live operation text describing exactly what the agent is doing. Examples:
- The name of the tool currently being executed and its progress.
- A preview of the agent's reasoning when generating text.
- Text generation progress (for example, token count or word count).
- For parallel agents: a count of files created, such as "8 agents generating files... 4 file(s) created by agents".

The activity indicator updates in real-time as the agent moves between operations.

Follow-up Suggestions

After a task completes, a collapsible suggestion bar appears below the last message. It features:

A question mark icon with a count badge showing how many suggestions are available.
Clickable suggestion chips that represent contextually relevant follow-up questions or next steps.
Clicking a chip inserts its text into the input textarea, ready for you to modify or send immediately.

The suggestions are generated automatically based on the conversation context and the agent's output.

Response Actions

When a task reaches a terminal state (completed, failed, or interrupted), several response actions become available:

Copy Conversation: Copies the entire conversation thread as formatted text to your clipboard.
Per-code-block Copy: Each code block in the response includes a small Copy button in its top-right corner that copies just that code snippet.
Per-diagram Save as PNG: Mermaid diagrams rendered in the thread include a "Save as PNG" button that exports the diagram as an image file.
Per-table Copy as CSV: Tables rendered in the response include a button that copies the table data as comma-separated values.

Scroll to Bottom Button

When you scroll up in a long message thread, a circular down-arrow button appears in the bottom-right corner. Click it to instantly scroll to the most recent message.

Files Created Button

When the agent creates files during task execution, a "Files Created" button appears in the task header area. It displays a count badge (for example, "(7)") showing how many files were produced. Clicking it opens the FileBrowser overlay, a file explorer view that lists all generated files with options to view, download, or open them.

Right Sidebar / Context Panel

The Context Panel is a collapsible right sidebar that provides detailed information about the current task. Toggle it with the hamburger icon in the task header. The panel can be resized by dragging its left edge.

Workflow Progress

For multi-agent team tasks, the Context Panel shows a "Workflow" header with a progress badge (for example, "3/6"). Below it, a vertical stepper displays each agent or role in the workflow with status dots:

Pending (gray dot): The role has not yet started.
Active (pulsing blue dot): The role is currently executing.
Done (green dot): The role has completed its work.

Roles running on remote deployments show a "Remote" badge next to their name.

Parallel Agents

When three or more agents are running in parallel (via fan-out), the Context Panel shows a dedicated Parallel Agents section. It displays a count such as "5/8 completed" and lists a card for each agent with its index number, current status, and the number of tool calls it has made.

Thinking Content

When the model is using extended reasoning, the Context Panel includes a collapsible Thinking Content section that shows the raw reasoning text. A maximize/minimize toggle lets you expand it to fill the panel for easier reading.

Usage Stats

The bottom of the Context Panel displays real-time usage statistics for the current task:

Metric	Description
Context Usage	A color-coded progress bar showing what percentage of the model's context window has been consumed. The bar is green below 50%, yellow from 50% up to 80%, and red at 80% or above.
Input Tokens	The number of input tokens sent to the model, formatted with K/M suffixes for readability (for example, "12.3K").
Output Tokens	The number of output tokens generated by the model.
Cache Read	The number of tokens served from the provider's prompt cache, if available.
Total Cost	The cumulative dollar cost for this task, calculated using the model pricing table.
Latency	The time-to-first-token latency in milliseconds for the most recent API call.
Model	The name of the model being used for this task.
Provider	The name of the provider serving the model.
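
The display rules in this table are simple enough to sketch. The functions below are illustrative, not the application's code: context_bar_color, format_tokens, and task_cost are hypothetical names, cache pricing is ignored, and the prices passed to task_cost are made-up placeholders rather than entries from the real model pricing table.

```python
def context_bar_color(percent_used: float) -> str:
    """Green below 50%, yellow from 50% up to 80%, red at 80% or above."""
    if percent_used >= 80:
        return "red"
    if percent_used >= 50:
        return "yellow"
    return "green"


def format_tokens(count: int) -> str:
    """Format a token count with K/M suffixes, for example 12300 -> '12.3K'."""
    if count >= 1_000_000:
        return f"{count / 1_000_000:.1f}M"
    if count >= 1_000:
        return f"{count / 1_000:.1f}K"
    return str(count)


def task_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Dollar cost of a task given per-million-token prices (assumed units)."""
    return (
        input_tokens / 1_000_000 * input_price_per_million
        + output_tokens / 1_000_000 * output_price_per_million
    )


# Placeholder prices, for illustration only.
example_cost = task_cost(1_000_000, 500_000, 3.0, 15.0)
```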

Visualizations in the Message Thread

The message thread renders rich content beyond plain text:

Mermaid Diagrams

The agent can output Mermaid code blocks that render as interactive SVG diagrams directly in the chat. Supported diagram types include flowcharts, sequence diagrams, class diagrams, pie charts, timelines, Gantt charts, ER diagrams, state diagrams, and journey maps. Each diagram includes a "Save as PNG" button for export.

Charts

The chart_generate tool produces PNG chart images that are embedded in the message thread. Six chart types are supported: bar (vertical), line, pie, scatter, histogram, and area. All charts use a dark theme that matches the application UI.

Code Blocks

Code blocks in responses are syntax-highlighted using language detection. Each code block has a copy button in its top-right corner. Supported languages include Python, JavaScript, TypeScript, Rust, Go, Java, C++, HTML, CSS, SQL, YAML, JSON, Bash, and many more.

Math

Mathematical expressions are rendered using KaTeX. Inline math uses $...$ delimiters and display math uses $$...$$ blocks.

Tables

Markdown tables are rendered as styled HTML tables with alternating row colors. Each table includes a "Copy as CSV" button for easy export to spreadsheets.
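
For readers who want the same conversion outside the app, the following sketch turns a simple Markdown table into CSV with Python's csv module. The function name markdown_table_to_csv is hypothetical, and the parser is deliberately minimal: it assumes one row per line and does not handle escaped pipe characters inside cells.

```python
import csv
import io
import re


def markdown_table_to_csv(markdown: str) -> str:
    """Convert a simple pipe-delimited Markdown table to CSV text."""
    rows = []
    for line in markdown.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Skip the header separator row, such as |---|:---:|.
        if all(re.fullmatch(r":?-+:?", cell) for cell in cells):
            continue
        rows.append(cells)
    buffer = io.StringIO()
    # The csv module quotes cells that contain commas or quotes.
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


table = """
| Model | Notes |
|-------|-------|
| small | fast, cheap |
"""
csv_text = markdown_table_to_csv(table)
```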

Exporting Conversations

Click Export in the task header bar on any completed task. The conversation opens as a styled HTML page in a new window. Use your browser's Print > Save as PDF to create a shareable document. The exported page includes all text, code blocks, tool call summaries, and images.

Running Multiple Tasks Simultaneously

Multiple tasks can run at the same time, each potentially using a different model. Each task gets its own independent cancel token and event stream. The model that was active when a task was launched is captured for that task's duration, so switching models does not affect already-running tasks. Monitor all running tasks from the Dashboard where their status updates in real-time.
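
The isolation described above can be pictured with a small sketch. It assumes hypothetical names (RunningTask, TaskLauncher) and uses a threading.Event as a stand-in for the cancel token; the application's real types may differ.

```python
import threading
from dataclasses import dataclass, field
from typing import List


@dataclass
class RunningTask:
    title: str
    model: str  # captured once at launch and never re-read afterwards
    cancel_token: threading.Event = field(default_factory=threading.Event)


class TaskLauncher:
    def __init__(self, default_model: str) -> None:
        self.default_model = default_model
        self.tasks: List[RunningTask] = []

    def launch(self, title: str) -> RunningTask:
        # Copy the currently selected model into the task at launch time.
        task = RunningTask(title=title, model=self.default_model)
        self.tasks.append(task)
        return task


launcher = TaskLauncher(default_model="model-a")
first = launcher.launch("Summarize report")
launcher.default_model = "model-b"   # the user switches models
second = launcher.launch("Refactor module")
first.cancel_token.set()             # cancelling one task leaves the other alone
```

Because each task owns its token and its model name, neither a model switch nor a cancellation leaks into tasks that are already running.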

Project Management

Projects let you organize related tasks into groups, scope memory to specific codebases, and keep your workspace tidy.

Creating a Project

Click the + button next to the "Projects" header in the left sidebar. A dialog appears where you enter a project name and optionally select a directory path on disk. The directory path enables project-scoped memory and AURA.md discovery.

Associating Tasks with Projects

Drag and drop any task from the task list onto a project name in the sidebar to associate it. You can also select a project from the project selector in the task composer before creating a new task.

Task Count Badges

Each project in the sidebar displays a badge with the count of tasks associated with it. This count updates in real-time as tasks are created, completed, or deleted.

Project Actions

Rename (pencil icon): Click the pencil icon next to a project name to rename it.
Delete: Right-click a project and choose Delete from its context menu. Deleting a project does not delete the tasks associated with it; they become unassociated.

Multi-Select and Bulk Actions

Hold Ctrl (or Cmd on macOS) and click to select multiple tasks. Checkboxes appear for each task when in multi-select mode. A bulk action bar appears at the top of the task list with options to move selected tasks to a project, delete them, or resume interrupted tasks in batch.

Task Classification & Smart Routing

Every message you send is automatically analyzed by a fast one-shot LLM call that classifies it into one of four categories. This classification happens transparently and requires no manual mode switching.

Classification Categories

Category	What Happens
Single Agent Task	A single agent runs with access to all configured tools. This is the default for most requests that one agent can handle end-to-end.
Team Task	The system routes to a multi-agent team. If a matching team already exists, it is selected automatically. If not, the system creates a new team with appropriate roles on the fly.
Scheduled Task	If the message describes a recurring task (for example, "every Monday at 9am"), the system creates a schedule automatically.
Clarification Needed	The agent asks follow-up questions to gather more information before proceeding.
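
Once the classifier has produced a category, routing amounts to a dispatch on that category. The sketch below only models that dispatch, under the assumption of hypothetical names (Category, route). The classification itself is an LLM call and is not reproduced here.

```python
from enum import Enum
from typing import Optional


class Category(Enum):
    SINGLE_AGENT = "Single Agent Task"
    TEAM = "Team Task"
    SCHEDULED = "Scheduled Task"
    CLARIFICATION = "Clarification Needed"


def route(category: Category, matching_team: Optional[str] = None) -> str:
    """Describe the execution path taken for a classified message."""
    if category is Category.SINGLE_AGENT:
        return "run one agent with all configured tools"
    if category is Category.TEAM:
        if matching_team is not None:
            return f"run existing team: {matching_team}"
        return "create a new team, then run it"
    if category is Category.SCHEDULED:
        return "create a schedule"
    return "ask follow-up questions"


team_path = route(Category.TEAM, matching_team="Software Dev")
new_team_path = route(Category.TEAM)
```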

Smart Routing

When smart routing is enabled (in Settings), the classifier also evaluates the complexity of your request and routes it to the most cost-effective model tier that can handle it. This saves money on simple requests while preserving quality for complex ones. The routing tiers are configurable (see Smart Routing under Models).

Multi-Agent Teams

Teams let you assign complex tasks to a group of specialized agents that work together, each contributing their domain expertise. Teams are configured in Settings, and tasks are routed to teams automatically by the classifier or manually by selecting a team.

Team Configuration

Go to Settings > Teams to manage teams. The team form includes:

Team name: A descriptive label for the team.
Member list: Add roles from the Role Library or create new custom roles. Each member can be added or removed with a single click.
Role assignments: Each member has a specific role with its own system prompt, tool permissions, and optionally a dedicated model.
Workflow type: Choose how the team members collaborate:
- Sequence: Roles execute one after another. Each role receives the output from the previous role as context.
- Parallel: All roles execute simultaneously. Results are merged at the end.
- Fanout/Merge: A source role produces a list of sub-tasks, which are distributed to parallel worker roles, then a final role merges the results.
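
The three workflow types differ only in how data flows between roles. The following sketch models each role as a plain function and is an illustration under that assumption, not the scheduler the application uses. The names run_sequence, run_parallel, and run_fanout_merge are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Role = Callable[[str], str]


def run_sequence(roles: List[Role], task: str) -> str:
    """Each role receives the previous role's output as its context."""
    context = task
    for role in roles:
        context = role(context)
    return context


def run_parallel(roles: List[Role], task: str) -> List[str]:
    """All roles receive the same task; results come back in role order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda role: role(task), roles))


def run_fanout_merge(
    source: Callable[[str], List[str]],
    worker: Role,
    merger: Callable[[List[str]], str],
    task: str,
) -> str:
    """A source role lists sub-tasks, workers run them, a final role merges."""
    subtasks = source(task)
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(worker, subtasks))
    return merger(results)


def shout(text: str) -> str:
    return text.upper()


def exclaim(text: str) -> str:
    return text + "!"


sequence_result = run_sequence([shout, exclaim], "ship it")
parallel_result = run_parallel([shout, exclaim], "ok")
fanout_result = run_fanout_merge(
    source=lambda task: task.split(", "),
    worker=lambda name: f"notes on {name}",
    merger=lambda results: " | ".join(results),
    task="db-one, db-two",
)
```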

Team Task UI

When a team task runs, the message thread shows role dividers between each agent's section. The Context Panel displays the workflow progress stepper showing which role is active. Role handoffs are visible in the thread: each role calls the role_complete tool to pass structured handoff data (summary, files created, key decisions, constraints) to the next role.
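
The handoff passed through role_complete can be thought of as a small structured record. The sketch below models it as a dataclass. The field names mirror the four items listed above, but their exact spelling, the role field, and the as_context rendering are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RoleHandoff:
    role: str
    summary: str
    files_created: List[str] = field(default_factory=list)
    key_decisions: List[str] = field(default_factory=list)
    constraints: List[str] = field(default_factory=list)

    def as_context(self) -> str:
        """Render the handoff as text the next role can read as context."""
        lines = [f"Handoff from {self.role}", f"Summary: {self.summary}"]
        sections = (
            ("Files created", self.files_created),
            ("Key decisions", self.key_decisions),
            ("Constraints", self.constraints),
        )
        for heading, items in sections:
            if items:  # omit empty sections to keep the context short
                lines.append(f"{heading}: " + "; ".join(items))
        return "\n".join(lines)


handoff = RoleHandoff(
    role="Architect",
    summary="Designed a three-service layout",
    files_created=["docs/design.md"],
    key_decisions=["Use SQLite for local state"],
)
context = handoff.as_context()
```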

Built-in Roles

Aura Workshop ships with 20 built-in roles organized into six categories. Each role has a pre-written system prompt and a curated set of tool permissions.

Role Library (20 Built-in Roles)

Category	Roles
Software	Product Manager, Architect, Developer, QA Engineer, DevOps
Content	Research Lead, Writer, Editor
Business	Business Analyst, Marketing Strategist, Sales Copywriter
Data	Data Engineer, Data Analyst, Data Scientist
Design	UX Designer, UI Designer
Operations	Project Coordinator, Technical Writer, Security Auditor, Code Reviewer

Per-Role Model Selection

Each role in a team can be assigned a specific model. This allows you to use a smaller, cheaper model for routine roles (such as initial research) while reserving a more powerful model for critical roles (such as architecture decisions). If no per-role model is set, the team uses the globally selected model.

Custom Roles

Create, edit, duplicate, or delete roles from Settings > Roles & Prompts. Each role has a name, description, system prompt, and a set of allowed tools (toggled via checkboxes). Roles are saved as markdown files with YAML frontmatter in ~/.aura/roles/.
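
A role file of this shape, a YAML frontmatter header followed by the system prompt, can be read with a few lines of code. The example below is a sketch: the frontmatter keys name, description, and tools are hypothetical rather than the application's documented schema, and the parser handles only flat key: value lines, so it is not a general YAML parser.

```python
from typing import Dict, Tuple

# Hypothetical role file; the real frontmatter keys may differ.
ROLE_FILE = """---
name: Release Notes Writer
description: Drafts release notes from merged changes
tools: read_file, grep, write_file
---
You write concise release notes grouped by feature area.
"""


def parse_role_file(text: str) -> Tuple[Dict[str, str], str]:
    """Split a role file into frontmatter metadata and the prompt body."""
    if not text.startswith("---\n"):
        return {}, text.strip()
    header, _, body = text[4:].partition("\n---\n")
    meta: Dict[str, str] = {}
    for line in header.splitlines():
        key, separator, value = line.partition(":")
        if separator:
            meta[key.strip()] = value.strip()
    return meta, body.strip()


meta, system_prompt = parse_role_file(ROLE_FILE)
allowed_tools = [tool.strip() for tool in meta["tools"].split(",")]
```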

Fan-Out (Parallel Agents)

When the classifier detects that a task contains independent sub-tasks, it uses fan-out to run multiple agents in parallel. For example, "Compare the top 5 vector databases" spawns five parallel research agents, one per database, then merges their results into a unified comparison.

Team Type	Source Role Produces	Fan-Out Role Does
Software Dev	Architect lists implementation tasks	One developer per task
Content Writing	Research Lead lists sections	One writer per section
Research	Lead lists research questions	One researcher per question
Translation	Manager lists target languages	One translator per language

Parallel results are merged using configurable strategies: CopyOnWrite (safe, no data loss), LastWins (last agent's version prevails on conflict), or LLMResolve (an LLM intelligently combines conflicting changes).
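
The fan-out and merge steps can be sketched in a few lines. This is an illustration only: `research` is a hypothetical stand-in for one worker agent, and `merge_last_wins` shows the LastWins rule as described above (a later result overwrites an earlier one on a key conflict), not Aura Workshop's internal implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def research(database: str) -> dict[str, str]:
    """Hypothetical worker: stands in for one parallel research agent."""
    return {
        database: f"notes on {database}",
        "summary": f"written by the {database} agent",  # every worker writes this key
    }


def merge_last_wins(results: list[dict[str, str]]) -> dict[str, str]:
    """LastWins: on a key conflict, the later result replaces the earlier one."""
    merged: dict[str, str] = {}
    for result in results:
        merged.update(result)
    return merged


subtasks = ["Qdrant", "Milvus", "Weaviate"]
with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
    # map() yields results in sub-task order, regardless of completion order.
    results = list(pool.map(research, subtasks))

merged = merge_last_wins(results)
```

Because every worker writes the shared `summary` key, only the last worker's version survives, which is the trade-off LastWins makes compared with the other two strategies.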

Agent Tools Reference

Every agent has access to a rich set of built-in tools. The available tools depend on role configuration, execution mode, and any connected MCP servers. The full tool registry is defined in tools/mod.rs.

Core Tools (Always Available)

Tool	Description
`read_file`	Read the contents of a file at a given path. Returns the full text of the file. Supports all text-based formats and can handle binary files by returning base64-encoded content.
`write_file`	Create a new file or overwrite an existing file with the specified content. Automatically creates parent directories if they do not exist.
`edit_file`	Make targeted edits to an existing file using find-and-replace operations. Supports multiple replacements in a single call, making it efficient for surgical modifications without rewriting entire files.
`bash`	Execute shell commands on the host operating system. Uses `sh -c` on macOS and Linux or `cmd /C` on Windows. Supports setting the working directory, timeout, and environment variables. Subject to safety guardrails that block destructive commands.
`glob`	Find files matching a glob pattern such as `**/*.ts` or `src/**/*.py`. Returns a list of matching file paths, useful for discovering project structure.
`grep`	Search file contents using regular expressions. Returns matching lines with file paths and line numbers, similar to the `grep` command but with structured output for agent consumption.
`list_dir`	List the contents of a directory, returning file names, sizes, types (file or directory), and modification times.
`web_fetch`	Fetch content from a URL via HTTP GET. Returns the response body as text. Useful for reading web pages, downloading data, and interacting with web APIs.
`web_search`	Search the web using the configured search provider (DuckDuckGo by default, or Google, Brave, Serper, Bing). Returns text results and image URLs.
`email_send`	Send an email using the configured email method (system default or SMTP). Supports HTML content, subject line, recipients (to, cc, bcc), and attachments.
`chart_generate`	Generate data visualizations as PNG images. Supports bar, line, pie, scatter, histogram, and area chart types with a dark theme. Ideal for creating reports and visual summaries.
`generate_image`	Generate images using AI image generation APIs. The agent provides a text prompt and receives a generated image.
`generate_video`	Generate video content using AI video generation APIs. Supports various durations and styles depending on the configured provider.
`generate_music`	Generate audio and music using AI music generation APIs. Supports various genres, moods, and durations.
`run_chain`	Execute a chain script, which is a predefined sequence of tool calls defined in a skill file. This meta-tool enables complex multi-step workflows to be packaged as reusable recipes.

Docker Tools

Tool	Description
`docker_run`	Run a command inside a Docker container. Supports image selection, volume mounts, port mapping, and environment variables. Useful for running tasks in isolated environments.
`docker_list`	List all running Docker containers on the host, showing container ID, image, status, and port mappings.
`docker_images`	List all Docker images available on the host, showing repository, tag, image ID, and size.

Platform Tools

Tool	Description
`system_notify`	Send a native desktop notification with a title and message body. Works on macOS, Windows, and Linux.
`screen_capture`	Capture a screenshot of the current display. On macOS, uses the built-in screencapture utility. Returns the image as a file path or base64 data.
`camera_capture`	Capture an image from the device's webcam. Currently supported on macOS. Returns the captured image for analysis or inclusion in responses.
`create_listener`	Programmatically create a new event listener for any of the 31 supported messaging platforms. The agent can set up real-time monitoring without manual configuration.
`create_schedule`	Programmatically create a new scheduled task with a specified time, frequency, and prompt.
`create_webhook`	Programmatically create a new webhook endpoint that triggers agent tasks on incoming HTTP requests.
`send_slack`	Send a message to a Slack channel or user. Requires a Slack bot token.
`send_email`	Send an email message through the platform-level messaging integration. This tool is distinct from `email_send`, which sends mail using the configured email method (system default or SMTP).
`ocr`	Extract text from images using optical character recognition. Combines a vision model with text extraction for high-accuracy results on documents, screenshots, and photos.
`merge_pdfs`	Merge multiple PDF files into a single document. Accepts a list of file paths and produces a combined PDF.
`security_scan`	Run a security vulnerability scan on code or configurations. Reports potential issues with severity levels and remediation suggestions.
`role_complete`	Signal that the current role has finished its work in a multi-agent team workflow. Includes structured handoff data: summary, files created, key decisions, and constraints for the next role. This tool is only available within team workflow contexts.

MCP Tools

Any tools provided by connected MCP servers are automatically available to agents. MCP tools are named using the pattern mcp_{server_id}_{tool_name}, making them easy to identify. MCP tool results are automatically truncated to 8000 characters to prevent context overflow. MCP tools appear in the Tools picker in the task composer, grouped by their server name.
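
The naming pattern and the truncation limit can be expressed as two small helpers. This is a minimal sketch: the function names and the `playwright` server ID in the test are hypothetical, and whether Aura Workshop appends a truncation marker is not stated, so none is added here.

```python
MAX_RESULT_CHARS = 8000  # truncation limit stated above


def mcp_tool_name(server_id: str, tool_name: str) -> str:
    """Build the agent-facing name using the mcp_{server_id}_{tool_name} pattern."""
    return f"mcp_{server_id}_{tool_name}"


def truncate_result(text: str, limit: int = MAX_RESULT_CHARS) -> str:
    """Cut a tool result down to at most `limit` characters."""
    return text if len(text) <= limit else text[:limit]
```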

Command Safety

The agent middleware includes non-bypassable safety guardrails that block dangerous shell commands. These guardrails are always active regardless of settings.

Blocked Patterns

Recursive deletion of the root filesystem: rm -rf /
Fork bombs: :(){ :|:& };:
Disk formatting: mkfs, dd targeting block devices
Other destructive patterns targeting system-critical paths or operations
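
A pattern check of this kind might look like the sketch below. The regular expressions are illustrative assumptions that cover only the examples listed above; they are not Aura Workshop's actual rule set, and a production guardrail would need to handle many more variants.

```python
import re

# Illustrative patterns only; not the product's real rule set.
BLOCKED_REGEXES = [
    re.compile(r"\brm\s+-(?:rf|fr)\s+/(?:\s|$)"),  # rm -rf / (root itself, not subpaths)
    re.compile(r"\bmkfs(?:\.\w+)?\b"),             # mkfs, mkfs.ext4, ...
    re.compile(r"\bdd\b.*\bof=/dev/"),             # dd writing to a device node
]
FORK_BOMB = ":(){:|:&};:"  # compared after stripping all whitespace


def is_blocked(command: str) -> bool:
    """Return True when the command matches one of the blocked patterns."""
    compact = "".join(command.split())
    if FORK_BOMB in compact:
        return True
    return any(regex.search(command) for regex in BLOCKED_REGEXES)
```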

Command Timeout

Shell commands have a default timeout of 60 seconds, with a maximum configurable timeout of 300 seconds. Commands exceeding the timeout are terminated automatically.
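
The timeout rules above amount to a clamp followed by a bounded subprocess call. The sketch below assumes a lower bound of one second and a plain-text message on timeout; both are illustrative choices, not documented behavior.

```python
import subprocess
import sys
from typing import Optional

DEFAULT_TIMEOUT_S = 60   # default stated above
MAX_TIMEOUT_S = 300      # maximum configurable value stated above


def effective_timeout(requested: Optional[float]) -> float:
    """Apply the default and cap the requested timeout at the maximum."""
    if requested is None:
        return DEFAULT_TIMEOUT_S
    return max(1, min(requested, MAX_TIMEOUT_S))  # the 1 s floor is an assumption


def run_with_timeout(argv: list[str], requested: Optional[float] = None) -> str:
    """Run a command; subprocess.run kills the child when the timeout expires."""
    try:
        done = subprocess.run(
            argv, capture_output=True, text=True,
            timeout=effective_timeout(requested),
        )
    except subprocess.TimeoutExpired:
        return "terminated: timeout exceeded"
    return done.stdout
```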

Elevated Bash

The elevatedBash setting in General Settings allows agent commands to use sudo. This does not bypass the safety guardrails; dangerous patterns are still blocked even with elevated privileges.

Concurrency

The max_concurrent_tools setting (default: 4) controls how many tools can execute simultaneously within a single task. This prevents resource exhaustion when the agent attempts many parallel tool calls.
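
A limit like this is commonly enforced with a semaphore. The sketch below is not the product's scheduler; it runs dummy tool calls under an `asyncio.Semaphore` and reports the peak number that were in flight at once, so the effect of the limit can be observed directly.

```python
import asyncio

MAX_CONCURRENT_TOOLS = 4  # default stated above


async def run_tools(n_calls: int, limit: int = MAX_CONCURRENT_TOOLS) -> int:
    """Run n_calls dummy tool calls and return the peak number running at once."""
    semaphore = asyncio.Semaphore(limit)
    running = 0
    peak = 0

    async def call_tool() -> None:
        nonlocal running, peak
        async with semaphore:            # waits here once `limit` calls are active
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)    # stands in for real tool work
            running -= 1

    await asyncio.gather(*(call_tool() for _ in range(n_calls)))
    return peak


peak = asyncio.run(run_tools(10))
```

With ten queued calls and the default limit, `peak` never exceeds four; the remaining calls start as earlier ones finish.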

Skills Overview

Skills are specialized instruction sets and automation recipes that extend the agent's capabilities. Aura Workshop ships with a comprehensive skills library organized into multiple categories. Skills are managed in Settings > Skills, where you can view the list of installed skills (each showing name, category, and a prompt preview), edit existing skills, delete skills, or create new ones with the New Skill button.

Document Skills

Core document generation skills are bundled with every installation. These enable the agent to create professional documents programmatically:

Skill	Description
`pdf`	Create PDF reports with headers, paragraphs, tables, charts, images, and custom styling.
`docx`	Create Microsoft Word documents using python-docx with full formatting, styles, headers, footers, and tables.
`xlsx`	Create Excel spreadsheets using openpyxl with formulas, charts, conditional formatting, and multiple sheets.
`pptx`	Create PowerPoint presentations using python-pptx with slide layouts, charts, images, and transitions.

Anthropic Skills (16)

Advanced skills covering design, development, content creation, and tooling:

Skill	Description
`pdf`	Advanced PDF generation with complex layouts
`docx`	Advanced Word document workflows
`xlsx`	Advanced spreadsheet operations
`pptx`	Advanced presentation creation
`algorithmic-art`	Generate algorithmic and generative art using code
`brand-guidelines`	Create comprehensive brand guideline documents with color palettes, typography, and usage rules
`canvas-design`	Design interactive canvas-based visualizations
`doc-coauthoring`	Collaborative document co-authoring workflows
`frontend-design`	Design and build frontend interfaces with modern frameworks
`internal-comms`	Draft internal communications, memos, and announcements
`mcp-builder`	Build custom MCP servers from scratch
`skill-creator`	Create new skills with proper structure and metadata
`slack-gif-creator`	Create animated GIFs optimized for Slack
`theme-factory`	Design and generate UI themes with consistent design tokens
`web-artifacts-builder`	Build interactive web artifacts (mini-apps, widgets, demos)
`webapp-testing`	Test web applications end-to-end with automated test suites

Superpowers (14)

Meta-skills that enhance how the agent approaches complex tasks. These do not perform actions themselves but guide the agent's reasoning and workflow strategy:

Skill	Description
`brainstorming`	Structured brainstorming and ideation frameworks
`dispatching-parallel-agents`	Coordinate multiple agents working in parallel on independent sub-tasks
`executing-plans`	Execute multi-step plans methodically, tracking progress and adapting to issues
`finishing-a-development-branch`	Complete and polish a feature branch: tests, linting, commit messages, PR preparation
`receiving-code-review`	Process and apply code review feedback systematically
`requesting-code-review`	Prepare and submit code for review with clear context and description
`subagent-driven-development`	Break complex tasks into sub-agent work units for parallel execution
`systematic-debugging`	Methodical debugging with hypothesis generation, testing, and root cause analysis
`test-driven-development`	Write tests first, then implement code to pass those tests
`using-git-worktrees`	Manage parallel git worktrees for concurrent feature development
`using-superpowers`	Meta-skill for combining multiple superpowers in a single workflow
`verification-before-completion`	Verify all work meets requirements before marking a task as complete
`writing-plans`	Create detailed, structured execution plans before beginning work
`writing-skills`	Author new skill definitions with proper structure and documentation

Desktop App Skills (20)

Using the Accessibility API on macOS, the agent can interact with native desktop applications directly. Each application has a dedicated skill covering its specific UI elements, menus, and workflows:

Application	Capabilities
Excel	Create/edit spreadsheets, formulas, formatting, charts
Word	Create/edit documents, styles, tables, images
PowerPoint	Create/edit presentations, slides, animations
Chrome	Navigate pages, interact with web content, capture screenshots
Finder	Navigate folders, manage files, organize documents
Outlook	Compose/read emails, manage calendar, contacts
Numbers	Create/edit Apple Numbers spreadsheets
Pages	Create/edit Apple Pages documents
Keynote	Create/edit Apple Keynote presentations
Slack	Send messages, navigate channels, manage workspace
Teams	Send messages, join meetings, manage teams
Zoom	Start/join meetings, manage settings
Mail	Compose/read emails in Apple Mail
Notion	Create/edit pages, databases, and blocks
Terminal	Execute commands, manage terminal sessions
VS Code	Edit files, navigate projects, run tasks
Figma	Create/edit designs, manage components
Acrobat	View/edit PDFs, annotations, form filling
Salesforce	Navigate records, create/edit objects, run reports
SAP	Navigate transactions, enter data, run reports

Design Skills (5)

Skill	Description
`taste-skill`	Evaluate and refine visual design quality and taste
`impeccable`	Produce pixel-perfect, polished design output
`ui-ux-pro-max`	Advanced UI/UX design guidance and best practices
`design-audit`	Audit designs for consistency, accessibility, and best practices
`typography`	Typography selection, pairing, and hierarchy guidance

Platform Skills

Skill	Description
`credential-store`	Manage encrypted credentials programmatically within agent workflows
`document-analyzer`	Analyze documents and extract structured information
`browser-automation`	Automate browser interactions via Playwright for testing and scraping
`browser-act`	Direct browser action control for real-time web interaction
`browser-act-skill-forge`	Create new browser automation skills from recorded actions
`prd`	Generate comprehensive product requirements documents
`orchestration`	Multi-agent orchestration patterns and coordination strategies
`github`	GitHub repository operations: PRs, issues, reviews, workflows
`weasyprint`	Generate high-fidelity PDFs from HTML using the WeasyPrint engine
`excalidraw`	Create Excalidraw diagrams and sketches programmatically
`pandoc`	Document format conversion between Markdown, HTML, DOCX, PDF, and more

Media and Research Skills (Printing Press Collection)

Over 100 API tools for media generation, research, and data enrichment are available through the Printing Press skill collection. These cover a wide range of capabilities:

Category	Examples
Image Generation	DALL-E, Stable Diffusion, Midjourney-compatible APIs, FLUX
Video Creation	Video generation from text prompts, video editing APIs
Music Synthesis	AI music generation, sound effects, audio processing
Web Scraping	Structured web data extraction, site crawling
Data Analysis	Statistical analysis tools, data transformation, visualization
Social Media	Twitter/X, LinkedIn, Instagram APIs for posting and analytics
Financial Data	Stock prices, market data, financial news, crypto markets
Weather	Current weather, forecasts, historical weather data
Translation	Multi-language translation APIs
Communication	SMS, push notifications, messaging APIs

Role Skills

Role skills are specialized instruction sets that can be attached to specific roles. They extend a role's capabilities with domain-specific knowledge and behavioral guidelines. You can manage them via Settings > Skills, the REST API (/api/role-skills), or the agent's own tools (platform_create_skill, platform_list_skills). Each role skill contains:

A name and category for organization
A detailed prompt that provides domain expertise
Optional chain scripts for automated multi-step workflows
Metadata for search and discovery

Creating Custom Skills

You can create new skills in two ways:

Through the UI: Go to Settings > Skills, click "New Skill", fill in the name, category, description, and prompt template.
Through the skill-creator skill: Ask the agent to "create a skill for [your use case]" and it will use the skill-creator meta-skill to generate a properly structured skill definition.

Skills can include chain scripts, which are sequences of tool calls that run in order. The run_chain tool executes these scripts, enabling complex multi-step workflows to be packaged as reusable one-click recipes.

MCP Servers

Model Context Protocol (MCP) servers extend the agent's tool capabilities by connecting to external services and data sources.

Transport Types

Transport	Description
stdio	Launches a local process and communicates via stdin/stdout. Used for most MCP servers that run as command-line programs (Node.js, Python, etc.).
HTTP	Connects to a remote MCP server via HTTP with Server-Sent Events for streaming. Used for hosted or cloud-based MCP services.

Configuration (Settings > MCPs)

Go to Settings > MCPs.
Click Add MCP Server.
Enter a descriptive name for the server.
Select the transport type (stdio or HTTP).
For stdio: enter the command to launch the server (for example, npx or python) and its arguments. You can also set environment variables that will be passed to the child process.
For HTTP: enter the server URL. Optionally configure OAuth credentials (client_id and client_secret) or custom headers (Bearer token, API key, or other authentication headers).
Select the isolation mode:
- shared: A single connection is used across all tasks. This is efficient but means state persists between tasks.
- per_task: Each task gets its own MCP server instance, providing complete state isolation between tasks.
Click Connect to establish the connection. A status indicator shows whether the server is connected (green) or disconnected (red).

Tool Definition Caching

When an MCP server connects, Aura Workshop caches its tool definitions so that they appear instantly in the tool picker without re-querying the server on each task.

Import from JSON

You can import MCP server configurations from a JSON file. This is useful for sharing configurations across team members or machines. The import format follows the standard MCP configuration schema.

Auto-Seeded MCP Servers

Aura Workshop automatically configures certain MCP servers based on your installed tools and API keys:

Server	Trigger	Description
Playwright Browser	Always available	Browser automation: navigate, click, type, screenshot, extract elements
Z.AI Vision	z.ai API key configured	8 vision tools for image analysis, OCR, and visual understanding
Z.AI Zread	z.ai API key configured	Read and extract content from GitHub repos and documentation
MiniMax Coding Plan	MiniMax API key configured	Web search and image understanding for structured coding plans

Schedules

Schedules let you define tasks that run automatically at specified times. Access the Schedules tab from the Schedules icon in the left sidebar navigation.

Schedule List

The list view shows all configured schedules. Each row displays:

Title of the schedule
A preview of the prompt text
Schedule type (daily, weekly, cron, or once)
Next run time
An enabled/disabled toggle switch
Edit and Delete action buttons

Create / Edit Schedule Form

Title: A descriptive name for the schedule.
Prompt textarea: The task description that will be sent to the agent on each execution.
Schedule type selector:
- Once: Runs a single time at the specified date and time.
- Daily: Runs every day at the specified time (HH:MM picker).
- Weekly: Runs on selected days of the week at the specified time. A day selector lets you check one or more days (Mon through Sun).
- Cron: Runs on a custom cron expression for advanced scheduling.
Time (HH:MM): The time of day to execute, in 24-hour format.
Date (for "once" type): A date picker for one-time schedules.
Duration type:
- once: Execute a single time and then deactivate.
- repeat_until: Repeat until a specified end date.
- forever: Repeat indefinitely until manually disabled.
Project path: Optionally associate the schedule with a project directory.
Model selector: Choose a specific model for scheduled task execution.
Role selector: Assign a role to the scheduled task.
Target agent selector: Route the scheduled task to a specific remote agent deployment.
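
To make the Daily and Weekly types concrete, the sketch below computes the next run time from a starting moment. It is an illustration under stated assumptions: the function names are hypothetical, weekdays use Python's numbering (Monday is 0), and time zones and daylight saving changes are ignored.

```python
from datetime import datetime, timedelta


def next_daily(now: datetime, hhmm: str) -> datetime:
    """Next occurrence of HH:MM (24-hour format) strictly after `now`."""
    hour, minute = map(int, hhmm.split(":"))
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    return candidate if candidate > now else candidate + timedelta(days=1)


def next_weekly(now: datetime, hhmm: str, weekdays: set[int]) -> datetime:
    """Next occurrence of HH:MM on one of the selected weekdays (Mon=0 .. Sun=6)."""
    if not weekdays or not weekdays <= set(range(7)):
        raise ValueError("weekdays must be a non-empty subset of 0..6")
    candidate = next_daily(now, hhmm)
    while candidate.weekday() not in weekdays:
        candidate += timedelta(days=1)
    return candidate


monday_morning = datetime(2025, 1, 6, 10, 0)  # 6 January 2025 was a Monday
tomorrow = next_daily(monday_morning, "09:30")
friday = next_weekly(monday_morning, "09:30", {0, 4})  # Mondays and Fridays
```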

Listeners (31 Platforms)

Listeners monitor external messaging platforms and trigger agent tasks based on incoming messages and events. Access the Listeners tab from the Listeners icon in the left sidebar.

Listener List

Each row in the list shows:

Platform icon and name
Workspace or channel name
Status indicator (online with green dot, offline with gray dot)
Last received message timestamp
Enabled/disabled toggle
Action buttons: View Logs, Edit, Delete

Supported Platforms (31)

Category	Platforms
Messaging	WhatsApp, Telegram, Discord, Slack, Signal, Matrix, IRC, XMPP, Microsoft Teams, LINE, Facebook Messenger, WeChat, iMessage (macOS), Google Chat, Feishu/Lark
Email	Email (IMAP/SMTP), Gmail (API)
Social	Twitter/X, Mastodon, Bluesky, Reddit, Twitch, Nostr, Zalo
Collaboration	Zulip, Rocket.Chat, Mattermost, Nextcloud Talk, Synology Chat
Built-in	Webchat, Chatbot widget

Create / Edit Listener Form

Platform selector: Choose from the 31 supported platforms.
Token input: Enter the authentication token or credentials for the selected platform.
Trigger type:
- Mentions: Only respond when the bot is mentioned by name or tag.
- Keywords: Only respond to messages containing specific keywords.
- All: Respond to every incoming message.
Keywords: A comma-separated list of trigger keywords (when using keyword trigger type).
Custom prompt: Define what the agent should do when a message matches the trigger rules.
Model selector: Choose a specific model for listener-triggered tasks.

Listener Detail View

Clicking "View Logs" on a listener opens the detail view, which shows:

An event log table with columns: timestamp, incoming message, sender/user, agent response
Memory association settings for the listener
Connection status and diagnostics

Rules Engine

Each listener has a configurable rules engine that filters incoming messages:

Sender blocklists: Ignore messages from specific senders or bots.
Channel blocklists: Ignore messages from specific channels.
Keyword triggers: Only process messages containing specific words or phrases.
Mention requirements: Only process messages that mention the bot.
Allowlists: Only process messages from specific approved senders.
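
The filters above can be read as a chain of checks that a message must pass. The sketch below is one plausible combination, assuming that blocklists take precedence, that an empty allowlist or keyword list means "no restriction", and that all active rules must pass; the actual evaluation order is not documented here, and the class and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Rules:
    blocked_senders: set[str] = field(default_factory=set)
    blocked_channels: set[str] = field(default_factory=set)
    allowed_senders: set[str] = field(default_factory=set)  # empty: allow anyone
    keywords: list[str] = field(default_factory=list)       # empty: no keyword filter
    require_mention: bool = False


def should_process(rules: Rules, sender: str, channel: str,
                   text: str, mentions_bot: bool) -> bool:
    """Return True when a message passes every active rule."""
    if sender in rules.blocked_senders or channel in rules.blocked_channels:
        return False
    if rules.allowed_senders and sender not in rules.allowed_senders:
        return False
    if rules.require_mention and not mentions_bot:
        return False
    if rules.keywords:
        lowered = text.lower()
        return any(keyword.lower() in lowered for keyword in rules.keywords)
    return True


rules = Rules(blocked_senders={"spam-bot"}, keywords=["deploy", "rollback"])
```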

Webhooks

Webhooks let you receive HTTP requests from external services and trigger agent tasks. Access the Webhooks tab from the Webhooks icon in the sidebar.

Webhook List

Each row shows:

Webhook name
Auto-generated endpoint URL (with a copy button)
Last received timestamp
Enabled/disabled toggle
Action buttons: View Logs, Edit, Delete

Create / Edit Webhook Form

Name: A descriptive name for the webhook.
Custom prompt: A prompt template that defines what the agent should do when the webhook is triggered. Use {{payload}} to insert the request body.
HMAC-SHA256 secret: An optional shared secret for request validation. A visibility toggle lets you show/hide the secret value.
Model selector: Choose a specific model for webhook-triggered tasks.
Role selector: Assign a role for the triggered task.
Target agent: Route to a specific remote deployment.
Project path: Associate with a project directory.
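
The sketch below shows how an HMAC-SHA256 signature over the request body can be produced and checked, and how `{{payload}}` substitution might work. The hex encoding of the signature and the function names are assumptions; how the signature is transmitted (for example, which header carries it) is not specified in this section.

```python
import hashlib
import hmac


def sign(secret: str, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw request body (encoding is an assumption)."""
    return hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()


def verify(secret: str, body: bytes, signature_hex: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    expected = sign(secret, body).encode()
    return hmac.compare_digest(expected, signature_hex.encode())


def render_prompt(template: str, body: bytes) -> str:
    """Insert the request body wherever the template contains {{payload}}."""
    return template.replace("{{payload}}", body.decode("utf-8", errors="replace"))


body = b'{"event": "push", "branch": "main"}'
signature = sign("s3cret", body)
prompt = render_prompt("Summarize this event: {{payload}}", body)
```

Signing the raw bytes, rather than a re-serialized JSON object, avoids mismatches caused by whitespace or key ordering.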

Webhook Detail View

Shows the auto-generated webhook URL with a copy button, a table of recent invocation logs (timestamp, request body, response status), and a cURL example for testing.

Slash Commands

Custom slash commands let you define shortcut commands that trigger specific agent behaviors.

Slash Command List

Each row shows the command name (with / prefix), description, handler type, enabled/disabled toggle, and Edit/Delete buttons.

Create / Edit Slash Command Form

Command name: The name of the command (without the / prefix).
Description: A brief description shown when listing available commands.
Prompt template: The prompt that is sent to the agent when the command is invoked.
Handler type: How the command is executed (agent_task, webhook, etc.).

Workflows (DAG-Based Automation)

Workflows provide a visual automation system based on directed acyclic graphs (DAGs) for building complex multi-step pipelines. They are available in the Business and Enterprise tiers.

Visual Workflow Editor

The workflow editor is a canvas-based interface where you create and connect nodes to define execution flow. You can:

Add nodes by clicking the "Add Node" button and selecting a node type.
Connect nodes by drawing edges between ports.
Configure each node by clicking it to open the node configuration sidebar.
Delete nodes and edges with the Delete key or right-click context menu.
Pan and zoom the canvas for large workflows.

11 Node Types

Node Type	Description
agent_task	Run a single agent with a prompt and full tool access. Configure the prompt, model, role, and project path.
conditional	Branch execution based on conditions evaluated against data from previous nodes. Supports equality, comparison, and pattern matching.
delay	Pause execution for a specified duration (seconds, minutes, hours).
fan_out	Split work across multiple parallel agents. Define the split criteria and the number of parallel branches.
merge	Combine results from parallel branches. Configurable merge strategies: CopyOnWrite, LastWins, LLMResolve.
human_loop	Pause execution and wait for human approval. Displays a prompt and approve/reject buttons in the UI.
script	Execute a shell script or command. Useful for build steps, deployments, or data transformations.
team	Run a multi-agent team as a single workflow step. Select the team and provide the task prompt.
transform	Transform data between nodes using expressions. Map, filter, or reshape data flowing through the workflow.
validate	Validate data against a schema or set of rules before allowing execution to continue.
webhook	Send or receive an HTTP request as part of the workflow. Useful for triggering external services.

Retry Policies

Each node can be configured with a retry policy that defines what happens when it fails:

Exponential backoff: Retries with exponentially increasing delays.
Linear backoff: Retries with a fixed delay between attempts.
Static: Retries immediately with no delay.
Maximum retry count is configurable per node.
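
Taking the three definitions above literally, the delay before each retry can be computed as follows. The policy identifiers, the one-second base delay, and the doubling factor are assumptions for illustration; the actual values are configured per node.

```python
def retry_delays(policy: str, max_retries: int, base_delay_s: float = 1.0) -> list[float]:
    """Return the wait (in seconds) before each retry attempt."""
    if policy == "exponential":
        # Delay doubles on every attempt: base, 2x base, 4x base, ...
        return [base_delay_s * 2 ** attempt for attempt in range(max_retries)]
    if policy == "linear":
        # As described above: the same fixed delay between attempts.
        return [base_delay_s] * max_retries
    if policy == "static":
        # As described above: retry immediately.
        return [0.0] * max_retries
    raise ValueError(f"unknown retry policy: {policy}")
```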

Routing Ports

Nodes have routing ports that determine execution flow:

pass / approve (success port): Execution continues on the success path.
fail / reject (failure port): Execution routes to the failure handler path.

Workflow Features

Approval gates: Human-in-the-loop approval via the human_loop node. The workflow pauses and waits until a human approves or rejects.
Data passing: Nodes pass structured data to downstream nodes through the workflow context. Each node can read outputs from any predecessor.
Auto-parallel detection: The workflow engine uses an LLM to automatically detect independent branches and runs them in parallel.
Crash recovery: Workflow runs are checkpointed. If the application crashes mid-workflow, execution resumes from the last completed step.
Pause/resume: Running workflows can be paused and resumed at any time.
Import/Export: Workflows can be exported as JSON and imported into other Aura Workshop instances.
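
For intuition about which steps can run side by side, independent branches of a DAG can also be found deterministically. The sketch below is not the engine's LLM-based detection; it is a plain topological layering (Kahn's algorithm) in which nodes in the same layer have no dependency on one another. The node names are hypothetical.

```python
def parallel_layers(edges: dict[str, list[str]]) -> list[list[str]]:
    """Group DAG nodes into layers of mutually independent nodes.

    `edges` maps each node to the nodes that depend on it.
    """
    nodes = set(edges) | {n for targets in edges.values() for n in targets}
    indegree = {node: 0 for node in nodes}
    for targets in edges.values():
        for node in targets:
            indegree[node] += 1

    layers: list[list[str]] = []
    ready = sorted(node for node in nodes if indegree[node] == 0)
    while ready:
        layers.append(ready)
        next_ready = []
        for node in ready:
            for dependent in edges.get(node, []):
                indegree[dependent] -= 1
                if indegree[dependent] == 0:
                    next_ready.append(dependent)
        ready = sorted(next_ready)

    if sum(len(layer) for layer in layers) != len(nodes):
        raise ValueError("graph contains a cycle")
    return layers


workflow = {
    "plan": ["build_api", "build_ui"],
    "build_api": ["test"],
    "build_ui": ["test"],
    "test": [],
}
layers = parallel_layers(workflow)
```

Here `build_api` and `build_ui` land in the same layer, so they are candidates for parallel execution, while `test` waits for both.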

Workflow Execution Tracking

Each workflow run is tracked with per-step execution status. The workflow run record includes:

Run ID, workflow ID, and start time
Overall status: running, completed, failed, paused, or waiting_approval
Per-node status: pending, running, completed, failed, or skipped
Data context: the accumulated data from all completed nodes
Error details for any failed nodes

View workflow runs from Settings > Workflows or via the /api/workflow/runs/{run_id} endpoint.

Workflow Best Practices

Start simple: begin with linear workflows (agent_task nodes in sequence) before adding branching and parallelism.
Use human_loop nodes for critical decisions that require human judgment.
Set appropriate retry policies: exponential backoff for API calls, static for script execution.
Use transform nodes to reshape data between steps rather than asking agents to do data formatting.
Test workflows with small inputs before scaling up.
Export working workflows as JSON backups before making changes.

Cloud Providers (Direct API)

Aura Workshop connects directly to the following provider APIs. Each provider is shown as an expandable card on the Models page.

Provider	API Base URL	Models
Anthropic	`api.anthropic.com`	Claude family (Opus, Sonnet, Haiku) for advanced reasoning, analysis, and code
OpenAI	`api.openai.com`	GPT series (GPT-4o, o3, o4-mini) for versatile general-purpose tasks
Google	`generativelanguage.googleapis.com`	Gemini multimodal models with large context windows
MiniMax	`api.minimax.io`	MiniMax models for text, voice, and video generation

Provider Authentication Details

Provider	Auth Type	Header	Free Tier
Anthropic	API key header	`x-api-key: sk-ant-...`	No
OpenAI	Bearer token	`Authorization: Bearer sk-...`	No
Google	Query parameter	`?key=AIza...`	Yes (Flash, Flash-Lite)
MiniMax	Bearer token	`Authorization: Bearer ...`	No
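
The table above translates into a small helper that decides where the API key goes for each provider. This is a sketch: the function name and the lowercase provider identifiers are hypothetical, and real requests usually need further headers (such as a content type) that are omitted here.

```python
def build_auth(provider: str, api_key: str) -> tuple[dict[str, str], dict[str, str]]:
    """Return (headers, query_params) carrying the API key for a provider."""
    if provider == "anthropic":
        return {"x-api-key": api_key}, {}
    if provider in ("openai", "minimax"):
        return {"Authorization": f"Bearer {api_key}"}, {}
    if provider == "google":
        return {}, {"key": api_key}  # sent as ?key=... in the URL
    raise ValueError(f"unknown provider: {provider}")
```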

Configuring a Cloud Provider

Navigate to the Models page by clicking the Models icon in the sidebar.
Click a provider card to expand it.
Enter your API key in the key field.
Select a model from the dropdown (pre-populated with available models for that provider).
Adjust sampling parameters if needed (temperature, top_p, top_k, max_tokens).
Click Save or Apply. The model immediately becomes available in the quick model switcher.

Aggregator Services

Aggregator services provide access to many models through a single API key. All use OpenAI-compatible APIs with Bearer token authentication.

Aggregator	Highlights
OpenRouter	Unified API for 200+ models with automatic fallback. Many free model options available.
Together AI	Fast inference for open-source models: Llama, Qwen, DeepSeek, Mistral, Gemma.
Groq	Ultra-fast LPU inference with sub-second latency. Free tier with rate limits.
DeepSeek	High-performance reasoning and coding at low cost.
SiliconFlow	Cost-effective GPU cloud inference for Qwen and DeepSeek models.
Zhipu AI (z.ai)	GLM models with free tiers available. Auto-seeds MCP servers for vision and code reading.
Xiaomi / MiMo	Xiaomi's MiMo models for general tasks.
Moonshot / Kimi	Multilingual, long-context models optimized for Asian languages.
Mistral AI	European models strong in coding and multilingual tasks.

Local Providers

Run models on your own hardware with zero API cost. Aura Workshop auto-detects local inference servers on standard ports.

Provider	Default Port	Description
Aura AI	8080	Bundled inference engine (Go + llama.cpp) with Metal/CUDA/CPU support.
Ollama	11434	Popular local model runner with a simple pull-and-run workflow.
LM Studio	1234	Desktop app for running local models with an OpenAI-compatible API.
LocalAI	varies	Self-hosted AI inference with OpenAI-compatible endpoints.
vLLM	varies	High-throughput LLM serving with PagedAttention for production workloads.
TGI	varies	HuggingFace Text Generation Inference server.
SGLang	varies	Fast serving framework for large language models.
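
Port-based detection can be approximated with a TCP connection attempt to each default port. The sketch below is an assumption about how such a probe might work, not the app's actual detection logic; an open port shows only that something is listening, not which server it is.

```python
import socket

# Default ports from the table above; providers listed as "varies" are omitted.
DEFAULT_PORTS = {"Aura AI": 8080, "Ollama": 11434, "LM Studio": 1234}


def is_listening(port: int, host: str = "127.0.0.1", timeout_s: float = 0.25) -> bool:
    """Return True when a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def detect_local_providers() -> list[str]:
    """Names of providers whose default port currently accepts connections."""
    return [name for name, port in DEFAULT_PORTS.items() if is_listening(port)]
```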

Custom Providers

Add any OpenAI-compatible endpoint as a custom provider. This allows you to connect to any inference service that speaks the OpenAI API format.

Name: A human-readable label for the provider.
Base URL: The API base URL (for example, https://my-vllm-server.com/v1).
API key: Authentication key if required.
API format: Select from OpenAI, Anthropic, Google, or OpenAI-compatible.
Auth type: Bearer token, API key header, query parameter, or none.
Test connection: Verify the endpoint is reachable before saving.

Aura AI (Built-in Local Inference)

Aura Workshop bundles the aura-inference engine (Go + llama.cpp) for running GGUF models locally. It supports Metal acceleration on macOS, CUDA on Linux and Windows, and CPU fallback. Zero API cost: everything runs entirely on your hardware.

Start / Stop Controls

On the Models page, the Aura AI section provides a Launch button to start the local inference server and a Stop button to shut it down. When running, the status indicator turns green.

Port Configuration

The default port is 8080. You can change it in the Aura AI settings panel. Make sure the chosen port is not in use by another application.

Model Selector

Click Scan to detect GGUF models in ~/.cache/huggingface/hub/ and ~/.cache/aura-inference/models/. Select a model from the dropdown. Both HuggingFace-cached and directly downloaded GGUF files are discovered.

Advanced Parameters

Parameter	Description	Default
GPU Layers	Number of model layers offloaded to GPU. Set to -1 to offload all layers.	-1
Context Size	Token context window size.	4096
Batch Size	Inference batch size for throughput optimization.	512
Flash Attention	Enable flash attention for faster inference on supported hardware.	Off
KV Cache Key Type	Data type for the key cache: q8_0 (quantized) or f16 (full precision).	q8_0
KV Cache Value Type	Data type for the value cache: q8_0 or f16.	q8_0
Thinking Mode	Enable extended reasoning/chain-of-thought for local models that support it.	Off

HuggingFace Model Downloader

Download GGUF models directly from HuggingFace without leaving the app. Nine curated models are available with one-click download:

Model	Size	Best For
Qwen3.5 9B	~5.8 GB	Best balance of quality and speed for general tasks
Qwen3 Coder 8B	~5 GB	Code generation and programming tasks
Qwen3 VL 8B	~5 GB	Vision + language multimodal tasks (image analysis)
Llama 3.3 70B	~42 GB	Top-tier quality for complex reasoning (requires significant RAM/VRAM)
Llama 3.1 8B	~4.9 GB	Reliable tool calling and function execution
Mistral Small 24B	~14 GB	Multilingual tasks with vision capabilities
Phi-4 14B	~8.4 GB	Strong reasoning and math
Gemma 4 E4B	~5 GB	Compact vision model for image understanding
Gemma 4 27B-A4B	~16.8 GB	Mixture of Experts model with excellent efficiency

Curated Model Table

Each model row in the table has:

Download button: Start downloading the model.
Cancel button: Appears during download to cancel the operation.
Delete button: Remove a downloaded model from disk.
Progress bar: Shows download progress with percentage.
ETA and speed: Estimated time remaining and current download speed.

Custom Model Download

Enter any HuggingFace repo ID (for example, Owner/ModelName-GGUF) in the custom download input. The app auto-resolves GGUF repos and downloads Q4_K_M quantization by default. Provide a HuggingFace token for gated models. Downloads are verified against SHA2 checksums.
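If you want to double-check a downloaded file by hand, a checksum comparison can be sketched as follows. This assumes the SHA-256 member of the SHA-2 family and a hex digest obtained from the model's HuggingFace page; the function names are illustrative and not part of Aura Workshop.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so multi-gigabyte GGUF files never load into RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_download(path, expected_hex):
    """Compare the file's digest with the published one, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()

# Demo on a tiny temporary file standing in for a downloaded model.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
demo_path = tmp.name
print(verify_download(demo_path, hashlib.sha256(b"abc").hexdigest()))
os.remove(demo_path)
```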

Ollama Integration

If Ollama is running locally on http://localhost:11434, Aura Workshop auto-detects it and lists available models. No API key is required.
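The same model list can be read directly from Ollama's HTTP API, which exposes local models at GET /api/tags. The sketch below shows one way to query it; the helper names are illustrative, and how Aura Workshop itself performs detection is not specified here.

```python
import json
from urllib.request import urlopen

OLLAMA_URL = "http://localhost:11434"

def parse_model_names(tags_payload):
    """Extract model names from the JSON body returned by GET /api/tags."""
    return sorted(model["name"] for model in tags_payload.get("models", []))

def list_local_models(base_url=OLLAMA_URL, timeout=5):
    """Fetch the model list from a running Ollama instance."""
    with urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
        return parse_model_names(json.load(response))

# Offline demo with a hand-written payload in the /api/tags shape.
sample = {"models": [{"name": "qwen3:8b"}, {"name": "llama3.1:latest"}]}
print(parse_model_names(sample))
```

Calling `list_local_models()` raises a connection error when Ollama is not running, which is a simple way to check whether auto-detection should succeed.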

Pull Input

On the Models page under the Ollama section, enter a model name (for example, llama3.1) in the pull input field and click the Pull button. Ollama downloads the model and makes it available immediately.

List and Delete

The Ollama section shows a list of all locally available Ollama models. Each model has a Delete button to remove it.

Inference Cluster (LAN GPU Sharing)

The inference cluster feature enables distributed GPU inference across multiple machines on your local network. Combine the GPU resources of several computers to run larger models than any single machine could handle.

Discovery

Aura Workshop nodes broadcast their presence via UDP on port 18801 every 30 seconds. Nodes that have not broadcast for 90 seconds are considered expired and removed from the cluster view. LAN discovery can be toggled with the lan_discovery_enabled setting.
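The timing rule above (broadcast every 30 seconds, expire after 90 seconds of silence) amounts to tolerating up to two missed broadcasts. A minimal sketch of that bookkeeping, with a hypothetical `NodeRegistry` class and explicit timestamps so the behavior is easy to follow:

```python
BROADCAST_INTERVAL_S = 30  # how often each node announces itself
EXPIRY_S = 90              # silence after which a node is dropped

class NodeRegistry:
    def __init__(self):
        self.last_seen = {}  # node address -> timestamp of last broadcast

    def record_broadcast(self, address, now):
        self.last_seen[address] = now

    def active_nodes(self, now):
        """Drop nodes that have been silent for 90 s or more; return the rest."""
        self.last_seen = {
            address: seen for address, seen in self.last_seen.items()
            if now - seen < EXPIRY_S
        }
        return sorted(self.last_seen)

registry = NodeRegistry()
registry.record_broadcast("192.168.1.10", now=0)
registry.record_broadcast("192.168.1.11", now=60)
print(registry.active_nodes(now=89))  # both nodes still listed
print(registry.active_nodes(now=90))  # the first node has expired
```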

Roles

Master: Manages the cluster, coordinates model layer distribution across workers, and serves the unified inference API.
Worker: Contributes GPU resources to the cluster via RPC on port 50052.

Worker Claiming

The master discovers workers via UDP broadcast. To claim a worker, the master sends an HTTP POST to the worker's /api/cluster/join endpoint. The worker stores the master's information and starts its RPC server. The claiming process works as follows:

The master machine's Models page shows a "LAN Nodes" section listing all discovered workers on the network.
Each discovered worker shows its hostname, IP address, available GPU resources, and current state.
Click the Add button next to a discovered worker to claim it.
The master sends a claim request to the worker. The worker accepts and starts its RPC server on port 50052.
Once claimed, the worker appears in the "Claimed Workers" list.
When you start inference, the master distributes model layers across all claimed workers based on their GPU capabilities.
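The exact distribution algorithm is not documented here. One plausible way to picture the last step is a split proportional to each worker's GPU memory, as in the sketch below; the function name, the worker names, and the use of VRAM as the only capability metric are all assumptions.

```python
def split_layers(total_layers, vram_gb_by_worker):
    """Assign layers in proportion to VRAM using the largest-remainder method."""
    total_vram = sum(vram_gb_by_worker.values())
    exact = {w: total_layers * v / total_vram for w, v in vram_gb_by_worker.items()}
    shares = {w: int(x) for w, x in exact.items()}  # round each share down first
    leftover = total_layers - sum(shares.values())
    # Hand the remaining layers to the workers with the largest fractional parts.
    by_fraction = sorted(exact, key=lambda w: exact[w] - shares[w], reverse=True)
    for worker in by_fraction[:leftover]:
        shares[worker] += 1
    return shares

# Hypothetical cluster: three workers with 24, 8, and 16 GB of VRAM.
print(split_layers(32, {"gpu-a": 24, "gpu-b": 8, "gpu-c": 16}))
```

The largest-remainder step guarantees that every layer is assigned exactly once even when the proportional shares are not whole numbers.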

Workshop Modes

Each Aura Workshop instance can operate in one of three cluster modes, configured on the Models page:

Standalone (default): Single-machine inference. No cluster features. All model layers run on the local GPU.
Cluster Manager: Acts as the inference master, managing multiple workers remotely. Coordinates model distribution and serves the unified inference API.
GPU Contributor: Acts as an inference worker, contributing GPU resources to a master. The worker does not serve its own inference API.

Parameter Control

Cluster inference supports the same parameters as standalone Aura AI: quantization, context size, batch size, GPU layers, and flash attention. Parameters are set on the master and applied across the cluster.

Docker Cluster Deployment

# Master node
docker run -d --net=host --gpus all \
  -e AURA_DAEMON_MODE=inference-master \
  coolkoo/aura-workshop:daemon-latest \
  --mode inference-master

# Worker node
docker run -d --net=host --gpus all \
  -e AURA_DAEMON_MODE=inference-worker \
  coolkoo/aura-workshop:daemon-latest \
  --mode inference-worker

Use --net=host for UDP broadcast discovery and --gpus all for NVIDIA GPU access.

Quick Model Switching

Click the model name displayed in the top bar at any time to open a dropdown of all configured and available models. The dropdown groups models by provider for easy navigation. Selecting a model instantly switches the active model for all subsequent new tasks. Models from cloud providers, aggregators, local inference, and custom providers are all listed together. A search field at the top of the dropdown lets you filter models by name.

Model Auto-Detection

When you configure a provider's API key, Aura Workshop automatically queries the provider to discover available models. This means the model selector always shows current, accurate model options without manual configuration. For local providers (Ollama, Aura AI), clicking Scan refreshes the local model list.

Model Parameters

Each model can be configured with sampling parameters that affect output quality and style:

Parameter	Default	Description
Temperature	0.7	Controls randomness. Lower values produce more focused output; higher values are more creative.
Top P	0.8	Nucleus sampling threshold. Sampling is restricted to the smallest set of most likely tokens whose cumulative probability reaches this value.
Top K	20	Only the top K most likely tokens are considered at each step. 0 disables Top K filtering.
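Min P	0.0	Minimum probability threshold, relative to the most likely token. Tokens whose probability falls below this fraction of the top token's probability are discarded. 0.0 disables the filter.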
Repeat Penalty	1.0	Penalty applied to repeated tokens. Values above 1.0 discourage repetition.
Max Tokens	4096	Maximum number of tokens the model can generate in a single response.
Thinking Level	Off	Extended reasoning depth: Off, Low, Medium, High. Only available on models that support thinking.
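The three filtering parameters can be pictured as successive cuts on the candidate token list. The sketch below applies Top K, then Top P, then Min P to a toy distribution; the order of the filters is an assumption, temperature and repeat penalty are left out, and the function is illustrative rather than the engine's sampler.

```python
def filter_candidates(probs, top_k=20, top_p=0.8, min_p=0.0):
    """Return renormalized probabilities of the tokens that survive the filters."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    if top_k > 0:  # 0 disables Top K
        ranked = ranked[:top_k]
    # Top P: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Min P: drop tokens below min_p times the probability of the top token.
    floor = min_p * kept[0][1]
    kept = [(token, p) for token, p in kept if p >= floor]
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

toy = {"the": 0.5, "a": 0.25, "an": 0.15, "this": 0.07, "that": 0.03}
print(list(filter_candidates(toy)))            # defaults: Top P keeps three tokens
print(list(filter_candidates(toy, top_k=2)))   # Top K keeps two tokens
```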

Smart Routing

Smart routing automatically selects the most cost-effective model for each task based on its complexity.

Enable / Disable

Toggle smart routing in Settings > Routing. When disabled, all tasks use the globally selected model.

Tier Configuration

Four routing tiers are available, each with its own model selector and boundary score:

Tier	Intended Use
Simple	Quick questions, translations, simple formatting. Routed to the cheapest model.
Standard	Moderate tasks: writing, analysis, single-step coding.
Complex	Multi-step tasks: architecture, debugging, research. Routed to a capable model.
Reasoning	Tasks requiring deep reasoning, math, or extended chain-of-thought. Routed to the most powerful model.
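Conceptually, boundary scores partition a complexity score into the four tiers. The sketch below assumes a score between 0 and 1 and three hypothetical boundaries; the real scoring scale and default boundaries are not documented in this section.

```python
from bisect import bisect_right

TIERS = ["Simple", "Standard", "Complex", "Reasoning"]
# Hypothetical boundaries: a score at or above a boundary moves to the next tier.
BOUNDARIES = [0.25, 0.5, 0.75]

def pick_tier(score, boundaries=BOUNDARIES):
    """Map a complexity score to the tier whose range contains it."""
    return TIERS[bisect_right(boundaries, score)]

for score in (0.1, 0.25, 0.6, 0.9):
    print(score, pick_tier(score))
```

Raising a boundary sends more tasks to the cheaper tier below it, which is the main lever for trading cost against quality.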

Free Models Only Toggle

A toggle restricts routing to free models only (such as those available on OpenRouter or Groq free tier).

Analytics View

The routing analytics section shows detailed statistics about smart routing performance:

Tasks per tier: Bar chart showing how many tasks were classified into each tier (Simple, Standard, Complex, Reasoning).
Actual cost: The real cost of running all tasks with smart routing enabled.
Hypothetical cost: What the cost would have been if all tasks used the most capable (most expensive) model.
Total savings: The dollar amount saved by using smart routing. Calculated as hypothetical minus actual.
Savings percentage: The percentage of costs saved relative to the hypothetical total.
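The two savings figures follow directly from the actual and hypothetical costs. A small worked example, using made-up dollar amounts and an illustrative function name:

```python
def routing_savings(actual_cost, hypothetical_cost):
    """Return (total savings, savings percentage) for a set of routed tasks."""
    savings = hypothetical_cost - actual_cost
    if hypothetical_cost > 0:
        percent = 100.0 * savings / hypothetical_cost
    else:
        percent = 0.0  # nothing was spent, so there is nothing to compare against
    return savings, percent

# Made-up figures: 3.00 spent with routing, 12.00 if everything used the top model.
print(routing_savings(3.00, 12.00))
```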

The analytics update in real time as new tasks are routed. To reset the analytics data, use the billing reset button.

Settings: General

Native mode toggle: Switch between native execution (commands run directly on your OS) and Docker execution (commands run in isolated containers).
Sampling parameters: Set default values for Temperature (0.7), Top P (0.8), Top K (20), and Max Tokens (4096) that apply to all new tasks.
Thinking level: Set the default reasoning depth for models that support it: Off, Low, Medium, High.
Execution isolation: Choose the isolation level for agent commands:
- none: Commands run directly on the host (default for native mode).
- sandbox: Commands run in a restricted process sandbox.
- container: Commands run inside Docker containers.
Execution backend: Where commands run: local, docker, ssh, or singularity (for HPC environments).
Max concurrent tools: Maximum number of tools that can run simultaneously within a single task (default: 4).
Elevated bash toggle: Allow agent bash commands to run with elevated privileges (sudo).
Device capabilities: Toggle screen capture, camera capture, and system notifications on or off.
Agent context: Toggle automatic injection of git status and recent commits into the system prompt.

Context Compression

Aura Workshop automatically manages context window limits to prevent long-running tasks from failing due to token limits:

Threshold: Compaction triggers when context usage reaches 70% of the active model's context window.
LLM summarization: Older conversation history is summarized by the LLM into a condensed form.
Pair preservation: tool_use and tool_result message pairs are always kept intact during compaction.
Truncation fallback: If summarization fails, the system falls back to mechanical truncation of the oldest messages.
Circuit breaker: After 3 consecutive compaction failures, the system switches to permanent truncation mode to ensure the task can continue.
Persistent state: The compressed state is saved to the task_memory database table, so progress survives crashes and can be resumed.
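The interplay of threshold, summarization, pair preservation, truncation fallback, and circuit breaker can be sketched as below. This is a simplified model under stated assumptions: messages are plain dicts with a `type` field, `keep_recent` is a hypothetical knob, and the `summarize` callable stands in for the LLM call. Persistence to the task_memory table is omitted.

```python
COMPACTION_THRESHOLD = 0.70  # fraction of the context window that triggers compaction
MAX_FAILURES = 3             # consecutive failures before switching to truncation only

class Compactor:
    def __init__(self, summarize, keep_recent=4):
        self.summarize = summarize      # callable: list of messages -> summary text
        self.keep_recent = keep_recent  # hypothetical: messages kept verbatim
        self.failures = 0
        self.truncate_only = False

    def needs_compaction(self, used_tokens, context_window):
        return used_tokens >= COMPACTION_THRESHOLD * context_window

    def _split(self, messages):
        cut = max(len(messages) - self.keep_recent, 0)
        # Never separate a tool_result from the tool_use that precedes it.
        while 0 < cut < len(messages) and messages[cut]["type"] == "tool_result":
            cut -= 1
        return messages[:cut], messages[cut:]

    def compact(self, messages):
        old, recent = self._split(messages)
        if not old:
            return messages
        if not self.truncate_only:
            try:
                summary = self.summarize(old)
                self.failures = 0  # only consecutive failures count
                return [{"type": "summary", "text": summary}] + recent
            except Exception:
                self.failures += 1
                if self.failures >= MAX_FAILURES:
                    self.truncate_only = True  # circuit breaker
        return recent  # truncation fallback: drop the oldest messages

history = [
    {"type": "text", "text": "Plan the refactor"},
    {"type": "tool_use", "text": "read_file main.rs"},
    {"type": "tool_result", "text": "fn main() {}"},
    {"type": "text", "text": "Looks fine"},
    {"type": "text", "text": "Next step?"},
]
compactor = Compactor(lambda msgs: f"{len(msgs)} earlier message(s)", keep_recent=3)
if compactor.needs_compaction(used_tokens=3000, context_window=4096):
    history = compactor.compact(history)
print([m["type"] for m in history])
```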

Crash Safety and Task Resume

The full conversation transcript is persisted to the SQLite database before every API call. If the application crashes or is force-quit, no conversation data is lost. Key features:

Task checkpoints are saved at key milestones during execution.
Interrupted or crashed tasks can be resumed from the Dashboard by clicking the Resume button.
Bulk resume: Select multiple interrupted tasks and resume them all at once.
Workflow checkpoints: Multi-agent workflows save progress per role, so a crash mid-workflow resumes from the last completed role handoff.
Context compression state is preserved, so resumed tasks maintain their compressed context.

Provider Health and Circuit Breaker

Provider health is tracked passively after each API request (zero overhead). If a provider fails 5 consecutive times, the circuit breaker opens and the agent automatically switches to the next provider in the fallback order. The circuit recovers after 60 seconds with a single probe request. You can view provider health status in Settings > Billing.
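The behavior described above matches the classic closed, open, half-open circuit breaker pattern. A minimal sketch with the documented thresholds follows; the class and method names are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
FAILURE_THRESHOLD = 5   # consecutive failures that open the circuit
RECOVERY_SECONDS = 60   # wait before a single probe request is allowed

class ProviderCircuit:
    def __init__(self):
        self.consecutive_failures = 0
        self.opened_at = None        # None means the circuit is closed
        self.probe_in_flight = False

    def allow_request(self, now):
        if self.opened_at is None:
            return True
        if now - self.opened_at >= RECOVERY_SECONDS and not self.probe_in_flight:
            self.probe_in_flight = True  # let exactly one probe through
            return True
        return False

    def record(self, success, now):
        """Update health passively from the outcome of a finished request."""
        self.probe_in_flight = False
        if success:
            self.consecutive_failures = 0
            self.opened_at = None
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= FAILURE_THRESHOLD:
                self.opened_at = now

circuit = ProviderCircuit()
for _ in range(5):
    circuit.record(success=False, now=0)
print(circuit.allow_request(now=10))  # circuit is open
print(circuit.allow_request(now=60))  # probe request is allowed
```

While the circuit is open, requests would go to the next provider in the fallback order; a successful probe closes the circuit again, and a failed probe restarts the 60-second wait.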

Execution Modes

Mode	Description
Native	Commands run directly on your host OS via `sh -c` (macOS/Linux) or `cmd /C` (Windows). Default when Docker is not available. No container overhead.
Docker	Commands run in isolated Docker containers. Provides sandboxing and dependency isolation at the cost of startup latency.

Execution Backend

The execution_backend setting determines where agent commands physically run:

local: Commands run on the local machine (default).
docker: Commands run in Docker containers.
ssh: Commands run on a remote machine via SSH.
singularity: Commands run in Singularity/Apptainer containers (for HPC environments).

Settings: Security

Biometric authentication toggle: Enable Touch ID (macOS) or Windows Hello for accessing credentials and sensitive operations.
Password lock: Set a password that must be entered to access the Settings panel. Provides an additional layer of protection beyond biometrics.
License key entry: Enter and activate your license key.
License status display: View the current license tier, expiration date, and feature entitlements.
Session management: View and revoke active JWT sessions for the web UI.

Settings: Connectivity

Setting	Description	Default
Web Server Enabled	Toggle the embedded HTTP server on or off	On
Port	HTTP port for the web server	18800
Auth Token	Optional Bearer token for remote access authentication. When set, all API requests must include this token.	Empty (no auth)

Environment variable overrides: AURA_WEB_ENABLED, AURA_WEB_PORT, AURA_WEB_TOKEN.

Settings: Integrations

Web Search provider: Select from DuckDuckGo (no key required), Google Custom Search (requires API key + search engine ID), Brave Search (requires API key), Serper (requires API key), or Bing Search (requires API key).
Email configuration: Choose between Auto (system default mail client) or SMTP (manual server configuration with host, port, username, password, from name, and from address). Includes a Test Email button.
Voice / TTS provider: Enable or disable text-to-speech. Select provider: System (OS built-in), OpenAI (requires API key), or ElevenLabs (requires API key). Choose voice and speech rate.
Speech-to-text provider: Select from Whisper, System, Groq, OpenAI, or xAI for voice input transcription.

Settings: Skills

Skill list: All installed skills with columns for name, category, and a preview of the prompt text.
Edit button: Opens the skill editor where you can modify the skill's prompt, metadata, and chain configuration.
Delete button: Remove a skill from the installation.
New Skill button: Create a brand new skill with a name, category, description, and prompt template.
Skill settings: Per-skill configuration overrides for advanced customization.

Settings: Plugins

Plugins extend Aura Workshop with additional capabilities beyond the built-in features.

Plugin list: View all installed plugins with their name, version, status (enabled/disabled), and description.
Enable/Disable toggle: Toggle individual plugins on or off. Disabled plugins are not loaded and consume no resources.
Plugin configuration: Each plugin may expose its own configuration fields. Click the settings icon next to a plugin to access its specific settings.
Plugin installation: Install new plugins from the plugin registry or by uploading a plugin package.
Plugin updates: Check for and apply updates to installed plugins.

Plugins are managed via the REST API at /api/plugins for programmatic control.

Settings: Design

The Design settings control visual aspects used by the design skills and generated artifacts.

Design system tokens: View and manage design system tokens including:
- Color palettes: primary, secondary, accent, background, surface, text colors
- Typography: font families, sizes, weights, line heights
- Spacing: margin and padding scales
- Border radii: corner radius values for UI components
- Shadow definitions: elevation levels for depth effects
Theme selection: Choose from available UI themes (dark, light, or custom) for the application interface.
Custom themes: Create and save custom themes by adjusting the design token values. Custom themes are persisted to the database.

Design tokens are available to agents via the /api/design-systems endpoint and are used by the design skills (taste-skill, impeccable, ui-ux-pro-max, etc.) when generating UI artifacts.

Settings: MCPs

Manage MCP server connections (see the MCP Servers section for full details). This tab provides the same interface described there: add, edit, delete, connect, disconnect, configure isolation mode, set OAuth credentials, and import from JSON.

Settings: Data Management

Action	Description
Clear conversation history	Delete all task messages and conversation data from the database. Tasks themselves are preserved.
Reset API keys	Remove all stored provider API keys from the encrypted credential store.
Clear model cache	Remove cached model metadata and force re-fetching from providers.
Reset database	Full database reset to factory defaults. All settings, tasks, conversations, and configurations are deleted.
Reset app data	Complete application reset including all settings, database, downloaded models, and cached files.
Diagnostics	View system information (OS, architecture, memory, disk), database statistics (table counts, sizes), and dependency status.

Each destructive action requires confirmation before execution.

Settings: Teams

Team list: View all configured teams with their names, member roles, and workflow type (sequence/parallel/fanout).
New Team button: Opens the Team Builder form to create a new team.
Edit: Modify team composition, roles, or workflow type.
Delete: Remove a team configuration.
Quick-start templates: Pre-configured team templates for common configurations (Software Dev Team, Content Writing Team, Research Team).

Settings: Roles & Prompts

Role list: All 20 built-in roles plus any custom roles, grouped by category (Software, Content, Business, Data, Design, Operations).
Role editor: Create or edit roles with fields for name, description, category, system prompt (multi-line text), and tool permissions (checkboxes for each available tool).
Duplicate button: Clone an existing role as a starting point for a new custom role.
Delete button: Remove custom roles. Built-in roles cannot be deleted.
Per-role model selection: Optionally assign a specific model to a role.

Settings: Workflows

The visual workflow editor provides the canvas-based interface described in the Workflows section. From this settings tab you can:

Create new workflows
Edit existing workflows in the visual editor
Import workflows from JSON files
Export workflows as JSON for sharing
Delete workflows
View workflow run history

Settings: Credentials

Credential list: Shows all stored credentials with name, type, and created/updated timestamps. Values are never displayed in the list.
Add credential: Create a new credential entry with a name, type (API key, token, password, SSH key), and value.
View credential: Retrieve a credential's decrypted value. Requires biometric authentication (Touch ID or Windows Hello).
Delete credential: Remove a credential permanently.
Encryption: All values are encrypted with AES-256-GCM before writing to the database. The encryption key is stored in the system keychain.
Usage: Credentials are referenced by ID in listeners, webhooks, MCP server connections, and remote deployments, so tokens are never exposed in configuration files.

Settings: Cloud Storage

AWS S3: Connect with access key and secret key. Configure bucket name and region.
Google Cloud Storage: OAuth-based authentication. Configure bucket name.
Azure Blob Storage: Connection string or managed identity. Configure container name.

Connected cloud storage is used by agents for storing generated files, backups, and shared artifacts.

Settings: Memory

View memories: List all saved agent memories organized by type (user, feedback, project, reference). Each entry shows a title, content preview, and timestamps.
Add memory: Manually create a new memory entry.
Edit memory: Modify an existing memory entry.
Delete memory: Remove individual memory entries.
Search and filter: Filter memories by type, scope (user vs. project), or search by content.
Memory locations: ~/.aura/memory/ for user-scope memories, .aura/memory/ in the project root for project-scope memories.
AURA.md editor: Create and edit project instruction files (see Cross-Session Memory for details).
Learned facts viewer: Browse, search, and delete LLM-extracted memory facts. Facts are color-coded by category with confidence percentages.

Settings: Routing

Smart routing toggle: Enable or disable automatic cost-based model routing.
Tier configuration: Set the model and boundary score for each tier (Simple, Standard, Complex, Reasoning).
Free models only toggle: Restrict routing to free models.
Routing analytics: View tasks per tier, actual vs. hypothetical cost, and total savings.

Settings: Billing & Usage

Usage dashboard: Summary cards for today's cost, this month's cost, total cost, and today's tokens.
Spend limits: Set maximum monthly spend per provider. When reached, the system switches to the next provider in the fallback order.
Provider fallback order: Define the priority sequence for automatic provider switching.
Model pricing table: Editable pricing with input/output cost per million tokens. Pre-seeded with current market rates. Reset to defaults button.
Daily usage charts: Interactive bar chart showing daily spend over the last 14 days, plus area charts for per-model daily breakdown.
Provider summary table: Per-provider totals with input/output token counts and costs.

Settings: Commands

Slash command list: View all custom slash commands with name, description, handler type, and enabled status.
Create / edit / delete slash commands (see Slash Commands).

Settings: Models

Provider API key management: Enter and manage API keys for all supported providers.
Provider URL overrides: Override the default base URL for any provider (useful for proxies or enterprise endpoints).
OpenAI Organization / Project IDs: Set optional Organization ID and Project ID for OpenAI API calls.
Model parameter defaults: Set global default sampling parameters.

Settings: Updates

Check for updates: Query the update server for new versions.
Version comparison: See a changelog of what has changed between your current version and the latest.
Download and install: One-click update with automatic restart. The application downloads the update, verifies integrity, and restarts.
Update toast: A notification appears automatically when an update is available, with a button to apply it.

Spend Tracking & Billing

Spend Summary Cards

At the top of the Billing tab, summary cards provide at-a-glance metrics:

Today's cost: Total spend across all providers for the current day.
This month's cost: Cumulative monthly spend.
Total cost (all time): Lifetime spend since first use.
Per-provider breakdown: A card for each active provider showing its individual monthly spend.

Daily Usage Chart

An interactive bar chart visualizes daily spend over the last 14 days. Hover over any bar to see the exact cost for that day. Below the bar chart, area charts show per-model daily usage so you can identify which models are driving costs.

Model Usage Breakdown Table

A detailed table lists every model used during the billing period, with columns for model name, provider, input tokens, output tokens, number of requests, and total cost.

Spend Limits

Set a maximum monthly spend for each provider. The system enforces the limit by automatically switching to the next provider in the fallback order when a limit is reached. You can also set alert thresholds that trigger a notification before the hard limit is hit.

Provider Fallback Order

Define the priority sequence for provider switching. Drag and drop providers to reorder them. When the primary provider hits its spend limit, encounters rate limits (HTTP 429), or returns server errors (HTTP 503), the system transparently switches to the next provider in the list.
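The selection logic can be pictured as walking the fallback list and skipping any provider that is over its limit or has just failed. The sketch below is one possible reading; the provider names, function names, and data shapes are placeholders.

```python
FALLBACK_STATUSES = {429, 503}  # rate limit and server error, per the text above

def should_fall_back(status_code):
    return status_code in FALLBACK_STATUSES

def next_provider(fallback_order, monthly_spend, spend_limits, failed=()):
    """Return the first provider that is under its limit and has not just failed."""
    for provider in fallback_order:
        limit = spend_limits.get(provider)  # None means no limit is set
        over_limit = limit is not None and monthly_spend.get(provider, 0.0) >= limit
        if provider not in failed and not over_limit:
            return provider
    return None  # every provider is exhausted

order = ["provider-a", "provider-b", "provider-c"]
spend = {"provider-a": 50.0, "provider-b": 3.2}
limits = {"provider-a": 50.0, "provider-b": 20.0}
print(next_provider(order, spend, limits))
print(next_provider(order, spend, limits, failed={"provider-b"}))
```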

Provider Pricing Override

The model pricing table shows input and output cost per million tokens for every model. These values are pre-seeded with current market rates but can be edited manually. A "Reset to Defaults" button restores the original pricing data.
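Because prices are expressed per million tokens, the cost of a single request is a short calculation. The example below uses made-up prices and an illustrative function name:

```python
def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Cost of one request given prices per million input and output tokens."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000

# Made-up prices: 0.50 per million input tokens, 1.50 per million output tokens.
print(request_cost(12_000, 3_000, 0.50, 1.50))
```

Editing a price in the table changes the result of this calculation for subsequent usage reports, which is why overrides are useful for negotiated or regional rates.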

Usage Stats in Context Panel

During task execution, the Context Panel (right sidebar) shows real-time usage: Context Usage percentage (with color-coded bar), Input Tokens, Output Tokens, Cache Read tokens, Total Cost, Latency, Model name, and Provider name.

Web UI (Browser Access)

Aura Workshop includes an embedded axum HTTP server that serves the full SolidJS frontend to any web browser. Port 18800 is the default.

Enabling the Web Server

Go to Settings > Connectivity.
Toggle Web UI Server on.
Set the port (default: 18800).
Optionally set a Bearer token for authentication.
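Open http://localhost:18800 in a browser on the same machine. From another device on the network, replace localhost with the host machine's IP address or hostname.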

Authentication

When a Bearer token is configured, all API requests must include Authorization: Bearer your-secret-token in the headers. The /api/health and /api/heartbeat/incoming endpoints are always public and do not require authentication.

Environment Variable Configuration

export AURA_WEB_ENABLED=true
export AURA_WEB_PORT=18800
export AURA_WEB_TOKEN=your-secret-token
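A client script can reuse the same variables to build its requests. The sketch below derives the URL and headers for an API call and omits the token for the two public endpoints; the helper name is hypothetical and the host is assumed to be localhost.

```python
import os

PUBLIC_PATHS = {"/api/health", "/api/heartbeat/incoming"}  # never require auth

def api_request_parts(path, env=os.environ):
    """Return (url, headers) for an API call, based on the AURA_WEB_* variables."""
    port = int(env.get("AURA_WEB_PORT", "18800"))
    token = env.get("AURA_WEB_TOKEN")
    headers = {}
    if token and path not in PUBLIC_PATHS:
        headers["Authorization"] = f"Bearer {token}"
    return f"http://localhost:{port}{path}", headers

demo_env = {"AURA_WEB_PORT": "18800", "AURA_WEB_TOKEN": "your-secret-token"}
print(api_request_parts("/api/health", demo_env))
print(api_request_parts("/api/plugins", demo_env))
```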

Feature Parity

The browser-based UI has full feature parity with the desktop app: all navigation views, real-time streaming via SSE, the complete REST API, file upload, folder selection, voice input, and all settings tabs.

Remote Agent Deployment

Deploy AI agents to remote machines via SSH for always-on, headless operation.

How Deployment Works

Go to Settings > Agents or use the deploy_remote tool.
Provide SSH connection details: hostname, username, and authentication method (key file, password, or saved credential).
Select the deployment mode: Full Agent, Inference Master, Inference Worker, or Worker.
Aura Workshop connects via SSH, detects the remote OS, auto-installs Docker if needed, pulls the Docker image, and starts the daemon container with --net=host.

SSH Authentication Methods

Method	Description
Key file	Path to an SSH private key file (RSA, Ed25519, etc.). The most secure and recommended method.
Password	Username/password authentication. Uses `sshpass` for non-interactive authentication.
Saved credential	References a credential from the encrypted credential store by ID. Keys are decrypted at deployment time.

Deployment Modes

Mode	Description
Full Agent	Complete daemon with tasks, inference, listeners, webhooks, and workflows.
Inference Master	Manages an inference cluster: models, workers, serves the inference API.
Inference Worker	Broadcasts on LAN, joins a master's cluster, contributes GPU resources via RPC.
Worker	Task execution worker that uses a master's inference endpoint for model access.

Monitoring

Heartbeat: Deployed agents send periodic heartbeats to the master for status monitoring. If heartbeats stop, the deployment is marked as offline.
Pairing codes: Remote deployments can be paired with the desktop app using a one-time pairing code for secure initial connection.
Status indicators: Each remote deployment shows its connection status (online/offline), uptime, daemon mode, and resource usage.
Viewer access: The remote deployment serves the viewer SPA for browser-based monitoring. Access it at http://remote-host:18800.

Deploying Automation to Remote Agents

Schedules, listeners, and webhooks can be deployed to remote machines. When you create an automation item and select a target agent, the configuration is synced to the remote deployment. Deleting an automation item also cleans up its remote deployment automatically.

Docker Daemon

The aura-daemon binary provides headless operation for Docker deployments.

Repository and Tags

Repository: coolkoo/aura-workshop
Tags: daemon-latest, daemon-{version}, daemon-arm64
Platforms: linux/amd64 (with CUDA runtime), linux/arm64 (with Vulkan drivers)

4 Daemon Modes

Mode	Flag	Purpose
full	`--mode full`	Complete daemon: tasks, inference, listeners, webhooks, workflows, and full REST API
inference-master	`--mode inference-master`	Manages the inference cluster: model distribution, worker coordination, serves the inference API
inference-worker	`--mode inference-worker`	Broadcasts on LAN, joins a master's cluster, contributes GPU resources via RPC
worker	`--mode worker`	Task execution worker that uses a master's inference endpoint for model access

Running the Daemon

docker run -d \
  --name aura-daemon \
  --net=host \
  -v ~/.aura:/root/.aura \
  -v $(pwd)/data:/data \
  -e AURA_WEB_ENABLED=true \
  -e AURA_WEB_PORT=18800 \
  -e AURA_API_KEY=your-api-key \
  -e AURA_MODEL=deepseek-chat \
  -e AURA_BASE_URL=https://api.deepseek.com \
  coolkoo/aura-workshop:daemon-latest \
  --mode full

Environment Variables

Variable	Description	Default
`AURA_WEB_ENABLED`	Enable the embedded web server	`true`
`AURA_WEB_PORT`	Web server port	`18800`
`AURA_WEB_TOKEN`	Bearer token for API authentication	(none)
`AURA_DB_PATH`	SQLite database file path	`/data/aura-workshop.db`
`AURA_API_KEY`	API key for the LLM provider	(none)
`AURA_MODEL`	Model identifier to use	(none)
`AURA_BASE_URL`	Base URL for the LLM provider API	(none)
`AURA_NATIVE_MODE`	Use native mode instead of Docker-in-Docker	`false`
`AURA_REMOTE_DEPLOYMENT`	Mark as a remote deployment	`false`
`AURA_VIEWER_MODE`	Serve the viewer SPA instead of the full UI	`false`
`AURA_PAIRING_CODE`	One-time pairing code for desktop app connection	(none)
`AURA_HEARTBEAT_URL`	URL to send heartbeats to the master instance	(none)
`AURA_DEPLOYMENT_ID`	Unique identifier for this deployment	(none)
`AURA_DAEMON_MODE`	Daemon operating mode	`full`
`AURA_RPC_PORT`	RPC port for distributed inference	`50052`
`AURA_MASTER_INFERENCE_URL`	URL of the master's inference API (worker mode)	(none)

Health Check

The Docker image includes a built-in health check:

HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
    CMD curl -f http://localhost:18800/api/health || exit 1

What Is Included in the Image

aura-daemon binary (Rust) -- the headless core of Aura Workshop
aura-inference binary (Go + llama.cpp) -- architecture-appropriate version for local model inference
Viewer frontend SPA for browser-based monitoring and interaction
All bundled skills across all categories (documents, Anthropic, superpowers, desktop apps, design, platform, media)
Tesseract OCR engine for image text extraction via the ocr tool
CUDA runtime libraries (amd64 image) or patched Mesa Vulkan drivers (arm64 image) for GPU acceleration

Volume Mounts

The recommended volume mounts for a full daemon deployment:

# Mounts: ~/.aura holds user memory, roles, and CLI config; ./data holds the
# database; ./models holds GGUF models.
docker run -d \
  --name aura-daemon \
  --net=host \
  -v ~/.aura:/root/.aura \
  -v $(pwd)/data:/data \
  -v $(pwd)/models:/root/.cache/aura-inference/models \
  coolkoo/aura-workshop:daemon-latest \
  --mode full

Host Path	Container Path	Purpose
`~/.aura`	`/root/.aura`	User memory files, custom roles, CLI configuration
`./data`	`/data`	SQLite database, charts, generated files
`./models`	`/root/.cache/aura-inference/models`	Downloaded GGUF model files

GPU Passthrough

For NVIDIA GPUs on Linux, use the --gpus all flag to pass through GPU resources:

docker run -d --net=host --gpus all \
  -v $(pwd)/models:/root/.cache/aura-inference/models \
  -v $(pwd)/data:/data \
  coolkoo/aura-workshop:daemon-latest \
  --mode inference-master

Ensure the NVIDIA Container Toolkit is installed on the host. On macOS with Colima, GPU passthrough requires krunkit.

Auto Docker Installation for Remote Deployment

When deploying to remote machines via SSH, Aura Workshop can auto-install Docker:

Linux: Uses the official get.docker.com installation script.
macOS: Installs Colima via Homebrew (with optional krunkit for GPU passthrough).
Windows: Provides manual installation instructions with links to Docker Desktop documentation.

Headless Mode

For environments without a display server (headless Linux servers), you can run Aura Workshop without a GUI:

xvfb-run aura-workshop

Under xvfb-run, the application renders to a virtual display instead of a physical screen and serves the full web UI for browser-based access. For production headless deployments, the Docker daemon is the recommended approach.

Headless mode is configurable via settings: when the headless flag is enabled, the application skips all GUI initialization and runs purely as a server process. All features remain available through the REST API and web UI.

Security Architecture

Encryption at Rest

AES-256-GCM: All credentials and API keys stored in the database are encrypted using AES-256-GCM (via the aes-gcm Rust crate). Each value has a unique nonce.
System keychain: The master encryption key is stored in the operating system's secure keychain (macOS Keychain, Windows Credential Manager, Linux secret service). It never touches the filesystem.

Authentication

Biometric auth: Touch ID (macOS) and Windows Hello gate access to credential retrieval and sensitive settings.
JWT sessions: The web UI uses JSON Web Tokens (via the jsonwebtoken Rust crate) for session management.
Bearer token: The REST API supports optional Bearer token authentication for remote access.

Webhook Security

Incoming webhook requests can be validated using HMAC-SHA256 signatures against a shared secret. Invalid signatures are rejected before processing.
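As a minimal sketch of how such a check works, the following Python computes and verifies an HMAC-SHA256 signature over the raw request body. The sha256=<hex> header format mirrors the X-Hub-Signature-256 header used in the webhook example in this guide. The function names are hypothetical, and this is not Aura Workshop's actual implementation.

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Return a signature header value of the form 'sha256=<hex digest>'."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_signature(secret: str, body: bytes, header_value: str) -> bool:
    """Check a received signature header against the raw request body."""
    expected = sign_payload(secret, body)
    # compare_digest runs in constant time, so a mismatch position is not leaked.
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))
```

The signature must be computed over the exact bytes that are sent; re-serializing the JSON before signing would change the digest.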

Download Verification

Model downloads from HuggingFace are verified against SHA-2 checksums to detect tampering or corruption.
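To check a downloaded file by hand, a sketch along the following lines may help. It assumes the published checksum is SHA-256 (one member of the SHA-2 family; the exact variant used by Aura Workshop is not stated here), and the function names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large models never sit fully in memory."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_download(path: Path, expected_hex: str) -> bool:
    """Compare the computed digest with a published hex checksum."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```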

Command Safety Guardrails

The agent middleware unconditionally blocks dangerous shell commands. Fork bombs, recursive root deletion, disk formatting, and other destructive patterns are rejected regardless of configuration, and these guardrails cannot be disabled.
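The actual blocklist is not documented here. Purely as an illustration of the idea, a guardrail of this kind could combine a few regular expressions with a token-level check for recursive deletion of the root directory. Every pattern and name below is an assumption, and the sketch is deliberately conservative (it may reject some harmless commands).

```python
import re
import shlex

# Illustrative patterns only; not the product's real blocklist.
BLOCKED_PATTERNS = [
    re.compile(r":\(\)\s*\{\s*:\s*\|\s*:\s*&\s*\}\s*;\s*:"),  # classic bash fork bomb
    re.compile(r"\bmkfs(\.\w+)?\b"),                          # filesystem formatting
    re.compile(r"\bdd\b.*\bof=/dev/"),                        # raw writes to a device
]


def is_recursive_root_delete(tokens: list) -> bool:
    """Detect 'rm' with a recursive flag aimed at '/' or '/*'."""
    if tokens and tokens[0] == "sudo":
        tokens = tokens[1:]
    if not tokens or tokens[0] != "rm":
        return False
    flags = [t for t in tokens[1:] if t.startswith("-")]
    targets = [t for t in tokens[1:] if not t.startswith("-")]
    recursive = any(
        f == "--recursive" or (not f.startswith("--") and "r" in f.lower())
        for f in flags
    )
    return recursive and any(t in ("/", "/*") for t in targets)


def is_blocked(command: str) -> bool:
    """Return True when the command matches any destructive pattern."""
    if any(pattern.search(command) for pattern in BLOCKED_PATTERNS):
        return True
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes: fall back to whitespace splitting
        tokens = command.split()
    return is_recursive_root_delete(tokens)
```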

Zero Telemetry

Aura Workshop sends zero telemetry and includes no analytics tracking. All data remains on your machine. The application is fully functional in air-gapped environments when paired with local models.

Local-First Architecture

All data is stored in a local SQLite database. There is no cloud sync, no remote storage of your conversations, and no data leaving your machine except the LLM API calls you explicitly configure. When using local inference (Aura AI or Ollama), no data leaves your network at all.

Execution Isolation

Three isolation levels are available for agent commands:

none: Commands run directly on the host.
sandbox: Commands run in a restricted process sandbox.
container: Commands run inside Docker containers with limited host access.

Chrome Extension

Aura Workshop includes a Chrome extension that provides a side panel chat interface and browser automation capabilities.

Side Panel

Opens as a Chrome side panel with a full chat interface.
Renders Markdown, code blocks, and inline formatting in responses.
Connects to Aura Workshop via WebSocket at /ws/sidepanel.
Supports streaming responses, tool usage display, and real-time updates.
Tab-aware: the agent can see the current URL and page context of the active tab.
File attachment support for including documents in conversations.

Context Menus

Right-click on any text, link, or page element to access context menu options for quick agent invocation. For example, you can right-click selected text and choose "Ask Aura about this" to send it directly to the agent with the page context.

Screenshot Capture

The extension can capture screenshots of the current browser tab and send them to the agent for visual analysis.

Browser Automation via Agents

Through the /ws/browser WebSocket connection, agents can control the browser programmatically using the browser_action tool. Supported actions include:

Navigate: Open a URL in the browser
Click: Click on a page element by selector or coordinates
Type: Enter text into form fields
Take screenshot: Capture the current page state as an image
Get page content: Extract the text content of the current page
Extract elements: Find and extract specific DOM elements by CSS selector
Scroll: Scroll the page up, down, or to a specific element
Wait: Wait for a specific element to appear or a condition to be met

Setup

Load the extension from the extension/ directory in Chrome's developer mode (navigate to chrome://extensions, enable Developer mode, and click "Load unpacked").
Click the extension icon and configure it to point to your Aura Workshop web server URL (for example, http://localhost:18800).
If authentication is enabled on your server, enter the Bearer token in the extension settings.
The extension icon turns green when successfully connected to the server.

Chat Widget

Embed an Aura Workshop-powered chat interface on any website with a short JavaScript snippet.

<script src="http://your-server:18800/widget.js"></script>
<script>
  AuraWidget.init({
    serverUrl: "http://your-server:18800",
    token: "your-auth-token",      // optional
    position: "bottom-right",       // bottom-right or bottom-left
    title: "AI Assistant",          // widget title
    greeting: "How can I help?",    // initial greeting message
    theme: "dark"                   // dark or light
  });
</script>

Features

WebSocket real-time communication: Messages stream in real-time via WebSocket.
Custom styling: Configure color scheme, position (bottom-right or bottom-left), title text, and greeting message.
Persistent history: Conversation history persists per user across page loads.
Markdown rendering: Responses render with full Markdown formatting including code blocks.
Configuration: Set up the widget by creating a "chatbot" type listener in the Listeners tab.

Voice & TTS

Speech-to-Text Providers

Provider	Description	Requirements
Whisper	OpenAI's Whisper model for transcription	OpenAI API key
System	Operating system's built-in speech recognition	None
Groq	Ultra-fast transcription via Groq LPU	Groq API key
OpenAI	OpenAI's transcription API	OpenAI API key
xAI	xAI's transcription service	xAI API key

Text-to-Speech Providers

Provider	Description	Requirements
System	OS built-in TTS (macOS `say` command, Windows SAPI)	None
OpenAI TTS	High-quality neural text-to-speech via OpenAI API	OpenAI API key
ElevenLabs	Premium voice synthesis with a wide selection of voices	ElevenLabs API key

Voice Input Button

The Voice button in the task composer activates the microphone. A pulsing red dot appears while recording. Speak your task description; the audio is transcribed by the configured speech-to-text provider and inserted into the textarea.

Configurable Voices and Speech Rate

In Settings > Integrations, select the desired voice from the provider's available options and adjust the speech rate to your preference.

Voice Configuration Details

Setting	Description	Options
TTS Enabled	Master toggle for text-to-speech on agent responses	On/Off
TTS Provider	Which service generates the speech audio	System, OpenAI, ElevenLabs
Voice	Specific voice to use (depends on provider)	Provider-specific list
Speech Rate	Speed of the generated speech	0.5x to 2.0x
STT Provider	Which service transcribes voice input	Whisper, System, Groq, OpenAI, xAI

When TTS is enabled, agent responses are automatically converted to audio and played through your speakers. You can also trigger speech manually for any specific message.

Cross-Session Memory

Aura Workshop maintains a persistent memory system that learns from every task and adapts to your preferences over time. Memory persists across sessions and even across application restarts.

Memory Types

Type	Location	Scope	How Created
User memory	`~/.aura/memory/*.md`	Global (all projects)	Agent saves via tool or manual creation
Feedback memory	`memory_facts` table	Global or project-scoped	LLM-extracted corrections and reinforcements
Project memory	`.aura/memory/*.md`	Per-project	Auto-extracted from file operations after task completion
Reference memory	`task_memory` table	Per-task	Handoff context between team roles and compaction summaries

Memory Fact Categories

Facts extracted from conversations are categorized with confidence scores:

Category	Scope	Confidence	Example
`preference`	Global	0.9	"User prefers Python over JavaScript for backend"
`correction`	Global	0.95	"User correction: use FastAPI not Flask"
`reinforcement`	Global	0.90	"User confirmed: single bundled PR is correct"
`knowledge`	Project	0.6-0.9	"Project uses PostgreSQL on port 5432"
`file_operation`	Project	0.7	"File modified: src/models.py"
`context`	Project	0.6-0.8	"API routes defined in src/routes/"
`behavior`	Project	0.7	"Always run tests after code changes"
`goal`	Project	0.7	"Goal is to migrate from Express to FastAPI"

Auto-Extraction

After every task, the system runs an LLM call to extract structured facts and applies pattern detection for corrections ("no", "wrong", "instead") and reinforcements ("yes", "perfect", "great"). Corrections are saved at 0.95 confidence and reinforcements at 0.90.
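The pattern-detection half of this step can be pictured with a small sketch. It assumes whole-word matching on the cue words listed above and that corrections take precedence when both kinds of cue appear; neither detail is confirmed by the product, and the function name is hypothetical.

```python
import re

CORRECTION_WORDS = {"no", "wrong", "instead"}
REINFORCEMENT_WORDS = {"yes", "perfect", "great"}


def classify_feedback(message: str):
    """Return (category, confidence), or None when no cue word is present."""
    # Whole-word matching keeps "know" from triggering the "no" cue.
    words = set(re.findall(r"[a-z']+", message.lower()))
    if words & CORRECTION_WORDS:
        return ("correction", 0.95)
    if words & REINFORCEMENT_WORDS:
        return ("reinforcement", 0.90)
    return None
```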

AURA.md -- Project Instructions

Drop an AURA.md (or CLAUDE.md) file in your project root with rules the agent must follow. The system looks for AURA.md, .aura.md, CLAUDE.md, .claude.md, and .aura/INSTRUCTIONS.md, starting at the project root and walking up through at most two parent directories. Instruction content is capped at 4 KB per file and 12 KB in total.
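A rough sketch of that lookup, assuming files are read in the order listed and truncated (rather than skipped) when they exceed a cap. The function name and the truncation behavior are assumptions, not documented behavior.

```python
from pathlib import Path

CANDIDATE_NAMES = ["AURA.md", ".aura.md", "CLAUDE.md", ".claude.md", ".aura/INSTRUCTIONS.md"]
PER_FILE_CAP = 4 * 1024   # 4 KB per file
TOTAL_CAP = 12 * 1024     # 12 KB across all files


def load_project_instructions(project_root: Path) -> list:
    """Return (path, text) pairs found in the root and up to two parent directories."""
    root = project_root.resolve()
    search_dirs = [root, *list(root.parents)[:2]]
    loaded = []
    remaining = TOTAL_CAP
    for directory in search_dirs:
        for name in CANDIDATE_NAMES:
            path = directory / name
            if remaining <= 0 or not path.is_file():
                continue
            # Truncate to whichever cap is tighter at this point.
            data = path.read_bytes()[: min(PER_FILE_CAP, remaining)]
            loaded.append((path, data.decode("utf-8", errors="ignore")))
            remaining -= len(data)
    return loaded
```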

AURA.md Editor

Create and edit AURA.md from Settings > Memory > Project Instructions. The editor provides:

7 predefined rule presets: TypeScript Strict, Python Best Practices, Test First, Clean Git, API Design, Security, Documentation
Directory path selector to specify where to save the file
Monospace editor with syntax-appropriate font
Load Existing button to read an existing AURA.md from any directory
Save button that writes the file immediately

Git Context Injection

When a task's project is a git repository, the agent automatically receives current git state in its system prompt:

git status --short: Current uncommitted changes
git log --oneline -5: The 5 most recent commits

This helps the agent understand what you have been working on and avoid conflicting changes. Toggle this feature in Settings > General > Agent Context.
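The same two commands can be run by hand to preview what the agent sees. The sketch below shells out to git and formats the result; the section labels and function names are illustrative, not the exact text injected into the system prompt.

```python
import subprocess
from pathlib import Path


def run_git(project: Path, *args: str) -> str:
    """Run a git command in the project directory; return '' on any failure."""
    try:
        result = subprocess.run(
            ["git", *args], cwd=project, capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return ""
    # rstrip only: leading spaces are significant in 'git status --short' output.
    return result.stdout.rstrip() if result.returncode == 0 else ""


def format_git_context(status: str, log: str) -> str:
    """Combine status and log output into one prompt section."""
    sections = []
    if status:
        sections.append("Uncommitted changes:\n" + status)
    if log:
        sections.append("Recent commits:\n" + log)
    return "\n\n".join(sections)


def git_context(project: Path) -> str:
    return format_git_context(
        run_git(project, "status", "--short"),
        run_git(project, "log", "--oneline", "-5"),
    )
```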

Memory Viewer

View, search, and delete learned facts in Settings > Memory > Learned Facts. Facts are color-coded by category:

Red: Corrections (highest confidence, 0.95)
Blue: Preferences (high confidence, 0.9)
Green: Reinforcements (high confidence, 0.9)
Gray: Knowledge, context, and other facts (variable confidence)

Each fact displays its content, confidence percentage, category, scope (global or project), and creation date. Delete individual facts by clicking the trash icon.

Prompt Cache Stats

The static portion of the system prompt (including memory, AURA.md, and role definitions) is cached by providers that support prompt caching (Anthropic, OpenAI, Google). View cache hit rates and token savings in Settings > Billing > Prompt Cache Stats. Higher cache hit rates mean faster response times and lower costs.

Memory Scope and Isolation

Global facts (preference, correction, reinforcement) are available to all tasks across all projects.
Project-scoped facts (knowledge, file_operation, context, behavior, goal) only appear for tasks in the same project path, so a React project's facts will not leak into a Python project.
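The isolation rule amounts to a simple filter. In this sketch the Fact class and its field names are hypothetical stand-ins for rows in the memory_facts table.

```python
from dataclasses import dataclass
from typing import Optional

GLOBAL_CATEGORIES = {"preference", "correction", "reinforcement"}


@dataclass
class Fact:
    category: str
    content: str
    project_path: Optional[str] = None  # None for global facts


def visible_facts(facts: list, current_project: str) -> list:
    """Keep global facts plus facts recorded for the current project path."""
    return [
        fact for fact in facts
        if fact.category in GLOBAL_CATEGORIES or fact.project_path == current_project
    ]
```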

Memory Injection

Before every task, the system prompt is enriched with relevant memory through a multi-layer injection process:

File-based memories: The top 15 entries from ~/.aura/memory/ (user scope) and .aura/memory/ (project scope) are loaded, within a budget of approximately 2,000 characters.
Keyword-searched facts: The user's message is tokenized into keywords and searched against the memory_facts table, filtered by project isolation rules, within a budget of approximately 1,500 characters.
Recent high-confidence facts: The 5 most recent facts with confidence scores at or above 0.8 are always included regardless of keyword matching.
Pre-compaction flush: Before context compaction occurs, file operation facts are extracted from messages that are about to be dropped, ensuring no learnings are lost during compression.

This multi-layer approach ensures that the agent always has access to the most relevant context, from explicit project rules to learned preferences and domain knowledge.
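The first three layers can be sketched as a budgeted merge. The limits (15 entries, 2000 and 1500 characters, 5 recent facts at 0.8 or higher) come from the list above; the function names, the input shapes, and the choice to stop at the first entry that overflows a budget are assumptions.

```python
def take_within_budget(entries: list, max_entries: int, char_budget: int) -> list:
    """Keep entries in order until the count or character budget runs out."""
    kept, used = [], 0
    for entry in entries[:max_entries]:
        if used + len(entry) > char_budget:
            break
        kept.append(entry)
        used += len(entry)
    return kept


def build_memory_block(file_memories: list, keyword_facts: list, recent_facts: list) -> str:
    """recent_facts is a list of (content, confidence) pairs, newest first."""
    lines = take_within_budget(file_memories, 15, 2000)
    lines += take_within_budget(keyword_facts, len(keyword_facts), 1500)
    recent = [content for content, confidence in recent_facts if confidence >= 0.8][:5]
    # Skip recent facts that an earlier layer already contributed.
    lines += [content for content in recent if content not in lines]
    return "\n".join(lines)
```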

AURA.md vs Memory

Understanding when to use project instructions versus memory:

Situation	Use	Why
"Every task in this project must use TypeScript strict"	AURA.md	Deterministic, version-controlled rule
"I prefer Python over JavaScript"	Memory	Auto-detected as preference across all projects
"Don't use Flask, use FastAPI"	Memory	Auto-detected as correction (0.95 confidence)
"All commits must follow conventional format"	AURA.md	Explicit project rule for consistency
"The database is PostgreSQL on port 5432"	Memory	Auto-extracted project knowledge fact

AURA.md holds rules you write explicitly: deterministic, version-controlled, and immediately effective.
Memory holds facts the agent learns from interactions: probabilistic, compounding over time, and adapting to corrections.

REST API Overview

Aura Workshop exposes approximately 140 REST endpoints through its embedded HTTP server (default port 18800). Every feature available in the desktop app is also accessible via HTTP. All endpoints are prefixed with /api unless otherwise noted.

Base URL: http://localhost:18800/api
Content-Type: application/json for all POST/PUT requests
Authentication: Optional Bearer token (configured in Settings > Connectivity)

Authentication

When a token is configured, include it as a Bearer token in all requests:

curl -H "Authorization: Bearer YOUR_TOKEN" http://localhost:18800/api/tasks

If no token is configured, all API requests are allowed without authentication. The /api/health and /api/heartbeat/incoming endpoints are always public.
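For scripts that call the API from Python rather than curl, a small standard-library helper could build the request. The helper name is hypothetical; it attaches the Bearer header only when a token is supplied, matching the optional authentication described above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:18800/api"


def build_request(path: str, token=None, payload=None, method=None) -> urllib.request.Request:
    """Build a request for an API path, adding auth and JSON headers as needed."""
    headers = {}
    data = None
    if token:
        headers["Authorization"] = f"Bearer {token}"
    if payload is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)
```

Passing the result to urllib.request.urlopen sends it, for example urlopen(build_request("/tasks", token="YOUR_TOKEN")).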

Error Handling

All API endpoints return standard HTTP status codes. Error responses include a JSON body with a human-readable error field and a machine-readable code field:

# Error response format
{
  "error": "Task not found",
  "code": "NOT_FOUND"
}

# Common HTTP status codes:
# 200 OK                    - Request succeeded
# 201 Created               - Resource created
# 400 Bad Request           - Invalid request body or parameters
# 401 Unauthorized          - Missing or invalid auth token
# 404 Not Found             - Resource does not exist
# 429 Too Many Requests     - Rate limited (forwarded from provider)
# 500 Internal Server Error - Server-side error
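A client might turn this error format into an exception as sketched below. The AuraApiError class and raise_for_error function are hypothetical; only the error and code fields come from the response format shown above.

```python
import json


class AuraApiError(Exception):
    """Raised for any response with a 4xx or 5xx status."""

    def __init__(self, status: int, message: str, code=None):
        super().__init__(f"{status} {code or 'ERROR'}: {message}")
        self.status = status
        self.message = message
        self.code = code


def raise_for_error(status: int, body: str) -> None:
    """Raise AuraApiError for error statuses; do nothing otherwise."""
    if status < 400:
        return
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        parsed = {}
    if not isinstance(parsed, dict):
        parsed = {}
    # Fall back to the raw body when the server did not send the JSON format.
    message = parsed.get("error", body or "Unknown error")
    raise AuraApiError(status, message, parsed.get("code"))
```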

Pagination

List endpoints that may return large result sets support optional limit and offset query parameters:

# Get the first 10 tasks
curl -H "Authorization: Bearer $TOKEN" \
  "http://localhost:18800/api/tasks?limit=10&offset=0"

# Get the next 10 tasks
curl -H "Authorization: Bearer $TOKEN" \
  "http://localhost:18800/api/tasks?limit=10&offset=10"

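To walk through every page, a client can keep increasing offset until a page comes back shorter than limit. The sketch below takes the page-fetching function as a parameter so that it makes no assumption about the response envelope; fetch_page is a hypothetical callable that returns the list of items for a given limit and offset.

```python
from typing import Callable, Iterator


def iterate_pages(fetch_page: Callable[[int, int], list], limit: int = 10) -> Iterator:
    """Yield items page by page until a short (or empty) page is returned."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            break
        offset += limit
```

When the total is an exact multiple of limit, the loop makes one extra request that returns an empty page before stopping.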
Tasks

Method	Endpoint	Description
`GET`	`/api/tasks`	List all tasks with status, title, timestamps, and model information
`POST`	`/api/tasks`	Create a new task with a message, optional file attachments, project path, model, and role
`GET`	`/api/tasks/{id}`	Get a specific task by ID
`DELETE`	`/api/tasks/{id}`	Delete a task and all its messages
`GET`	`/api/tasks/{id}/messages`	Get all messages for a task
`POST`	`/api/tasks/{id}/messages`	Send a follow-up message to an existing task
`GET`	`/api/tasks/interrupted`	List all interrupted tasks that can be resumed
`GET`	`/api/tasks/{id}/files`	List files created or modified by a task

# Create a task
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"message": "Create a Python script that generates prime numbers", "project_path": "/home/user/myproject"}' \
  http://localhost:18800/api/tasks

# Response
{
  "id": "task_abc123",
  "title": "Create a Python script that generates prime numbers",
  "status": "executing",
  "model": "claude-sonnet-4-20250514",
  "created_at": "2026-05-21T10:30:00Z"
}

Get Task Messages

# Get all messages for a task
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/tasks/task_abc123/messages

# Response
{
  "messages": [
    {
      "id": "msg_001",
      "role": "user",
      "content": "Create a Python script that generates prime numbers",
      "timestamp": "2026-05-21T10:30:00Z"
    },
    {
      "id": "msg_002",
      "role": "assistant",
      "content": "I'll create a Python script...",
      "tool_calls": [
        {
          "tool": "write_file",
          "input": {"path": "primes.py", "content": "..."},
          "output": "File written successfully",
          "status": "ok"
        }
      ],
      "timestamp": "2026-05-21T10:30:15Z"
    }
  ]
}

Send a Follow-Up Message

# Send a follow-up message to an existing task
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"message": "Also add a function to check if a number is prime"}' \
  http://localhost:18800/api/tasks/task_abc123/messages

List Interrupted Tasks

# Get all tasks that can be resumed
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/tasks/interrupted

# Response
{
  "tasks": [
    {
      "id": "task_xyz789",
      "title": "Build a REST API",
      "status": "interrupted",
      "interrupted_at": "2026-05-21T09:15:00Z"
    }
  ]
}

Conversations and Messages

Method	Endpoint	Description
`GET`	`/api/conversations`	List all conversations
`POST`	`/api/conversations`	Create a new conversation
`DELETE`	`/api/conversations/{id}`	Delete a conversation and its messages
`PUT`	`/api/conversations/{id}/title`	Update a conversation title
`GET`	`/api/conversations/{id}/messages`	Get all messages in a conversation
`POST`	`/api/conversations/{id}/messages`	Add a message to a conversation

Chat and Agent (SSE Streaming)

These endpoints return Server-Sent Events (SSE) streams for real-time output.

Method	Endpoint	Description
`POST`	`/api/chat/send`	Send a chat message and stream the response (no tool access)
`POST`	`/api/chat/enhanced`	Send a chat message with full tool access (streaming)
`POST`	`/api/agent/run`	Run a standalone agent task with full tool access (streaming)
`POST`	`/api/tasks/{id}/run`	Run an existing task (streaming)
`POST`	`/api/tasks/{id}/resume`	Resume an interrupted task (streaming)
`GET`	`/api/events`	Global SSE event stream for all task, workflow, and system events
`POST`	`/api/inference/stop`	Stop inference for a specific task (task ID in request body)

# Run an agent with SSE streaming
curl -N -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the capital of France?"}' \
  http://localhost:18800/api/agent/run

# SSE events received:
# data: {"type":"text","content":"The capital of France is Paris."}
# data: {"type":"done","status":"completed"}

The SSE stream emits events with types: text, tool_use, tool_result, thinking, plan, status, error, and done.
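A client can consume the stream by reading it line by line and decoding each data: line as JSON. This sketch assumes one complete JSON object per data: line, as in the sample output above; the function names are illustrative.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Decode the JSON payload of every 'data:' line in an SSE stream."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


def collect_text(events: Iterable[dict]) -> str:
    """Concatenate text events until the stream reports done or error."""
    parts = []
    for event in events:
        if event.get("type") == "text":
            parts.append(event.get("content", ""))
        elif event.get("type") in ("done", "error"):
            break
    return "".join(parts)
```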

Platform and Settings

Method	Endpoint	Description
`GET`	`/api/platform`	Returns the platform identifier (darwin, windows, linux)
`GET`	`/api/settings`	Get all application settings as a JSON object
`PUT`	`/api/settings`	Save application settings
`POST`	`/api/settings/test`	Test LLM connection with current settings
`POST`	`/api/email/test`	Send a test email to verify email configuration
`GET`	`/api/auth/check`	Verify that the provided auth token is valid

Listeners

Method	Endpoint	Description
`GET`	`/api/listeners`	List all listeners
`POST`	`/api/listeners`	Create a new listener
`PUT`	`/api/listeners/{id}`	Update a listener
`DELETE`	`/api/listeners/{id}`	Delete a listener
`POST`	`/api/listeners/{id}/start`	Start a listener
`POST`	`/api/listeners/{id}/stop`	Stop a listener
`POST`	`/api/listeners/{id}/toggle`	Toggle a listener enabled/disabled
`GET`	`/api/listeners/{id}/logs`	Get event logs for a listener
`GET`	`/api/listeners/statuses`	Get running/stopped status of all listeners
`GET`	`/api/listeners/platforms`	Get the list of supported listener platforms

# Create a Slack listener
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Slack Support Bot",
    "platform": "slack",
    "token": "xoxb-...",
    "trigger_type": "mentions",
    "prompt": "You are a helpful support agent. Answer user questions based on our documentation.",
    "model": "claude-sonnet-4-20250514",
    "enabled": true
  }' \
  http://localhost:18800/api/listeners

# Response
{
  "id": "listener_abc123",
  "name": "Slack Support Bot",
  "platform": "slack",
  "status": "stopped",
  "created_at": "2026-05-21T10:00:00Z"
}

# Start the listener
curl -X POST -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/listeners/listener_abc123/start

# Get event logs
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/listeners/listener_abc123/logs

# Response
{
  "events": [
    {
      "timestamp": "2026-05-21T10:05:00Z",
      "sender": "alice",
      "message": "@bot How do I reset my password?",
      "response": "To reset your password, go to Settings > Account...",
      "channel": "#support"
    }
  ]
}

# Get all supported platforms
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/listeners/platforms

Schedules

Method	Endpoint	Description
`GET`	`/api/schedules`	List all scheduled tasks
`POST`	`/api/schedules`	Create a new scheduled task
`PUT`	`/api/schedules/{id}`	Update a scheduled task
`DELETE`	`/api/schedules/{id}`	Delete a scheduled task
`POST`	`/api/schedules/{id}/toggle`	Toggle a scheduled task enabled/disabled

# Create a daily schedule
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Daily Report",
    "schedule_type": "daily",
    "time": "09:00",
    "message": "Generate a summary of git commits from yesterday",
    "duration_type": "forever"
  }' \
  http://localhost:18800/api/schedules

Webhooks

Method	Endpoint	Description
`GET`	`/api/webhooks`	List all webhooks
`POST`	`/api/webhooks`	Create a new webhook
`PUT`	`/api/webhooks/{id}`	Update a webhook
`DELETE`	`/api/webhooks/{id}`	Delete a webhook
`POST`	`/api/webhooks/{id}/toggle`	Toggle a webhook enabled/disabled
`GET`	`/api/webhooks/{id}/url`	Get the auto-generated URL for a webhook
`GET`	`/api/webhooks/{id}/logs`	Get invocation logs for a webhook

# Create a webhook
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "GitHub PR Webhook",
    "prompt": "A GitHub pull request event was received: {{payload}}. Review the PR changes and provide feedback.",
    "secret": "my-hmac-secret",
    "model": "claude-sonnet-4-20250514"
  }' \
  http://localhost:18800/api/webhooks

# Response
{
  "id": "webhook_abc123",
  "name": "GitHub PR Webhook",
  "url": "http://localhost:18790/webhooks/webhook_abc123",
  "enabled": true,
  "created_at": "2026-05-21T10:00:00Z"
}

# Get the webhook URL
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/webhooks/webhook_abc123/url

# Test the webhook with cURL
curl -X POST http://localhost:18790/webhooks/webhook_abc123 \
  -H "Content-Type: application/json" \
  -H "X-Hub-Signature-256: sha256=..." \
  -d '{"action":"opened","pull_request":{"title":"Add user auth","number":42}}'

Workflow API Examples

# Create a workflow
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Code Review Pipeline",
    "nodes": [
      {
        "id": "n1",
        "type": "agent_task",
        "prompt": "Analyze the code for bugs and security issues",
        "model": "claude-sonnet-4-20250514"
      },
      {
        "id": "n2",
        "type": "human_loop",
        "prompt": "Review the analysis and approve or reject"
      },
      {
        "id": "n3",
        "type": "agent_task",
        "prompt": "Generate a fix for the identified issues"
      }
    ],
    "edges": [
      {"from": "n1", "to": "n2", "port": "pass"},
      {"from": "n2", "to": "n3", "port": "approve"}
    ]
  }' \
  http://localhost:18800/api/workflows

# Run a workflow
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"input": {"repo": "my-org/my-repo", "branch": "feature/auth"}}' \
  http://localhost:18800/api/workflows/workflow_abc123/run

# Response
{
  "run_id": "run_xyz789",
  "workflow_id": "workflow_abc123",
  "status": "running",
  "started_at": "2026-05-21T10:00:00Z"
}

# Check workflow run status
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/workflow/runs/run_xyz789

# Approve a human loop step
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"action": "approve", "comment": "Looks good, proceed with the fix"}' \
  http://localhost:18800/api/workflow/approvals/approval_abc/resolve
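Before submitting a workflow definition, it can be useful to check locally that every edge points at a known node and that the graph is acyclic. The sketch below does this with Python's graphlib, using the nodes and edges fields from the example above. It is a client-side sanity check under those assumptions, not a description of the server's own validation.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(workflow: dict) -> list:
    """Return node IDs in a valid run order, or raise ValueError."""
    node_ids = {node["id"] for node in workflow["nodes"]}
    # Map each node to the set of nodes that must finish before it.
    graph = {node_id: set() for node_id in node_ids}
    for edge in workflow["edges"]:
        if edge["from"] not in node_ids or edge["to"] not in node_ids:
            raise ValueError(f"edge references unknown node: {edge}")
        graph[edge["to"]].add(edge["from"])
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError("workflow contains a cycle") from exc
```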

Skills and Role Skills

Method	Endpoint	Description
`GET`	`/api/skills`	List all installed skills
`DELETE`	`/api/skills/{name}`	Delete a skill by name
`GET`	`/api/skills/{name}/content`	Get the content/prompt of a skill
`PUT`	`/api/skills/{name}/content`	Update a skill's content
`GET`	`/api/skill-settings`	Get skill settings
`PUT`	`/api/skill-settings`	Update skill settings
`GET`	`/api/role-skills`	List all role skills
`POST`	`/api/role-skills`	Save (create or update) a role skill
`GET`	`/api/role-skills/{name}`	Get a specific role skill by name
`DELETE`	`/api/role-skills/{name}`	Delete a role skill

MCP Servers

Method	Endpoint	Description
`GET`	`/api/mcp/servers`	List all configured MCP servers
`POST`	`/api/mcp/servers`	Save (create or update) an MCP server configuration
`DELETE`	`/api/mcp/servers/{id}`	Delete an MCP server configuration
`POST`	`/api/mcp/servers/{id}/connect`	Connect to an MCP server
`POST`	`/api/mcp/servers/{id}/disconnect`	Disconnect from an MCP server
`GET`	`/api/mcp/statuses`	Get connection status for all MCP servers
`GET`	`/api/mcp/tools`	List all tools from connected MCP servers

# Add an MCP server (stdio transport)
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "File System MCP",
    "transport": "stdio",
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/projects"],
    "env": {},
    "isolation": "shared"
  }' \
  http://localhost:18800/api/mcp/servers

# Add an MCP server (HTTP transport with OAuth)
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Cloud MCP",
    "transport": "http",
    "url": "https://mcp.example.com/sse",
    "oauth": {
      "client_id": "my-client-id",
      "client_secret": "my-client-secret"
    },
    "isolation": "per_task"
  }' \
  http://localhost:18800/api/mcp/servers

# Connect to an MCP server
curl -X POST -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/mcp/servers/mcp_abc123/connect

# List all available MCP tools
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/mcp/tools

# Response
{
  "tools": [
    {
      "name": "mcp_filesystem_read_file",
      "server": "File System MCP",
      "description": "Read a file from the filesystem",
      "input_schema": {
        "type": "object",
        "properties": {
          "path": {"type": "string", "description": "File path to read"}
        },
        "required": ["path"]
      }
    }
  ]
}

# Get connection statuses
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/mcp/statuses

# Response
{
  "statuses": {
    "mcp_abc123": {"connected": true, "tool_count": 5},
    "mcp_def456": {"connected": false, "error": "Connection refused"}
  }
}

Plugins

Method	Endpoint	Description
`GET`	`/api/plugins`	List all installed plugins
`POST`	`/api/plugins`	Install or update a plugin
`DELETE`	`/api/plugins/{id}`	Uninstall a plugin
`POST`	`/api/plugins/{id}/toggle`	Enable or disable a plugin

Workflows

Method	Endpoint	Description
`GET`	`/api/workflows`	List all automation workflows
`POST`	`/api/workflows`	Create a new workflow
`GET`	`/api/workflows/{id}`	Get a specific workflow
`PUT`	`/api/workflows/{id}`	Update a workflow
`DELETE`	`/api/workflows/{id}`	Delete a workflow
`POST`	`/api/workflows/{id}/run`	Run a workflow
`GET`	`/api/workflow/runs/{run_id}`	Get the status of a workflow run
`POST`	`/api/workflow/approvals/{id}/resolve`	Resolve a human approval request (approve or reject)

Teams

Method	Endpoint	Description
`GET`	`/api/teams`	List all teams
`POST`	`/api/teams`	Create a new team
`PUT`	`/api/teams/{id}`	Update a team
`DELETE`	`/api/teams/{id}`	Delete a team
`POST`	`/api/teams/run`	Run a team task

# Run a team task
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "team_id": "software-dev-team",
    "message": "Build a REST API for user management with authentication"
  }' \
  http://localhost:18800/api/teams/run

Credentials

Method	Endpoint	Description
`GET`	`/api/credentials`	List all stored credentials (metadata only, values not returned)
`POST`	`/api/credentials`	Save a new credential (value is encrypted before storage)
`GET`	`/api/credentials/{id}`	Get a credential with decrypted value (biometric auth gated)
`DELETE`	`/api/credentials/{id}`	Delete a credential

# Save a credential
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Slack Bot Token",
    "type": "token",
    "value": "xoxb-1234567890-abcdefghij"
  }' \
  http://localhost:18800/api/credentials

# Response
{
  "id": "cred_abc123",
  "name": "Slack Bot Token",
  "type": "token",
  "created_at": "2026-05-21T10:00:00Z"
}

# List credentials (values are never included in list responses)
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/credentials

# Response
{
  "credentials": [
    {"id": "cred_abc123", "name": "Slack Bot Token", "type": "token", "created_at": "2026-05-21T10:00:00Z"},
    {"id": "cred_def456", "name": "SSH Deploy Key", "type": "ssh_key", "created_at": "2026-05-20T14:00:00Z"}
  ]
}

# Get decrypted value (requires biometric auth on desktop)
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/credentials/cred_abc123

Cloud Storage

Method	Endpoint	Description
`GET`	`/api/cloud/connectors`	List cloud storage connectors
`POST`	`/api/cloud/connectors`	Save (create or update) a cloud connector
`DELETE`	`/api/cloud/connectors/{id}`	Delete a cloud connector

# Save a cloud storage connector
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Production S3",
    "type": "s3",
    "config": {
      "bucket": "my-aura-files",
      "region": "us-east-1",
      "access_key": "AKIAIOSFODNN7EXAMPLE",
      "secret_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
    }
  }' \
  http://localhost:18800/api/cloud/connectors

# List connectors
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/cloud/connectors

# Response
{
  "connectors": [
    {
      "id": "cloud_abc123",
      "name": "Production S3",
      "type": "s3",
      "status": "connected"
    }
  ]
}

Aura AI and Models

Method	Endpoint	Description
`GET`	`/api/aura/status`	Get the current status of the local inference server
`GET`	`/api/aura/models/hf`	Scan for HuggingFace-cached models
`GET`	`/api/aura/models/gguf`	Scan for local GGUF model files
`GET`	`/api/aura/models/curated`	Get the curated model list (recommended downloads)
`POST`	`/api/aura/models/download`	Start downloading a model from HuggingFace
`POST`	`/api/aura/models/download/cancel`	Cancel an in-progress model download
`POST`	`/api/aura/models/delete`	Delete a downloaded local model
`GET`	`/api/aura/lan-nodes`	List discovered LAN inference nodes

# Check inference server status
curl -H "Authorization: Bearer $TOKEN" http://localhost:18800/api/aura/status

# Response
{
  "running": true,
  "model": "qwen3-8b-q4_k_m.gguf",
  "port": 8080,
  "gpu_layers": -1
}

Inference Cluster

Method	Endpoint	Description
`GET`	`/api/cluster/status`	Get cluster status (loaded model, workers, running state)
`GET`	`/api/cluster/workers`	List discovered and claimed workers
`POST`	`/api/cluster/workers/{id}/add`	Claim a discovered worker (master-side)
`POST`	`/api/cluster/workers/{id}/remove`	Disconnect a claimed worker (master-side)
`POST`	`/api/cluster/inference/start`	Start distributed inference with full parameters
`POST`	`/api/cluster/inference/stop`	Stop distributed inference
`POST`	`/api/cluster/join`	Accept a master's claim request (worker-side)
`POST`	`/api/cluster/leave`	Leave the cluster (worker-side)
`GET`	`/api/cluster/worker/status`	Report worker status (worker-side)

Billing and Spend

Method	Endpoint	Description
`GET`	`/api/billing/summary`	Get usage summary grouped by provider
`GET`	`/api/billing/limits`	Get spend limits for all providers
`POST`	`/api/billing/limits`	Save a spend limit for a provider
`GET`	`/api/billing/fallback-order`	Get the provider fallback order
`POST`	`/api/billing/fallback-order`	Save the provider fallback order
`GET`	`/api/billing/pricing`	Get model pricing table
`POST`	`/api/billing/pricing`	Save model pricing entries
`POST`	`/api/billing/reset`	Reset all usage tracking data
`GET`	`/api/billing/daily`	Get daily usage statistics
`GET`	`/api/billing/daily-by-model`	Get daily usage broken down by model
`GET`	`/api/routing/stats`	Get smart routing statistics (tasks per tier, cost savings)

# Get billing summary
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/billing/summary

# Response
{
  "today": {
    "total_cost": 2.45,
    "total_input_tokens": 125000,
    "total_output_tokens": 45000
  },
  "month": {
    "total_cost": 38.72,
    "by_provider": {
      "anthropic": {"cost": 28.50, "input_tokens": 1200000, "output_tokens": 450000},
      "openai": {"cost": 8.22, "input_tokens": 500000, "output_tokens": 200000},
      "groq": {"cost": 2.00, "input_tokens": 800000, "output_tokens": 300000}
    }
  }
}

# Set a spend limit
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"provider": "anthropic", "monthly_limit": 50.00, "alert_threshold": 40.00}' \
  http://localhost:18800/api/billing/limits

# Set provider fallback order
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"order": ["anthropic", "openai", "groq", "deepseek"]}' \
  http://localhost:18800/api/billing/fallback-order
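
The spend limit and fallback order configured above can be read together when deciding which provider should serve a request. The sketch below is a guess at that logic, not Aura Workshop's implementation: the helper names `spend_status` and `pick_provider`, the status strings, and the rule that a provider at or over its monthly limit is skipped are all assumptions. Only the `monthly_limit`, `alert_threshold`, and `order` fields come from the requests above.

```python
from typing import Dict, List, Optional


def spend_status(month_cost: float, monthly_limit: float, alert_threshold: float) -> str:
    """Classify month-to-date spend (the status names are hypothetical)."""
    if month_cost >= monthly_limit:
        return "over_limit"
    if month_cost >= alert_threshold:
        return "alert"
    return "ok"


def pick_provider(
    order: List[str],
    month_costs: Dict[str, float],
    limits: Dict[str, Dict[str, float]],
) -> Optional[str]:
    """Return the first provider in the fallback order that is under its limit."""
    for provider in order:
        limit = limits.get(provider)
        if limit is None:
            return provider  # assumption: no configured limit means no cap
        cost = month_costs.get(provider, 0.0)
        status = spend_status(cost, limit["monthly_limit"], limit["alert_threshold"])
        if status != "over_limit":
            return provider
    return None  # every provider in the order is over its limit


# Month-to-date costs taken from the billing summary example above.
costs = {"anthropic": 28.50, "openai": 8.22, "groq": 2.00}
limits = {"anthropic": {"monthly_limit": 50.00, "alert_threshold": 40.00}}
order = ["anthropic", "openai", "groq", "deepseek"]
print(pick_provider(order, costs, limits))
```

Returning `None` when every provider is capped leaves the caller to decide whether to reject the request or queue it.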

# Get daily usage
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/billing/daily

# Response
{
  "daily": [
    {"date": "2026-05-21", "cost": 2.45, "requests": 15},
    {"date": "2026-05-20", "cost": 3.12, "requests": 22},
    {"date": "2026-05-19", "cost": 1.87, "requests": 10}
  ]
}

# Get smart routing statistics
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/routing/stats

# Response
{
  "total_tasks": 150,
  "by_tier": {
    "simple": {"count": 60, "actual_cost": 1.20, "hypothetical_cost": 12.00},
    "standard": {"count": 50, "actual_cost": 8.50, "hypothetical_cost": 15.00},
    "complex": {"count": 30, "actual_cost": 15.00, "hypothetical_cost": 18.00},
    "reasoning": {"count": 10, "actual_cost": 8.00, "hypothetical_cost": 8.00}
  },
  "total_savings": 20.30
}
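
The per-tier figures in this response are enough to recompute savings on the client side. The sketch below assumes that savings means `hypothetical_cost` minus `actual_cost` for each tier, which the response suggests but does not state. The function names are illustrative.

```python
from typing import Dict


def tier_savings(by_tier: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Savings per tier, assuming savings = hypothetical_cost - actual_cost."""
    return {
        tier: round(stats["hypothetical_cost"] - stats["actual_cost"], 2)
        for tier, stats in by_tier.items()
    }


def total_savings(by_tier: Dict[str, Dict[str, float]]) -> float:
    """Sum of the per-tier savings, rounded to cents."""
    return round(sum(tier_savings(by_tier).values()), 2)


# The by_tier object from the response above.
by_tier = {
    "simple": {"count": 60, "actual_cost": 1.20, "hypothetical_cost": 12.00},
    "standard": {"count": 50, "actual_cost": 8.50, "hypothetical_cost": 15.00},
    "complex": {"count": 30, "actual_cost": 15.00, "hypothetical_cost": 18.00},
    "reasoning": {"count": 10, "actual_cost": 8.00, "hypothetical_cost": 8.00},
}
for tier, saved in tier_savings(by_tier).items():
    print(f"{tier}: {saved:.2f}")
```

A tier whose savings are zero, such as `reasoning` here, is already running on the model it would have used without routing.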

Voice

Method	Endpoint	Description
`GET`	`/api/voice/voices`	List available TTS voices for the current provider
`POST`	`/api/voice/transcribe`	Transcribe an audio file to text (multipart form data)
`POST`	`/api/voice/save-temp`	Save a temporary audio file from a recording
`POST`	`/api/voice/speak`	Convert text to speech and return audio data

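# Transcribe audio (replace <field> with the multipart form field name and <audio-file> with the path to your recording)
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -F "<field>=@<audio-file>" \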
  http://localhost:18800/api/voice/transcribe

# Response
{"text": "Create a REST API for managing user accounts"}

Files

Method	Endpoint	Description
`GET`	`/api/files`	Download a file by path (query parameter: `path`)
`POST`	`/api/files/upload`	Upload a file (multipart form data)
`GET`	`/api/files/list`	List files in a directory

Data Management

Method	Endpoint	Description
`POST`	`/api/data/clear-history`	Clear all conversation and task history
`POST`	`/api/data/reset-keys`	Reset all stored API keys
`POST`	`/api/data/reset-database`	Reset the database to factory defaults
`POST`	`/api/data/clear-model-cache`	Clear cached model metadata
`POST`	`/api/data/reset-all`	Reset all application data

Memory

Method	Endpoint	Description
`GET`	`/api/memories`	List all saved agent memories
`POST`	`/api/memories`	Save a new memory entry
`DELETE`	`/api/memories/{name}`	Delete a specific memory entry

# List all memories
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/memories

# Response
{
  "memories": [
    {
      "name": "user-preferences.md",
      "type": "user",
      "content": "# User Preferences\n- Prefers Python over JavaScript\n- Uses FastAPI for APIs\n- Follows conventional commits",
      "scope": "global",
      "updated_at": "2026-05-21T09:00:00Z"
    },
    {
      "name": "project-context.md",
      "type": "project",
      "content": "# Project Context\n- Uses PostgreSQL on port 5432\n- Test framework: pytest",
      "scope": "project",
      "project_path": "/home/user/myproject",
      "updated_at": "2026-05-20T15:00:00Z"
    }
  ]
}

# Save a new memory
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "team-conventions.md",
    "type": "reference",
    "content": "# Team Conventions\n- All PRs require two approvals\n- Use squash merge only"
  }' \
  http://localhost:18800/api/memories

# Delete a memory
curl -X DELETE -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/memories/team-conventions.md

Projects

Method	Endpoint	Description
`GET`	`/api/projects`	List all projects
`POST`	`/api/projects`	Create a new project
`GET`	`/api/projects/{id}`	Get a specific project
`PUT`	`/api/projects/{id}`	Update a project
`DELETE`	`/api/projects/{id}`	Delete a project
`GET`	`/api/projects/{id}/tasks`	List tasks associated with a project
`GET`	`/api/projects/{id}/memory`	Get project-scoped memory facts

# Create a project
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My Web App",
    "path": "/home/user/projects/webapp"
  }' \
  http://localhost:18800/api/projects

# Response
{
  "id": "proj_abc123",
  "name": "My Web App",
  "path": "/home/user/projects/webapp",
  "task_count": 0,
  "created_at": "2026-05-21T10:00:00Z"
}

# List tasks for a project
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/projects/proj_abc123/tasks

# Get project-scoped memory facts
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/projects/proj_abc123/memory

# Response
{
  "facts": [
    {
      "category": "knowledge",
      "content": "Project uses PostgreSQL on port 5432",
      "confidence": 0.85,
      "created_at": "2026-05-20T14:00:00Z"
    },
    {
      "category": "file_operation",
      "content": "File modified: src/models/user.py - added email validation",
      "confidence": 0.7,
      "created_at": "2026-05-21T09:30:00Z"
    }
  ]
}

Viewer and Remote Deployment

Method	Endpoint	Description
`GET`	`/api/viewer/status`	Get deployment status including daemon_mode and uptime
`GET`	`/api/viewer/items`	Get deployed items (schedules, listeners, webhooks)
`GET`	`/api/viewer/tasks`	Get recent tasks on the remote deployment
`GET`	`/api/viewer/tasks/{id}/messages`	Get task messages for a remote task
`GET`	`/api/viewer/events`	SSE event stream for real-time remote monitoring
`GET`	`/api/remote-deployments`	List all remote agent deployments

# Get remote deployment status
curl -H "Authorization: Bearer $TOKEN" \
  http://remote-host:18800/api/viewer/status

# Response
{
  "daemon_mode": "full",
  "uptime_seconds": 86400,
  "version": "1.30.1",
  "platform": "linux",
  "active_tasks": 2,
  "active_listeners": 5,
  "active_schedules": 3
}

# List deployed items on a remote agent
curl -H "Authorization: Bearer $TOKEN" \
  http://remote-host:18800/api/viewer/items

# Response
{
  "schedules": [
    {"id": "sched_001", "name": "Daily Report", "enabled": true, "next_run": "2026-05-22T09:00:00Z"}
  ],
  "listeners": [
    {"id": "listen_001", "name": "Slack Bot", "platform": "slack", "status": "online"}
  ],
  "webhooks": [
    {"id": "hook_001", "name": "GitHub Hook", "enabled": true}
  ]
}

# List all remote deployments from the master
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/remote-deployments

# Response
{
  "deployments": [
    {
      "id": "deploy_abc123",
      "hostname": "gpu-server-1.local",
      "mode": "full",
      "status": "online",
      "last_heartbeat": "2026-05-21T10:29:30Z",
      "uptime_seconds": 172800
    },
    {
      "id": "deploy_def456",
      "hostname": "inference-node-2.local",
      "mode": "inference-worker",
      "status": "online",
      "last_heartbeat": "2026-05-21T10:29:45Z",
      "uptime_seconds": 86400
    }
  ]
}

Slash Commands

Method	Endpoint	Description
`GET`	`/api/slash-commands`	List all custom slash commands
`POST`	`/api/slash-commands`	Create a new slash command
`GET`	`/api/slash-commands/{id}`	Get a specific slash command
`PUT`	`/api/slash-commands/{id}`	Update a slash command
`DELETE`	`/api/slash-commands/{id}`	Delete a slash command

# Create a slash command
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "summarize",
    "description": "Summarize the current conversation or document",
    "prompt": "Summarize the following content in 3-5 bullet points: {{input}}",
    "handler_type": "agent_task"
  }' \
  http://localhost:18800/api/slash-commands

# List slash commands
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/slash-commands

# Response
{
  "commands": [
    {
      "id": "cmd_abc123",
      "name": "summarize",
      "description": "Summarize the current conversation or document",
      "handler_type": "agent_task",
      "enabled": true
    }
  ]
}

Custom Providers

Method	Endpoint	Description
`GET`	`/api/providers/custom`	List custom providers
`POST`	`/api/providers/custom`	Save (create or update) a custom provider
`DELETE`	`/api/providers/custom/{id}`	Delete a custom provider

# Add a custom provider
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My vLLM Server",
    "base_url": "https://my-vllm.example.com/v1",
    "api_key": "my-api-key",
    "api_format": "openai",
    "auth_type": "bearer"
  }' \
  http://localhost:18800/api/providers/custom

# List custom providers
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/providers/custom

Other Endpoints

Method	Endpoint	Description
`GET`	`/api/environment`	Get system environment information (OS, architecture, memory, etc.)
`GET`	`/api/diagnostics`	Run system diagnostics and return results
`GET`	`/api/web-server/status`	Get the web server's current status and configuration
`GET`	`/api/health`	Health check (always public, returns `{"status":"ok"}`)
`POST`	`/api/heartbeat/incoming`	Receive heartbeat from remote deployment nodes (always public)
`GET`	`/api/image-proxy`	Proxy external images (for PDF export and CORS bypass)
`GET`	`/api/dependencies/check`	Check status of system dependencies (Docker, Node.js, Python, etc.)
`GET`	`/api/design-systems`	List available design system tokens and themes
`GET`	`/charts/{filename}`	Serve generated chart image files

WebSocket Endpoints

Path	Description
`/ws/browser`	Chrome extension browser automation commands and responses
`/ws/sidepanel`	Chrome extension side panel chat with real-time streaming

API Compatibility Layer

The aura-inference local inference server exposes multiple compatibility APIs, allowing it to act as a drop-in replacement for popular inference services.

OpenAI-Compatible API

Method	Path	Description
`POST`	`/v1/chat/completions`	Chat completion (streaming and non-streaming)
`POST`	`/v1/completions`	Text completion
`POST`	`/v1/embeddings`	Text embeddings
`GET`	`/v1/models`	List available models

# Chat completion (OpenAI format)
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "local-model",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'
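
With `"stream": true`, OpenAI-style servers return the reply as server-sent events: one `data:` line per JSON chunk, ending with `data: [DONE]`. The sketch below reassembles the text from such lines. It assumes aura-inference follows the same chunk layout as the OpenAI API, which this reference does not spell out, and `iter_stream_text` is an illustrative name.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style streaming lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample lines in the assumed format.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

The same function can be fed the decoded lines of a live HTTP response. It is shown here on a literal list so it runs without a server.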

Anthropic-Compatible API

Method	Path	Description
`POST`	`/v1/messages`	Anthropic Messages API format

Ollama-Compatible API

Method	Path	Description
`POST`	`/api/chat`	Ollama chat format
`POST`	`/api/generate`	Ollama generate format
`GET`	`/api/tags`	List available models (Ollama format)
`GET`	`/api/ps`	List running models
`POST`	`/api/show`	Show model information

API Gateway

Aura Workshop also exposes an API gateway that allows external tools to use it as a proxy for LLM requests. Point any OpenAI or Anthropic SDK at your Aura Workshop instance and it will route requests to whatever provider and model you have configured, applying spend limits, fallback, and logging automatically. This is particularly useful for:

Centralizing LLM access through a single endpoint for your team
Applying spend limits and usage tracking to all requests
Using local models with tools that only support OpenAI/Anthropic APIs
Testing different models by changing the server-side configuration without modifying client code

Method	Path	Description
`POST`	`/v1/chat/completions`	OpenAI-compatible chat completion (proxied to configured provider)
`POST`	`/v1/messages`	Anthropic-compatible messages (proxied to configured provider)
`GET`	`/v1/models`	List available models from all configured providers

License

Method	Endpoint	Description
`POST`	`/api/license/validate`	Validate and activate a license key
`GET`	`/api/license/status`	Get the current license status and tier

aura-cli (Go TUI)

Aura Workshop includes a standalone CLI, aura-cli (v1.0.0), a Go binary that provides a terminal user interface built with Bubble Tea. The CLI connects to an Aura Workshop server instance for inference.

Usage

aura-cli [flags]

Flags

Flag	Long Form	Description	Default
`-s`	`--server`	Server URL to connect to	`http://localhost:18800`
`-m`	`--model`	Model to use for inference	(server default)
`-t`	`--token`	Authentication token for the server	(none)
	`--password`	Password for authentication	(none)
`-p`	`--project`	Project path to attach as the working directory	(current directory)
	`--chat`	Chat mode: no tool access, conversation only	(agent mode)
`-v`	`--verbose`	Verbose output for debugging	off
	`--version`	Print version information and exit	(n/a)

Modes

Agent mode (default): Full tool access. The agent can read/write files, run commands, search the web, and use all configured tools.
Chat mode (--chat): Conversational only. No tools are available. Ideal for quick Q&A without giving the agent access to your system.

Pipe Mode

The CLI supports piping input via stdin for non-interactive use. This is useful for scripting and CI/CD integration:

# Pipe a prompt directly
echo "Explain this error" | aura-cli -s http://my-server:18800 -t my-token

# Pipe a file as context
cat error.log | aura-cli -m claude-sonnet-4-20250514 -t my-token

# Use in shell scripts
git diff HEAD~1 | aura-cli --chat -m claude-sonnet-4-20250514 -t my-token \
  -s http://my-server:18800

In pipe mode, the CLI reads stdin until EOF, sends the content as the user message, and outputs the agent's response to stdout. This makes it easy to integrate Aura Workshop into existing command-line workflows and automation scripts.

Configuration File

Default settings can be stored in ~/.aura/config.json:

{
  "server": "http://localhost:18800",
  "token": "my-auth-token",
  "model": "claude-sonnet-4-20250514"
}

Examples

# Connect to a remote server with auth
aura-cli -s http://my-server:18800 -t my-secret-token

# Use a specific model in chat mode
aura-cli -m claude-sonnet-4-20250514 --chat

# Attach a project directory
aura-cli -p /path/to/my/project

aura-inference CLI

The aura-inference binary can be used standalone from the command line for local model serving.

Serve Mode

aura-inference serve --model <path-to-model.gguf> --port 8080 [options]

Option	Description	Default
`--model`	Path to the GGUF model file	(required)
`--port`	HTTP server port	8080
`--gpu-layers`	Number of layers to offload to GPU (-1 = all)	-1
`--quantization`	Quantization type (Q4_K_M, Q5_K_M, Q8_0, F16)	Q4_K_M
`--ctx-size`	Context window size in tokens	4096
`--batch-size`	Inference batch size	512
`--flash-attn`	Enable flash attention	disabled
`--cache-type-k`	KV cache key type (q8_0, f16)	q8_0
`--cache-type-v`	KV cache value type (q8_0, f16)	q8_0
`--thinking`	Enable thinking/reasoning mode	disabled
`--rpc-workers`	Comma-separated list of RPC workers (host:port)	(none)

aura-inference serve \
  --model ~/.cache/aura-inference/models/qwen3-8b-q4_k_m.gguf \
  --port 8080 \
  --gpu-layers -1 \
  --ctx-size 8192 \
  --batch-size 1024 \
  --flash-attn \
  --thinking

RPC Worker Mode

aura-inference rpc --host 0.0.0.0 --port 50052

Starts the binary in RPC worker mode, contributing GPU resources to a master node. The master specifies workers via the --rpc-workers flag with a comma-separated list of host:port pairs:

# Master with two RPC workers
aura-inference serve \
  --model ~/.cache/aura-inference/models/llama-3.3-70b-q4_k_m.gguf \
  --port 8080 \
  --gpu-layers -1 \
  --rpc-workers worker1.local:50052,worker2.local:50052

Each worker contributes its GPU VRAM to the inference cluster. The master distributes model layers proportionally across all available GPUs (local plus remote workers). This enables running models that are too large for any single GPU.
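
To make "proportionally" concrete, the sketch below splits a model's layers across GPUs in proportion to their VRAM using a largest-remainder rule. This is an illustration only: the reference does not document how aura-inference actually assigns layers, and the function name, node names, and VRAM figures are hypothetical.

```python
from typing import Dict


def split_layers(total_layers: int, vram_gb: Dict[str, float]) -> Dict[str, int]:
    """Assign layers to nodes in proportion to VRAM (largest-remainder rounding)."""
    total_vram = sum(vram_gb.values())
    if total_layers < 0 or total_vram <= 0:
        raise ValueError("need a non-negative layer count and positive total VRAM")
    # Exact fractional share for each node, then floor it.
    exact = {name: total_layers * v / total_vram for name, v in vram_gb.items()}
    alloc = {name: int(share) for name, share in exact.items()}
    # Hand the layers lost to flooring to the nodes with the largest remainders.
    leftover = total_layers - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - alloc[n], reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


# Hypothetical cluster: one local GPU plus two RPC workers.
plan = split_layers(80, {"local": 24.0, "worker1.local": 24.0, "worker2.local": 16.0})
for node, layers in plan.items():
    print(node, layers)
```

A real scheduler would also have to reserve VRAM for the KV cache and activations, so raw VRAM is only a first approximation of capacity.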

aura-daemon CLI

The daemon binary is used inside Docker containers for headless operation.

aura-daemon --mode <mode>

Mode	Description
`full`	Complete daemon with all features enabled
`inference-master`	Inference cluster master node
`inference-worker`	Inference cluster worker node
`worker`	Task execution worker

The daemon reads all its configuration from environment variables. See Docker Daemon for the full environment variable reference.

Database Schema

Aura Workshop uses a single SQLite database in WAL (Write-Ahead Logging) mode with 35+ tables. The schema is created and migrated automatically on launch.

Table	Purpose
`settings`	Application configuration stored as key-value pairs
`conversations`	Chat conversation metadata (title, timestamps)
`messages`	Individual messages within conversations (role, content, timestamps)
`tasks`	Agent task metadata: status, model, project path, role, timestamps, classification
`task_messages`	Messages within agent tasks including tool calls and results
`task_memory`	Role handoff data, compaction summaries, and checkpoint state
`memory_facts`	LLM-extracted structured facts with category, confidence, scope, and content
`scheduled_tasks`	Schedule definitions: type, time, cron expression, duration, enabled state
`listeners`	Listener configurations: platform, token, trigger type, rules, prompt
`listener_events`	Event logs for listener activity: timestamp, message, user, response
`webhooks`	Webhook endpoint configurations: name, secret, prompt, model
`webhook_logs`	Invocation logs for webhooks: timestamp, request body, response
`workflows`	DAG workflow definitions: nodes, edges, metadata
`workflow_runs`	Workflow execution state: per-step status, timestamps, data context
`mcp_servers`	MCP server configurations: name, transport, command/URL, isolation mode
`credentials`	AES-256-GCM encrypted credential values with name and type metadata
`credential_pool`	Shared credential pools for team-level credential sharing
`custom_providers`	User-defined LLM provider endpoints: name, URL, API format, auth type
`spend_limits`	Per-provider monthly spend caps
`token_usage`	Per-request token usage and cost tracking: model, provider, input/output tokens, cost
`cloud_connectors`	AWS S3, GCS, Azure Blob connector configurations
`teams`	Multi-agent team definitions: name, roles, workflow type
`skill_settings`	Per-skill configuration overrides
`projects`	Project definitions: name, path, metadata
`plugins`	Plugin configurations: name, version, enabled state, settings
`slash_commands`	Custom slash command definitions: name, description, prompt, handler type
`design_systems`	Design system token definitions for design skills
`remote_deployments`	Remote agent deployment records: host, mode, status, pairing code
`task_files`	Files created or modified by agent tasks: path, operation, task ID
`routing_logs`	Smart routing decision logs: task ID, tier, model, score
`provider_health`	Provider health tracking: latency, error count, circuit breaker state
`ollama_models`	Cached Ollama model metadata
`aura_models`	Local GGUF model metadata from scanning
`model_pricing`	Input/output pricing per million tokens for each model
`fallback_order`	Provider fallback priority sequence

Network Ports Reference

Port	Protocol	Service	Notes
18800	TCP	Web UI + REST API	Main application port. Serves the SolidJS frontend and all REST API endpoints.
18801	UDP	LAN inference discovery	Broadcast/listen for inference cluster nodes. 30-second interval, 90-second expiry.
18790	TCP	Webhook receiver	Incoming webhook payloads from external services.
8080	TCP	Aura AI local inference	Local GGUF model inference server. Configurable port.
11434	TCP	Ollama	Default Ollama server port. Auto-detected by Aura Workshop.
50052	TCP	RPC (distributed inference)	RPC between the master and its workers for distributed model layer inference. Workers listen on this port.
1420	TCP	Vite dev server	Development only. Used when running the frontend in development mode.

Data Storage Locations

Data	Location
Database (macOS)	`~/Library/Application Support/aura-workshop/aura-workshop.db`
Database (Linux)	`~/.local/share/aura-workshop/aura-workshop.db`
Database (Windows)	`%APPDATA%\aura-workshop\aura-workshop.db`
Database (Docker)	`/data/aura-workshop.db`
User memory	`~/.aura/memory/`
Custom roles	`~/.aura/roles/`
GGUF models	`~/.cache/aura-inference/models/`
HuggingFace cache	`~/.cache/huggingface/hub/`
Project memory	`.aura/memory/` (relative to project root)
Project instructions	`AURA.md` or `CLAUDE.md` (in project root)
CLI config	`~/.aura/config.json`
Skills	Bundled in app resources; custom skills in the database

License Tiers

Tier	Features
Community	Core features: single-agent tasks, basic automation (schedules, listeners, webhooks), local inference via Aura AI and Ollama, all built-in tools, all skills, web UI access.
Professional	Everything in Community plus: multi-agent teams, advanced automation rules, all 31 listener platforms, smart routing, spend tracking and limits.
Business	Everything in Professional plus: visual DAG workflow editor, remote agent deployment, inference cluster management, advanced merge strategies.
Enterprise	Everything in Business plus: credential pools, priority support, custom integrations, SLA guarantees.

The application works fully in Community mode without a license key. Enter a license key in Settings > Security to unlock higher tiers.

License Activation

To activate a license:

Go to Settings > Security.
Enter your license key in the License Key field.
Click Activate. The system validates the key against the license server.
On successful validation, the license tier and expiration date are displayed.
Features for the new tier become available immediately without restart.

License validation can also be performed via the REST API: POST /api/license/validate with the key in the request body. Check current license status at any time with GET /api/license/status.

Feature Comparison

Feature	Community	Professional	Business	Enterprise
Single-agent tasks	Yes	Yes	Yes	Yes
Built-in tools (50+)	Yes	Yes	Yes	Yes
All skills (100+)	Yes	Yes	Yes	Yes
Local inference (Aura AI)	Yes	Yes	Yes	Yes
Web UI access	Yes	Yes	Yes	Yes
Basic automation	Yes	Yes	Yes	Yes
Multi-agent teams	--	Yes	Yes	Yes
Smart routing	--	Yes	Yes	Yes
All 31 listener platforms	--	Yes	Yes	Yes
Spend tracking & limits	--	Yes	Yes	Yes
Visual DAG workflows	--	--	Yes	Yes
Remote deployment	--	--	Yes	Yes
Inference cluster	--	--	Yes	Yes
Credential pools	--	--	--	Yes
Priority support	--	--	--	Yes

Troubleshooting & FAQ

macOS quarantine prevents launch

After installing from DMG, macOS may quarantine the application. You will see a message like "Aura Workshop cannot be opened because it is from an unidentified developer." Open Terminal and run:

xattr -cr /Applications/Aura\ Workshop.app

Then launch the app again from Applications or Spotlight.

No models appear in the model selector

Verify you have configured at least one provider with a valid API key on the Models page.
If using Ollama, ensure it is running (ollama serve) on localhost:11434.
If using Aura AI, click Scan to detect models and then click Launch to start the inference server.
Check that the provider's API key has not expired or been revoked.

Task stuck in "Executing" state

Click the Stop button to cancel the running task.
Check the model configuration: the selected model may be unavailable or the API key may be invalid.
If Stop does not respond immediately, the task may be waiting on a long-running tool operation (browser actions time out after 45 seconds, team/workflow tasks after 30 minutes).
Resume the task from the Dashboard after fixing the underlying issue.

Web UI not accessible from another machine

Ensure the web server is enabled in Settings > Connectivity.
Check that port 18800 (or your configured port) is not blocked by a firewall.
If using authentication, include the Bearer token in your requests.
Verify the server is running: curl http://localhost:18800/api/health should return {"status":"ok"}.
Ensure the machine's network allows incoming connections on the configured port.

Context window errors or truncation

Context compression triggers automatically at 70% of the model's context window. For very long tasks, the system compresses older messages into a summary.
Try using a model with a larger context window (such as Gemini 2.0 with 1M tokens).
Break complex tasks into smaller, focused sub-tasks to reduce context accumulation.

Docker mode not available

Docker must be installed and running. Aura Workshop auto-detects Docker on startup.
If Docker is not detected, the app defaults to native mode automatically.
You can manually toggle native mode in Settings > General.

Local inference not starting

Ensure you have downloaded at least one GGUF model via the HuggingFace Downloader.
Check that the model file exists in ~/.cache/aura-inference/models/.
Verify the GPU layer count is appropriate for your hardware. If your GPU has insufficient VRAM, reduce the number of GPU layers.
On macOS, Metal acceleration is used automatically. On Linux, CUDA requires an NVIDIA GPU with current drivers.
Check that port 8080 (or your configured port) is not already in use by another application.

Inference cluster workers not discovered

Ensure all machines are on the same LAN subnet.
Verify UDP port 18801 is not blocked by any firewall on either the master or worker machines.
Docker deployments must use host networking (--network host) for UDP broadcast to function correctly.
Workers expire after 90 seconds without broadcasting. Check that the worker process is still running.
Confirm that lan_discovery_enabled is set to true in settings on both master and worker.
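
The timing rules above (a broadcast every 30 seconds, expiry after 90 seconds of silence) can be sketched as a small bookkeeping helper. The function names and the `last_seen` mapping are hypothetical. Treating a worker that is exactly 90 seconds old as still live is also an assumption.

```python
from typing import Dict, List

BROADCAST_INTERVAL_S = 30  # discovery broadcast interval from the ports reference
EXPIRY_S = 90              # workers expire after this long without a broadcast


def live_workers(last_seen: Dict[str, float], now: float) -> List[str]:
    """Return the workers whose most recent broadcast is within the expiry window."""
    return sorted(w for w, seen in last_seen.items() if now - seen <= EXPIRY_S)


def missed_broadcasts(seen: float, now: float) -> int:
    """Number of full broadcast intervals that have passed since the last one."""
    return int((now - seen) // BROADCAST_INTERVAL_S)


# Hypothetical timestamps in seconds: worker2 has been silent for 95 seconds.
last_seen = {"worker1": 1000.0, "worker2": 905.0}
print(live_workers(last_seen, now=1000.0))
print(missed_broadcasts(905.0, now=1000.0))
```

With these numbers a worker survives two dropped broadcasts and disappears from the list once it has missed a third.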

MCP server connection failures

For stdio transport: verify the command path is correct, the binary exists, and it is executable.
For HTTP transport: verify the URL is reachable with a curl test.
Check MCP server logs for startup errors or version incompatibilities.
Some MCP servers require specific Node.js or Python versions. Ensure the correct runtime is installed.

Agent blocked from running a command

The safety guardrails block dangerous commands automatically. The error message explains why the command was blocked.
The elevatedBash setting allows sudo commands but does not bypass the safety guardrails.
If you believe a command was blocked incorrectly, check whether it matches any of the blocked patterns (rm -rf /, fork bombs, mkfs, dd on block devices).

Provider rate limit errors (HTTP 429)

Configure the provider fallback order in Settings > Billing so the system can automatically switch to another provider.
Set spend limits to control costs and prevent unexpected charges.
Some providers with free tiers (Groq, z.ai) have strict rate limits. Consider using them as secondary fallback providers rather than primary.

Database corruption or issues

Use Settings > Data > Diagnostics to check database health and integrity.
The database uses WAL mode, which is resilient to most crash scenarios.
As a last resort, use Settings > Data > Reset Database to start fresh (this deletes all data permanently).

How do I use Aura Workshop without any internet connection?

Download one or more GGUF models via the HuggingFace Downloader while you have internet access. Then use the Aura AI local inference engine to run those models entirely on your hardware. No internet connection is required for local inference, and all data stays on your machine.

Listener not receiving messages from a platform

Verify the authentication token is correct and has not expired.
For Slack: ensure the bot has been invited to the channel it is monitoring.
For Discord: verify the bot has the necessary permissions (Read Messages, Send Messages) in the target server and channel.
For Telegram: confirm the bot token is valid using the Telegram Bot API (https://api.telegram.org/bot<token>/getMe).
For WhatsApp: ensure you have completed the Meta Business verification and the WhatsApp Business API is properly configured.
Check the listener's trigger rules: if set to "mentions" mode, messages without the bot mention will be ignored.
View the listener's event logs for errors: go to Listeners > View Logs on the specific listener.

Webhook not triggering tasks

Verify the webhook URL is correct and accessible from the sending service. Test it with curl.
If using HMAC-SHA256 validation, ensure the sending service is computing the signature correctly against the shared secret.
Confirm the sending service targets port 18790, where webhooks are served (separate from the main web UI port 18800).
View webhook invocation logs for errors: go to Webhooks > View Logs on the specific webhook.
Ensure the webhook is enabled (the toggle switch should be on).
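
When debugging HMAC-SHA256 validation, it helps to compute the signature yourself and compare it with what the sender produces. The sketch below uses only the Python standard library. It assumes the signature is the hex digest of the raw request body keyed with the shared secret. The header that carries the signature and its exact encoding are not specified in this reference, so check them against your sender.

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw request body (encoding is an assumption)."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, body: bytes, received_hex: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected, received_hex.strip().lower())


# Hypothetical secret and payload.
secret = "my-webhook-secret"
body = b'{"event": "push", "repository": "webapp"}'
signature = sign_payload(secret, body)
print(verify_signature(secret, body, signature))
```

Sign the exact bytes that are sent on the wire. Re-serializing the JSON before signing changes whitespace and key order, which is a common cause of mismatches.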

Schedule not running at the expected time

Schedule times are in the local timezone of the machine running Aura Workshop.
For Docker deployments, the container timezone may differ from your local timezone. Set the TZ environment variable.
Verify the schedule is enabled (toggle switch is on).
Check that the duration type has not expired (for "repeat_until" schedules).
If no model is configured for the schedule, it uses the globally selected model, which must be available.

Voice input not working

Ensure microphone permissions are granted to the application.
On macOS, check System Settings > Privacy & Security > Microphone.
Verify the speech-to-text provider is configured in Settings > Integrations.
If using a cloud provider (Whisper, Groq, OpenAI), ensure the corresponding API key is set.
The System provider uses the operating system's built-in speech recognition, which requires no API key.

Smart routing not saving money as expected

Review the tier boundaries in Settings > Routing. The default boundaries may not match your workload.
Check the routing analytics to see which tiers are being used most.
If most tasks are being routed to the Complex or Reasoning tiers, consider adjusting the boundary scores so that more tasks fall into the cheaper Simple and Standard tiers.
Ensure you have configured cheaper models for the Simple and Standard tiers.

Memory facts not being extracted

Memory extraction happens after task completion, not during execution.
Very short conversations (single-turn Q&A) may not produce extractable facts.
Check Settings > Memory > Learned Facts to see what has been extracted.
Corrections and reinforcements require specific trigger phrases ("no", "wrong", "instead" for corrections; "yes", "perfect", "great" for reinforcements).

AURA.md not being detected

The file must be named exactly AURA.md, .aura.md, CLAUDE.md, .claude.md, or .aura/INSTRUCTIONS.md.
The file must be in the project root or in one of the two parent directories directly above it.
File size is capped at 4 KB per file, 12 KB total. Larger files are truncated.
Ensure the task has a project path set so the system knows where to look.
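
The detection rules above can be checked from the outside with a short script. The sketch below walks the project root and up to two parent directories, looks for the accepted file names, and applies the 4 KB per-file and 12 KB total caps. The search order and the idea that every match is loaded are assumptions, and the function names are illustrative. It is not the application's own loader.

```python
from pathlib import Path
from typing import List, Tuple

INSTRUCTION_NAMES = ["AURA.md", ".aura.md", "CLAUDE.md", ".claude.md", ".aura/INSTRUCTIONS.md"]
MAX_PARENT_LEVELS = 2
PER_FILE_CAP = 4 * 1024
TOTAL_CAP = 12 * 1024


def find_instruction_files(project_root: Path) -> List[Path]:
    """List matching files in the project root and up to two parent directories."""
    root = project_root.resolve()
    search_dirs = [root] + list(root.parents)[:MAX_PARENT_LEVELS]
    found = []
    for directory in search_dirs:
        for name in INSTRUCTION_NAMES:
            candidate = directory / name
            if candidate.is_file():
                found.append(candidate)
    return found


def load_instructions(project_root: Path) -> List[Tuple[Path, bytes]]:
    """Read each match, truncating to the per-file cap and the overall cap."""
    remaining = TOTAL_CAP
    loaded = []
    for path in find_instruction_files(project_root):
        if remaining <= 0:
            break
        data = path.read_bytes()[: min(PER_FILE_CAP, remaining)]
        loaded.append((path, data))
        remaining -= len(data)
    return loaded


if __name__ == "__main__":
    for path, data in load_instructions(Path.cwd()):
        print(path, len(data), "bytes")
```

Running it from the project directory shows which files would qualify and how many bytes of each survive the caps.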

How do I update to the latest version?

Go to Settings > Updates and click "Check for Updates". If a new version is available, click "Download and Install". The application downloads the update, verifies its integrity, and restarts automatically. For Docker deployments, pull the latest image: docker pull coolkoo/aura-workshop:daemon-latest.

Can I use Aura Workshop as an API server for other applications?

Yes. The API Gateway feature exposes OpenAI-compatible and Anthropic-compatible endpoints at /v1/chat/completions, /v1/messages, and /v1/models. Point any SDK or tool at your Aura Workshop instance and it will proxy requests to your configured provider with automatic fallback, spend limits, and logging.

How do I back up my Aura Workshop data?

All application data is stored in a single SQLite database file. To create a backup:

Locate the database file for your platform (see Data Storage Locations).
Copy the aura-workshop.db file to a safe location. Because the database uses WAL mode, also copy aura-workshop.db-wal and aura-workshop.db-shm if they exist.
Optionally back up the ~/.aura/ directory to preserve user memories, custom roles, and CLI configuration.
For Docker deployments, the /data volume mount contains the database.
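
The copy step above can be scripted. The following sketch copies aura-workshop.db together with its -wal and -shm companion files when they are present. The function name and the directory arguments are hypothetical; pass the data directory for your platform. Closing Aura Workshop before copying is the safer choice, because files copied while the database is being written may not form a consistent snapshot.

```python
import shutil
from pathlib import Path

DB_NAME = "aura-workshop.db"
# Companion files that exist while the database is in WAL mode.
SIDECAR_SUFFIXES = ("-wal", "-shm")


def backup_database(data_dir, backup_dir):
    """Copy the database and any WAL companion files; return the names copied."""
    data_dir = Path(data_dir)
    backup_dir = Path(backup_dir)

    if not (data_dir / DB_NAME).is_file():
        raise FileNotFoundError(f"{DB_NAME} not found in {data_dir}")

    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in [DB_NAME] + [DB_NAME + suffix for suffix in SIDECAR_SUFFIXES]:
        source = data_dir / name
        if source.is_file():
            # copy2 preserves timestamps along with the file contents.
            shutil.copy2(source, backup_dir / name)
            copied.append(name)
    return copied
```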

How do I migrate Aura Workshop to a new machine?

Install Aura Workshop on the new machine.
Copy the database file from the old machine to the appropriate location on the new machine.
Copy the ~/.aura/ directory for memories and roles.
Copy the ~/.cache/aura-inference/models/ directory if you want to keep downloaded GGUF models.
Launch Aura Workshop on the new machine. It will read the existing database and restore your configuration.
Note: credential encryption keys are stored in the OS keychain and cannot be migrated. You will need to re-enter API keys and credentials on the new machine.

Can multiple users share one Aura Workshop instance?

Yes. The web UI can be accessed by multiple users simultaneously. Each user creates their own tasks, and all tasks are visible to all users. For multi-user environments, configure a Bearer token for authentication. Note that there is no per-user access control in the current version; all authenticated users have full access to all features and data.

What happens when the context window fills up?

When the conversation context reaches 70% of the model's maximum context window, automatic compression kicks in. The system summarizes older messages into a condensed form, preserving the most important information while freeing up context space. This process is transparent and the task continues without interruption. The context usage percentage is visible in the Context Panel on the right sidebar.
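
The 70% rule is simple arithmetic. The sketch below shows the check under the assumption that usage is measured in tokens and that compression starts once usage is at or above the threshold. The function names and the window size in the example are hypothetical.

```python
COMPRESSION_THRESHOLD_PERCENT = 70  # threshold stated above


def context_usage_percent(tokens_used, context_window):
    """Percentage of the model's context window currently in use."""
    return 100 * tokens_used / context_window


def should_compress(tokens_used, context_window):
    """True once usage reaches the threshold (integer math avoids rounding issues)."""
    return tokens_used * 100 >= context_window * COMPRESSION_THRESHOLD_PERCENT


# For a hypothetical 128,000-token window, the threshold falls at 89,600 tokens.
threshold_tokens = 128_000 * COMPRESSION_THRESHOLD_PERCENT // 100
```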

How do parallel agents handle file conflicts?

When multiple parallel agents modify the same file, the merge executor resolves conflicts using the configured strategy: CopyOnWrite (creates side-by-side copies, safest option), LastWins (last agent's version prevails), or LLMResolve (an LLM intelligently merges the changes). The merge also handles dependency files (package.json, Cargo.toml, requirements.txt) with smart dependency merging that combines packages from all agents.
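
To make the three strategies concrete, here is a minimal sketch of how each could treat one conflicting file, plus a simple union merge for requirements.txt. It is an illustration under stated assumptions, not the merge executor itself. The side-by-side file naming, the llm_merge callback, and the function names are hypothetical, and the dependency merge does not reconcile differing version pins for the same package.

```python
from enum import Enum


class MergeStrategy(Enum):
    COPY_ON_WRITE = "CopyOnWrite"
    LAST_WINS = "LastWins"
    LLM_RESOLVE = "LLMResolve"


def resolve_conflict(path, versions, strategy, llm_merge=None):
    """Resolve one conflicting file.

    versions: list of (agent_name, content) in the order the agents finished.
    Returns a dict mapping output path to content.
    """
    if strategy is MergeStrategy.LAST_WINS:
        # The last agent's version replaces all earlier ones.
        return {path: versions[-1][1]}
    if strategy is MergeStrategy.COPY_ON_WRITE:
        # Keep every version side by side (naming scheme is hypothetical).
        return {f"{path}.{agent}": content for agent, content in versions}
    if strategy is MergeStrategy.LLM_RESOLVE:
        if llm_merge is None:
            raise ValueError("LLMResolve needs an llm_merge callback")
        # Delegate the merge to a caller-supplied function that calls a model.
        return {path: llm_merge([content for _, content in versions])}
    raise ValueError(f"Unknown strategy: {strategy}")


def merge_requirements(contents):
    """Union of requirement lines from all agents, keeping first-seen order."""
    merged, seen = [], set()
    for content in contents:
        for line in content.splitlines():
            line = line.strip()
            if line and not line.startswith("#") and line not in seen:
                seen.add(line)
                merged.append(line)
    return "\n".join(merged) + "\n"
```

CopyOnWrite never discards work, which is why it is described as the safest option, but it leaves the final reconciliation to a person or a later step.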

Supported Providers Summary

Provider	API Format	Auth Type	Free Tier
Anthropic	Anthropic native	API key header	No
OpenAI	OpenAI	Bearer token	No
Google	Google Gemini native	Query parameter	Yes (Flash, Flash-Lite)
MiniMax	OpenAI-compatible	Bearer token	No
DeepSeek	OpenAI-compatible	Bearer token	No
Mistral AI	OpenAI-compatible	Bearer token	No
Zhipu AI (z.ai)	OpenAI-compatible	Bearer token	Yes
Xiaomi / MiMo	OpenAI-compatible	Bearer token	No
Moonshot / Kimi	OpenAI-compatible	Bearer token	No
OpenRouter	OpenAI-compatible	Bearer token	Yes (many free models)
Together AI	OpenAI-compatible	Bearer token	No
Groq	OpenAI-compatible	Bearer token	Yes (rate-limited)
SiliconFlow	OpenAI-compatible	Bearer token	No
Ollama	OpenAI-compatible	None	Free (local)
LM Studio	OpenAI-compatible	None	Free (local)
LocalAI	OpenAI-compatible	None	Free (local)
vLLM	OpenAI-compatible	None	Free (local)
TGI	OpenAI-compatible	None	Free (local)
SGLang	OpenAI-compatible	None	Free (local)
Aura AI	OpenAI-compatible	None	Free (bundled)