Aura Workshop v1.30.1 Documentation
Welcome to the complete reference for Aura Workshop, a model-agnostic AI agent orchestration platform built for individuals and teams who want full control over their AI workflows. This documentation covers every feature from first install through advanced multi-agent teams, 31-platform automation, distributed GPU inference clusters, visual DAG workflows, remote agent deployment, the full REST API with approximately 140 endpoints, 50+ agent tools, 100+ skills, and the command-line interface. Whether you are running a quick question-and-answer session or orchestrating a fleet of agents across multiple machines, this guide has you covered.
Installation
System Requirements
Before installing, confirm your system meets the minimum requirements for your platform.
| Platform | Minimum Requirements |
|---|---|
| macOS | macOS 12 (Monterey) or later. Xcode Command Line Tools are recommended for full functionality. Apple Silicon (M-series) provides Metal GPU acceleration for local inference. |
| Windows | Windows 10 64-bit or later. An NVIDIA GPU with current drivers is recommended for local inference with CUDA support. |
| Linux | A modern 64-bit distribution with webkit2gtk 4.1 and libayatana-appindicator (or their equivalents). NVIDIA drivers and CUDA toolkit are recommended for GPU inference. |
Install from Package
macOS
Download the .dmg installer from the Downloads page. Open the DMG file and drag the Aura Workshop icon into your Applications folder. Because the app is distributed outside the Mac App Store, macOS may quarantine it on first launch. If you see a security warning, open Terminal and run the following command to clear the quarantine flag:
xattr -cr /Applications/Aura\ Workshop.app
After running this command, launch the app normally from Applications or Spotlight.
Windows
Download the NSIS installer .exe from the Downloads page. Run the installer and follow the standard installation wizard prompts. Choose your installation directory (the default is C:\Program Files\Aura Workshop), click Install, and wait for the process to complete. Launch Aura Workshop from the Start Menu or desktop shortcut.
Linux
Two package formats are available:
- .deb package (Debian, Ubuntu, and derivatives): Install using dpkg:
sudo dpkg -i aura-workshop_1.30.1_amd64.deb
If dependency errors occur, follow up with sudo apt-get install -f to resolve them.
- AppImage (any distribution): Mark the file as executable and run it directly:
chmod +x Aura_Workshop-1.30.1.AppImage
./Aura_Workshop-1.30.1.AppImage
No installation is required. The AppImage is self-contained and portable.
Docker
For headless server deployments, pull the pre-built Docker image from Docker Hub:
docker pull coolkoo/aura-workshop:daemon-latest
See the Docker Daemon section for full setup instructions including environment variables, volume mounts, and GPU passthrough.
Build from Source
Building from source requires Node.js 18 or later and a Rust toolchain installed via rustup. You also need the platform-specific Tauri dependencies documented in the Tauri v2 prerequisites guide.
# Clone the repository
git clone https://github.com/coolkoo/aura-workshop.git
cd aura-workshop
# Install JavaScript dependencies
npm install
# Build the desktop application
npm run tauri build
You can target a specific package format by passing the --bundles flag:
npm run tauri build -- --bundles dmg # macOS DMG disk image
npm run tauri build -- --bundles nsis # Windows NSIS installer
npm run tauri build -- --bundles deb # Linux .deb package
npm run tauri build -- --bundles appimage # Linux AppImage
The compiled binary and installer will appear in src-tauri/target/release/bundle/.
First Launch & Setup Wizard
The very first time you open Aura Workshop, a five-step setup wizard guides you through initial configuration. Each step can be skipped and revisited later from the Settings page.
Step 1: Welcome
An introductory screen that explains the core capabilities of the platform. Click Next to proceed.
Step 2: License Key Entry (Optional)
If you have purchased a license, enter the key here to unlock Professional, Business, or Enterprise features. If you do not have a key, click Skip to continue in Community mode. Community mode provides full access to single-agent tasks, basic automation, local inference, and core tools. You can enter or upgrade your license at any time from Settings.
Step 3: Provider Configuration
Select your preferred AI provider from a list that includes Anthropic, OpenAI, Google, and many more. Enter your API key for the chosen provider. The wizard includes a Test Connection button that verifies the key is valid and the provider is reachable before saving. You can configure additional providers later from the Models page.
Step 4: Local Model Download (Optional)
If you want to run models locally without any cloud dependency, this step offers a curated list of GGUF models you can download directly. The models range from compact 5 GB options to large 42 GB models for maximum quality. Downloads show real-time progress, speed, and estimated time remaining. Skip this step if you plan to use only cloud providers.
Step 5: Ready
Setup is complete. You are taken directly to the main Dashboard where you can begin your first task.
Behind the Scenes on First Launch
In addition to the wizard, the application performs several initialization steps automatically:
- Creates a local SQLite database in WAL (Write-Ahead Logging) mode with 35+ tables covering settings, conversations, tasks, teams, workflows, billing, credentials, and more.
- Detects whether Docker is installed and available. If Docker is not found, the application enables native mode automatically so that agent commands run directly on your operating system.
- Installs all bundled skills across document, Anthropic, superpowers, desktop apps, design, platform, and media categories.
- Initializes the credential encryption system with an AES-256-GCM encryption key stored securely in the operating system keychain (macOS Keychain, Windows Credential Manager, or Linux secret service).
- Starts the embedded Web UI server on port 18800.
- Auto-starts any listeners that were previously enabled in a prior session.
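To make the first initialization step concrete, here is a minimal sketch of opening a SQLite database in WAL mode with Python's standard sqlite3 module. The file name and the settings table are placeholders chosen for illustration; they are not the application's actual schema.

```python
import os
import sqlite3
import tempfile


def open_wal_database(path: str) -> sqlite3.Connection:
    """Open (or create) a SQLite database and switch it to WAL journaling."""
    conn = sqlite3.connect(path)
    # PRAGMA journal_mode returns the mode that is now in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        raise RuntimeError(f"could not enable WAL, journal mode is {mode}")
    return conn


# Demo against a throwaway file; the table below is purely illustrative.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = open_wal_database(demo_path)
conn.execute("CREATE TABLE IF NOT EXISTS settings (key TEXT PRIMARY KEY, value TEXT)")
conn.execute("INSERT OR REPLACE INTO settings VALUES ('theme', 'dark')")
conn.commit()
journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
conn.close()
```

WAL mode is stored in the database file itself, so later connections to the same file see it without setting the pragma again.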
Database Locations
| Platform | Path |
|---|---|
| macOS | ~/Library/Application Support/aura-workshop/aura-workshop.db |
| Windows | %APPDATA%\aura-workshop\aura-workshop.db |
| Linux | ~/.local/share/aura-workshop/aura-workshop.db |
| Docker | /data/aura-workshop.db (via volume mount) |
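If you script backups or inspections of the database, a small helper like the following sketch can resolve the path from the table above. The platform labels and the default placeholders for the home and APPDATA directories are assumptions of this example; in a real script you would pass the expanded values.

```python
import ntpath
import posixpath


def default_db_path(platform: str, home: str = "~", appdata: str = "%APPDATA%") -> str:
    """Return the database path listed in the table for a platform label."""
    name = "aura-workshop.db"
    if platform == "macos":
        return posixpath.join(home, "Library", "Application Support", "aura-workshop", name)
    if platform == "windows":
        return ntpath.join(appdata, "aura-workshop", name)
    if platform == "linux":
        return posixpath.join(home, ".local", "share", "aura-workshop", name)
    if platform == "docker":
        # Inside the container the database lives on the mounted volume.
        return "/data/" + name
    raise ValueError(f"unknown platform: {platform}")
```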
Application Layout
The Aura Workshop interface centers on a left sidebar with an icon rail for primary navigation, a scrollable middle section for project and task organization, and a main content area that fills the rest of the screen.
Left Sidebar Icon Rail
The icon rail runs vertically along the left edge and is always visible. The top section contains fixed navigation icons:
| Icon | Label | View |
|---|---|---|
| Grid | Dashboard | Home screen with task history, quick actions, and notifications |
| Robot | Agents | Primary task workspace for all AI interactions |
| Headset | Listeners | Messaging platform listeners and automation triggers |
| Link | Webhooks | HTTP webhook endpoint management |
| Clock | Schedulers | Scheduled task definitions and controls |
Scrollable Middle Section
Below the primary navigation icons, the sidebar has two collapsible sections:
- Projects: A collapsible list of your projects. Each project shows a name and a badge with the count of associated tasks. Click the + button next to the Projects header to create a new project. Projects support drag-and-drop: drag a task from the task list onto a project name to associate them. Click the expand/collapse arrow to toggle the project list.
- Tasks: A searchable, scrollable list of all tasks. Each task displays its title and a colored status dot indicating its current state (green for completed, blue for executing, red for failed, yellow for interrupted, gray for waiting, purple for planning). Use the search field above the task list to filter by title. Click any task to open it in the main content area.
Bottom Fixed Section
Three icons are pinned to the bottom of the sidebar and are always accessible regardless of scroll position:
- Settings (gear icon): Opens the multi-tab Settings panel.
- Report Bug: Opens a bug report dialog where you can describe issues and submit feedback.
- Help / Docs: Opens documentation and help resources.
Sidebar Behavior
The sidebar is collapsible. Click the collapse button or drag the resize handle at the sidebar's right edge to adjust its width. On mobile and narrow screens, the sidebar collapses into a hamburger menu accessible from the top-left corner. Tapping the hamburger icon reveals the full sidebar as a slide-over overlay.
Dashboard Overview
The Dashboard is the home screen that greets you each time you open Aura Workshop. It provides a high-level view of your activity and offers quick pathways to start new tasks.
Greeting
At the top of the Dashboard, a randomized greeting message appears (for example, "Good afternoon" or "Welcome back"). This text rotates on each visit to keep the experience fresh.
Dependencies Banner
If the system detects that required dependencies are missing (such as Docker, Node.js, or Python), a prominent banner appears at the top of the Dashboard with a description of what is missing and a direct link to Settings where you can resolve the issue. This banner disappears once all dependencies are satisfied.
Quick Action Cards
Below the greeting, the Dashboard presents a grid of Quick Action Cards organized into eight categories. Each card contains a brief description and a one-click prompt that creates a new task pre-filled with a relevant starting instruction.
| Category | Example Prompts |
|---|---|
| Software Dev | Build a REST API, set up a CI/CD pipeline, create a system design document, analyze a dataset |
| Marketing | Draft a content calendar, create an email campaign, run a competitive analysis |
| Finance | Build a financial model, create a P&L analysis, design a budget framework |
| Legal | Review a contract, draft an NDA, generate a compliance checklist |
| Research | Conduct a literature review, design an A/B test, analyze survey responses |
| Design | Create a design system, perform an accessibility audit, write user stories |
| Operations | Write a runbook, design an onboarding workflow, create an SLA framework |
| Education | Build a course curriculum, write a tutorial, design a grading rubric |
Clicking any card immediately creates a new task and begins execution with the selected prompt.
Recent Tasks
The main body of the Dashboard displays a grid of recent tasks. Each task card shows:
- Title: The first line or auto-generated summary of the task prompt.
- Status badge: A color-coded label indicating the task's current state:
  - EXECUTING (blue): The agent is actively working on the task.
  - COMPLETED (green): The task finished successfully.
  - FAILED (red): The task encountered an unrecoverable error.
  - INTERRUPTED (yellow): The task was stopped by the user or by a crash and can be resumed.
  - WAITING (gray): The task is queued and has not yet started execution.
  - PLANNING (purple): The task is in planning mode, generating an execution plan before running.
- Time ago: A relative timestamp showing when the task was created or last updated (for example, "2 hours ago").
- Quick actions: Each task card has action buttons that appear on hover or tap:
  - Run / Resume: Re-execute a completed task or resume an interrupted one.
  - Delete: Remove the task and all its messages permanently (with confirmation).
  - Fork: Create a duplicate of the task with the same context and conversation history for trying a different approach.
Completed Task Notifications
When a task finishes while you are viewing the Dashboard or another screen, a toast notification slides in from the corner. The notification shows the task title and its final status. These notifications auto-dismiss after 8 seconds, or you can click them to navigate directly to the completed task.
Task Composer
A persistent task composer bar sits at the bottom of the Dashboard and of every other screen. This is the single entry point for all interactions with the AI. The composer includes the following controls:
- Textarea: The main text input where you type your task description, question, or follow-up message. Press Enter to send or Shift+Enter for a new line.
- Files button: Opens a multi-select file picker allowing you to attach documents, images, or other files as context for the task. Supported image formats (JPEG, PNG, GIF, WebP) are sent as vision inputs to multimodal models. Other file types are read as text.
- Folder button: Opens a directory picker to mount a project folder. The selected folder becomes the agent's working directory, and file operations are scoped to that path.
- Tools button: Opens the MCP tool picker, which displays all available tools grouped by their MCP server. Each tool shows its name and a brief description. Toggle individual tools on or off to control which capabilities the agent can use for this task.
- Voice button: Activates the microphone for speech-to-text input. When recording, a pulsing red dot appears on the button. Speak your task description and the audio is transcribed and inserted into the textarea automatically.
- Plan/Execute toggle: A clipboard icon that toggles between Plan mode and Execute mode. When Plan mode is active (the icon turns cyan), the agent generates a detailed execution plan before taking any action. When Plan mode is off, the agent proceeds directly to execution.
- Role picker: A dropdown that lets you assign a specific role to the task. The picker includes a search field to filter roles, a category filter, and a "Default (no role)" option at the top. Selecting a role applies that role's system prompt and tool permissions to the task.
- Project selector: A dropdown listing all your projects. Selecting a project associates the new task with that project and ensures project-scoped memory is available.
- Thinking mode toggle: When the selected model supports extended reasoning, this control appears with four levels: Off, Low, Medium, and High. Higher thinking levels cause the model to reason more deeply before responding, producing more thorough but slower output.
- Model selector: A dropdown listing all configured and available models, grouped by provider. Select a model to use for the current task. The selected model persists as the default for subsequent tasks.
- Send button: A paper plane icon that submits the task. Alternatively, press Enter when the textarea is focused.
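The Files button's split between vision inputs and text can be sketched as a simple lookup on the file extension. This is an illustration only: the function name is hypothetical, the ".jpg" spelling is added as an assumption, and the application may well inspect file contents rather than extensions.

```python
from pathlib import PurePath

# Extensions for the image formats listed above; ".jpg" is included as the
# common alternative spelling of ".jpeg" (an assumption of this sketch).
IMAGE_EXTENSIONS = {".jpeg", ".jpg", ".png", ".gif", ".webp"}


def attachment_kind(filename: str) -> str:
    """Return "vision" for supported image formats, otherwise "text"."""
    suffix = PurePath(filename).suffix.lower()
    return "vision" if suffix in IMAGE_EXTENSIONS else "text"
```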
Starting a Task
There is no separate "Chat Mode" or "Agent Mode" in Aura Workshop. Everything flows through the single unified input described above. To start a task:
- Type your request in the input textarea at the bottom of the screen.
- Optionally click Files to attach one or more files for context.
- Optionally click Folder to mount a project directory as the agent's working path.
- Optionally click Tools to open the MCP tool picker and enable or disable specific tools. Tools are grouped by their MCP server name, and each shows the tool name plus a brief description.
- Optionally click Voice to record a spoken prompt. A pulsing red dot indicates active recording.
- Optionally toggle Plan/Execute (clipboard icon, cyan when Plan is on) to have the agent create an execution plan before acting.
- Optionally select a Role from the role picker. The picker includes search, category filtering, and a "Default (no role)" option.
- Optionally set Thinking mode (Off / Low / Medium / High) when the selected model supports it.
- Optionally choose a different model from the model selector dropdown.
- Press Enter or click the Send button (paper plane icon).
The system automatically classifies your input and routes it to the appropriate execution path. You never need to manually select a mode.
Task Header Bar
When you are inside an active or completed task, a header bar appears at the top of the content area. It contains the following elements:
- Back button: An arrow-left icon with the label "Back" that returns you to the Dashboard or task list.
- Task title: The title or first-line summary of the task prompt. For long titles, the text is truncated with an ellipsis.
- Status badge: A color-coded label showing the current task state:
  - COMPLETED (green)
  - FAILED (red)
  - INTERRUPTED (yellow)
  - EXECUTING (blue)
  - PLANNING (purple)
  - WAITING (gray)
- IM platform badge: For tasks that originated from a listener (Slack, Discord, Telegram, WhatsApp, or another platform), a badge appears showing the platform icon, platform name, and channel or conversation name.
- Timer: While the task is executing, a running timer displays the elapsed time in minutes and seconds.
- Action buttons (visible when the task is not actively running):
  - Fork: Creates a copy of the task with the same context and full conversation history. Use this to explore alternative approaches without losing the original.
  - Export: Opens the complete task conversation as a styled HTML page that you can save as a PDF from your browser's print dialog.
  - Show/Hide: Toggles the visibility of tool call details within the message thread. When hidden, only the tool name and status are shown without the full input/output.
- Context panel toggle (hamburger icon): Opens or closes the right sidebar Context Panel, which shows workflow progress, parallel agent status, thinking content, and usage statistics.
Message Thread
The main area of the task workspace is a scrollable message thread that displays the entire conversation between you and the agent.
User Messages
Your messages appear as right-aligned bubbles. Any file or folder attachments you included are displayed as chips below the message text. File chips show the file name with a small icon, and folder chips show the directory path. Both types have an X button to remove them from context for subsequent messages.
Assistant Messages
Agent responses appear as left-aligned content blocks. During streaming, text arrives progressively with an activity indicator showing the agent's current operation. Responses are rendered as rich Markdown with syntax-highlighted code blocks, LaTeX math (via KaTeX), Mermaid diagrams, tables, and all standard formatting.
Thinking Blocks
When the model uses extended reasoning (thinking mode), thinking blocks appear as collapsible sections prefixed with a thought-bubble indicator. Click to expand and view the agent's internal reasoning chain. By default, thinking blocks are collapsed to reduce visual noise.
Tool Execution Cards
Each tool invocation produces a collapsible card in the message thread. The card header shows:
- The tool name (for example, "bash", "read_file", "web_search").
- An icon representing the tool type.
- A status indicator: a spinning animation during execution, "OK" (green) on success, or "ERR" (red) on failure.
Expanding the card reveals the full input parameters the agent sent to the tool and the complete output the tool returned. For long outputs, the content is scrollable within the card.
Role Dividers
In multi-agent team tasks, a horizontal divider line appears when execution passes from one role to the next. The divider shows the role name and includes a + Memory button that lets you save the agent's work from that role as a persistent memory entry.
Routing Info Badges
When smart routing is active, small badges appear on messages indicating the routing decision, such as "via claude-sonnet-4-20250514 tier" or "routed: Standard". This helps you understand which model was selected for each turn.
Error Messages
Errors from tool execution, API failures, or provider issues appear as red-text blocks within the thread. Error messages include the error type and a description to help you diagnose and resolve the issue.
Load Earlier Button
For tasks with very long histories that exceed the display window, a "Load Earlier" button appears at the top of the thread. Click it to fetch and display older messages from the database.
Activity Indicator
While a task is executing, a fixed bar appears just above the input area. This activity indicator shows:
- A spinner animation confirming that work is in progress.
- Live operation text describing exactly what the agent is doing. Examples:
  - The name of the tool currently being executed and its progress.
  - A preview of the agent's reasoning when generating text.
  - Text generation progress (for example, token count or word count).
  - For parallel agents: a count of files created, such as "8 agents generating files... 4 file(s) created by agents".
The activity indicator updates in real-time as the agent moves between operations.
Follow-up Suggestions
After a task completes, a collapsible suggestion bar appears below the last message. It features:
- A question mark icon with a count badge showing how many suggestions are available.
- Clickable suggestion chips that represent contextually relevant follow-up questions or next steps.
- Clicking a chip inserts its text into the input textarea, ready for you to modify or send immediately.
The suggestions are generated automatically based on the conversation context and the agent's output.
Response Actions
When a task reaches a terminal state (completed, failed, or interrupted), several response actions become available:
- Copy Conversation: Copies the entire conversation thread as formatted text to your clipboard.
- Per-code-block Copy: Each code block in the response includes a small Copy button in its top-right corner that copies just that code snippet.
- Per-diagram Save as PNG: Mermaid diagrams rendered in the thread include a "Save as PNG" button that exports the diagram as an image file.
- Per-table Copy as CSV: Tables rendered in the response include a button that copies the table data as comma-separated values.
Scroll to Bottom Button
When you scroll up in a long message thread, a circular down-arrow button appears in the bottom-right corner. Click it to instantly scroll to the most recent message.
Files Created Button
When the agent creates files during task execution, a "Files Created" button appears in the task header area. It displays a count badge (for example, "(7)") showing how many files were produced. Clicking it opens the FileBrowser overlay, a file explorer view that lists all generated files with options to view, download, or open them.
Right Sidebar / Context Panel
The Context Panel is a collapsible right sidebar that provides detailed information about the current task. Toggle it with the hamburger icon in the task header. The panel can be resized by dragging its left edge.
Workflow Progress
For multi-agent team tasks, the Context Panel shows a "Workflow" header with a progress badge (for example, "3/6"). Below it, a vertical stepper displays each agent or role in the workflow with status dots:
- Pending (gray dot): The role has not yet started.
- Active (pulsing blue dot): The role is currently executing.
- Done (green dot): The role has completed its work.
Roles running on remote deployments show a "Remote" badge next to their name.
Parallel Agents
When three or more agents are running in parallel (via fan-out), the Context Panel shows a dedicated Parallel Agents section. It displays a count such as "5/8 completed" and lists a card for each agent with its index number, current status, and the number of tool calls it has made.
Thinking Content
When the model is using extended reasoning, the Context Panel includes a collapsible Thinking Content section that shows the raw reasoning text. A maximize/minimize toggle lets you expand it to fill the panel for easier reading.
Usage Stats
The bottom of the Context Panel displays real-time usage statistics for the current task:
| Metric | Description |
|---|---|
| Context Usage | A color-coded progress bar showing what percentage of the model's context window has been consumed. The bar transitions from green to yellow at 50% and to red at 80% or above. |
| Input Tokens | The number of input tokens sent to the model, formatted with K/M suffixes for readability (for example, "12.3K"). |
| Output Tokens | The number of output tokens generated by the model. |
| Cache Read | The number of tokens served from the provider's prompt cache, if available. |
| Total Cost | The cumulative dollar cost for this task, calculated using the model pricing table. |
| Latency | The time-to-first-token latency in milliseconds for the most recent API call. |
| Model | The name of the model being used for this task. |
| Provider | The name of the provider serving the model. |
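The color thresholds and token formatting described in the table can be expressed in a few lines. The sketch below follows the stated rules (yellow from 50%, red from 80%, K/M suffixes); the function names and the one-decimal rounding are assumptions based on the "12.3K" example.

```python
def usage_bar_color(percent_used: float) -> str:
    """Green below 50%, yellow from 50% up to 80%, red at 80% or above."""
    if percent_used >= 80:
        return "red"
    if percent_used >= 50:
        return "yellow"
    return "green"


def format_token_count(tokens: int) -> str:
    """Abbreviate a token count with a K or M suffix, one decimal place."""
    if tokens >= 1_000_000:
        return f"{tokens / 1_000_000:.1f}M"
    if tokens >= 1_000:
        return f"{tokens / 1_000:.1f}K"
    return str(tokens)
```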
Visualizations in the Message Thread
The message thread renders rich content beyond plain text:
Mermaid Diagrams
The agent can output Mermaid code blocks that render as interactive SVG diagrams directly in the chat. Supported diagram types include flowcharts, sequence diagrams, class diagrams, pie charts, timelines, Gantt charts, ER diagrams, state diagrams, and journey maps. Each diagram includes a "Save as PNG" button for export.
Charts
The chart_generate tool produces PNG chart images that are embedded in the message thread. Six chart types are supported: bar (vertical), line, pie, scatter, histogram, and area. All charts use a dark theme that matches the application UI.
Code Blocks
Code blocks in responses are syntax-highlighted using language detection. Each code block has a copy button in its top-right corner. Supported languages include Python, JavaScript, TypeScript, Rust, Go, Java, C++, HTML, CSS, SQL, YAML, JSON, Bash, and many more.
Math
Mathematical expressions are rendered using KaTeX. Inline math uses $...$ delimiters and display math uses $$...$$ blocks.
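As an illustration of the two delimiter styles, a response containing the following text would render one inline formula and one centered display formula. The formulas themselves are arbitrary examples.

```latex
The mean squared error is $L = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$ for $n$ samples.

$$
\int_0^1 x^2 \, dx = \frac{1}{3}
$$
```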
Tables
Markdown tables are rendered as styled HTML tables with alternating row colors. Each table includes a "Copy as CSV" button for easy export to spreadsheets.
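Conceptually, a "Copy as CSV" action turns each table row into a comma-separated record. The following is a minimal sketch of that conversion using Python's csv module; it is not the application's implementation, and it assumes a simple table without escaped pipe characters inside cells.

```python
import csv
import io


def markdown_table_to_csv(markdown: str) -> str:
    """Convert a simple pipe-delimited Markdown table to CSV text."""
    rows = []
    for line in markdown.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Skip the separator row such as |---|---|.
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue
        rows.append(cells)
    buffer = io.StringIO()
    # The csv writer quotes any cell that itself contains a comma.
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


table = """
| Metric | Value |
|---|---|
| Input Tokens | 12.3K |
| Note | a, b |
"""
csv_text = markdown_table_to_csv(table)
```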
Exporting Conversations
Click Export in the task header bar on any completed task. The conversation opens as a styled HTML page in a new window. Use your browser's Print > Save as PDF to create a shareable document. The exported page includes all text, code blocks, tool call summaries, and images.
Running Multiple Tasks Simultaneously
Multiple tasks can run at the same time, each potentially using a different model. Each task gets its own independent cancel token and event stream. The model that was active when a task was launched is captured for that task's duration, so switching models does not affect already-running tasks. Monitor all running tasks from the Dashboard where their status updates in real-time.
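The isolation described above can be pictured as each task carrying its own copy of the model name plus its own cancel flag. The sketch below uses hypothetical class names and a threading.Event as a stand-in for the cancel token; it does not reflect the application's internal types.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class RunningTask:
    prompt: str
    model: str  # captured at launch and never changed afterwards
    cancel_token: threading.Event = field(default_factory=threading.Event)


class Workspace:
    """Hypothetical container for the active model and launched tasks."""

    def __init__(self, active_model: str) -> None:
        self.active_model = active_model
        self.tasks = []

    def launch(self, prompt: str) -> RunningTask:
        # Copy the model name now so later switches do not affect this task.
        task = RunningTask(prompt=prompt, model=self.active_model)
        self.tasks.append(task)
        return task


ws = Workspace(active_model="model-a")
first = ws.launch("Summarize the report")
ws.active_model = "model-b"   # switch models while the first task is running
second = ws.launch("Draft the email")
first.cancel_token.set()      # cancel only the first task
```

Because every task owns its token, cancelling one task leaves the others untouched.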
Project Management
Projects let you organize related tasks into groups, scope memory to specific codebases, and keep your workspace tidy.
Creating a Project
Click the + button next to the "Projects" header in the left sidebar. A dialog appears where you enter a project name and optionally select a directory path on disk. The directory path enables project-scoped memory and AURA.md discovery.
Associating Tasks with Projects
Drag and drop any task from the task list onto a project name in the sidebar to associate it. You can also select a project from the project selector in the task composer before creating a new task.
Task Count Badges
Each project in the sidebar displays a badge with the count of tasks associated with it. This count updates in real-time as tasks are created, completed, or deleted.
Project Actions
- Rename (pencil icon): Click the pencil icon next to a project name to rename it.
- Delete: Right-click or use the context menu to delete a project. Deleting a project does not delete the tasks associated with it; they become unassociated.
Multi-Select and Bulk Actions
Hold Ctrl (or Cmd on macOS) and click to select multiple tasks. Checkboxes appear for each task when in multi-select mode. A bulk action bar appears at the top of the task list with options to move selected tasks to a project, delete them, or resume interrupted tasks in batch.
Task Classification & Smart Routing
Every message you send is automatically analyzed by a fast one-shot LLM call that classifies it into a category. This classification happens transparently and requires no manual mode switching.
Classification Categories
| Category | What Happens |
|---|---|
| Single Agent Task | A single agent runs with access to all configured tools. This is the default for most requests that one agent can handle end-to-end. |
| Team Task | The system routes to a multi-agent team. If a matching team already exists, it is selected automatically. If not, the system creates a new team with appropriate roles on the fly. |
| Scheduled Task | If the message describes a recurring task (for example, "every Monday at 9am"), the system creates a schedule automatically. |
| Clarification Needed | The agent asks follow-up questions to gather more information before proceeding. |
Smart Routing
When smart routing is enabled (in Settings), the classifier also evaluates the complexity of your request and routes it to the most cost-effective model tier that can handle it. This saves money on simple requests while preserving quality for complex ones. The routing tiers are configurable (see Smart Routing under Models).
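One way to picture tiered routing is a table of complexity ceilings, where the first tier that can handle the request wins. Everything in this sketch is hypothetical: the complexity scale, the tier names other than "Standard", and the model names are placeholders, not Aura Workshop defaults.

```python
# Hypothetical tiers: (maximum complexity score, tier name, model name).
# Neither the scores nor the model names come from Aura Workshop.
ROUTING_TIERS = [
    (3, "Economy", "small-model"),
    (7, "Standard", "mid-model"),
    (10, "Premium", "large-model"),
]


def route_by_complexity(score: int) -> tuple:
    """Pick the cheapest tier whose ceiling covers the complexity score."""
    for ceiling, tier, model in ROUTING_TIERS:
        if score <= ceiling:
            return tier, model
    # Scores above every ceiling fall through to the most capable tier.
    _, tier, model = ROUTING_TIERS[-1]
    return tier, model
```

Ordering the tiers from cheapest to most capable is what makes the first match the most cost-effective choice.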
Multi-Agent Teams
Teams let you assign complex tasks to a group of specialized agents that work together, each contributing their domain expertise. Teams are configured in Settings, and tasks are routed to teams automatically by the classifier or manually by selecting a team.
Team Configuration
Go to Settings > Teams to manage teams. The team form includes:
- Team name: A descriptive label for the team.
- Member list: Add roles from the Role Library or create new custom roles. Each member can be added or removed with a single click.
- Role assignments: Each member has a specific role with its own system prompt, tool permissions, and optionally a dedicated model.
- Workflow type: Choose how the team members collaborate:
  - Sequence: Roles execute one after another. Each role receives the output from the previous role as context.
  - Parallel: All roles execute simultaneously. Results are merged at the end.
  - Fanout/Merge: A source role produces a list of sub-tasks, which are distributed to parallel worker roles, then a final role merges the results.
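The three workflow types differ only in how work is passed between roles. The sketch below models each role as a plain Python function and uses a thread pool for concurrency; it is a conceptual illustration under those assumptions, not the scheduler the application uses.

```python
from concurrent.futures import ThreadPoolExecutor


def run_sequence(roles, task):
    """Each role receives the previous role's output as its input."""
    output = task
    for role in roles:
        output = role(output)
    return output


def run_parallel(roles, task):
    """All roles receive the same task; results are collected in role order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda role: role(task), roles))


def run_fanout_merge(source, worker, merge, task):
    """Source lists sub-tasks, workers handle them in parallel, merge combines."""
    subtasks = source(task)
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(worker, subtasks))
    return merge(results)


# Plain functions stand in for agents in this demo.
outline = lambda text: text + " -> outlined"
draft = lambda text: text + " -> drafted"
sequence_result = run_sequence([outline, draft], "brief")
parallel_result = run_parallel([str.upper, str.title], "hello world")
fanout_result = run_fanout_merge(lambda t: t.split(","), str.strip, " | ".join, "a, b ,c")
```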
Team Task UI
When a team task runs, the message thread shows role dividers between each agent's section. The Context Panel displays the workflow progress stepper showing which role is active. Role handoffs are visible in the thread: each role calls the role_complete tool to pass structured handoff data (summary, files created, key decisions, constraints) to the next role.
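The handoff data can be thought of as a small structured record. The sketch below models the four items named above as a dataclass and serializes it to JSON; the field names and the JSON encoding are assumptions for illustration, since the exact role_complete schema is not documented here.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RoleHandoff:
    """Hypothetical shape of the data a role passes to the next role."""
    summary: str
    files_created: list = field(default_factory=list)
    key_decisions: list = field(default_factory=list)
    constraints: list = field(default_factory=list)


handoff = RoleHandoff(
    summary="Defined the service boundaries and data model.",
    files_created=["docs/architecture.md"],
    key_decisions=["Use PostgreSQL for persistence"],
    constraints=["Public API must stay backward compatible"],
)
payload = json.dumps(asdict(handoff), indent=2)
```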
Built-in Roles
Aura Workshop ships with 20 built-in roles organized into six categories. Each role has a pre-written system prompt and a curated set of tool permissions.
Role Library (20 Built-in Roles)
| Category | Roles |
|---|---|
| Software | Product Manager, Architect, Developer, QA Engineer, DevOps |
| Content | Research Lead, Writer, Editor |
| Business | Business Analyst, Marketing Strategist, Sales Copywriter |
| Data | Data Engineer, Data Analyst, Data Scientist |
| Design | UX Designer, UI Designer |
| Operations | Project Coordinator, Technical Writer, Security Auditor, Code Reviewer |
Per-Role Model Selection
Each role in a team can be assigned a specific model. This allows you to use a smaller, cheaper model for routine roles (such as initial research) while reserving a more powerful model for critical roles (such as architecture decisions). If no per-role model is set, the team uses the globally selected model.
Custom Roles
Create, edit, duplicate, or delete roles from Settings > Roles & Prompts. Each role has a name, description, system prompt, and a set of allowed tools (toggled via checkboxes). Roles are saved as markdown files with YAML frontmatter in ~/.aura/roles/.
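To show what a Markdown file with YAML frontmatter looks like, the following sketch renders one from a few values. The frontmatter keys (name, description, tools) are assumed for illustration and may not match the keys Aura Workshop writes; values containing colons or other YAML-special characters would also need quoting, which this sketch does not handle.

```python
def render_role_file(name: str, description: str, tools: list, prompt: str) -> str:
    """Build Markdown text with a YAML frontmatter header for a role."""
    lines = ["---", f"name: {name}", f"description: {description}", "tools:"]
    lines += [f"  - {tool}" for tool in tools]
    # The system prompt forms the Markdown body after the closing delimiter.
    lines += ["---", "", prompt.strip(), ""]
    return "\n".join(lines)


role_text = render_role_file(
    name="Release Notes Writer",
    description="Summarizes merged changes into release notes",
    tools=["read_file", "grep"],
    prompt="You write concise, accurate release notes.",
)
```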
Fan-Out (Parallel Agents)
When the classifier detects that a task contains independent sub-tasks, it uses fan-out to run multiple agents in parallel. For example, "Compare the top 5 vector databases" spawns five parallel research agents, one per database, then merges their results into a unified comparison.
| Team Type | Source Role Produces | Fan-Out Role Does |
|---|---|---|
| Software Dev | Architect lists implementation tasks | One developer per task |
| Content Writing | Research Lead lists sections | One writer per section |
| Research | Lead lists research questions | One researcher per question |
| Translation | Manager lists target languages | One translator per language |
Parallel results are merged using configurable strategies: CopyOnWrite (safe, no data loss), LastWins (last agent's version prevails on conflict), or LLMResolve (an LLM intelligently combines conflicting changes).
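The fan-out and merge steps can be pictured with a small sketch. It runs stand-in workers in parallel and merges their outputs with a LastWins-style rule. The research function, the file paths, and the example sub-tasks are hypothetical, and this is not the platform's actual merge implementation; CopyOnWrite and LLMResolve are not shown.

```python
from concurrent.futures import ThreadPoolExecutor


def research(database: str) -> dict[str, str]:
    """Stand-in for one fan-out agent; a real agent would call a model."""
    return {
        f"notes/{database}.md": f"Findings for {database}",
        "summary.md": f"Summary written by the {database} agent",
    }


def merge_last_wins(results: list[dict[str, str]]) -> dict[str, str]:
    """On a conflicting path, the later result overwrites the earlier one."""
    merged: dict[str, str] = {}
    for result in results:
        merged.update(result)
    return merged


subtasks = ["pgvector", "Qdrant", "Milvus"]
with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
    # map() preserves input order, so "last" means last in the sub-task list.
    results = list(pool.map(research, subtasks))

merged = merge_last_wins(results)
```

In this sketch every worker writes summary.md, so only the last worker's version survives, which is the trade-off LastWins accepts in exchange for simplicity.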
Agent Tools Reference
Every agent has access to a rich set of built-in tools. The available tools depend on role configuration, execution mode, and any connected MCP servers. The full tool registry is defined in tools/mod.rs.
Core Tools (Always Available)
| Tool | Description |
|---|---|
| read_file | Read the contents of a file at a given path. Returns the full text of the file. Supports all text-based formats and can handle binary files by returning base64-encoded content. |
| write_file | Create a new file or overwrite an existing file with the specified content. Automatically creates parent directories if they do not exist. |
| edit_file | Make targeted edits to an existing file using find-and-replace operations. Supports multiple replacements in a single call, making it efficient for surgical modifications without rewriting entire files. |
| bash | Execute shell commands on the host operating system. Uses sh -c on macOS and Linux or cmd /C on Windows. Supports setting the working directory, timeout, and environment variables. Subject to safety guardrails that block destructive commands. |
| glob | Find files matching a glob pattern such as **/*.ts or src/**/*.py. Returns a list of matching file paths, useful for discovering project structure. |
| grep | Search file contents using regular expressions. Returns matching lines with file paths and line numbers, similar to the grep command but with structured output for agent consumption. |
| list_dir | List the contents of a directory, returning file names, sizes, types (file or directory), and modification times. |
| web_fetch | Fetch content from a URL via HTTP GET. Returns the response body as text. Useful for reading web pages, downloading data, and interacting with web APIs. |
| web_search | Search the web using the configured search provider (DuckDuckGo by default, or Google, Brave, Serper, Bing). Returns text results and image URLs. |
| email_send | Send an email using the configured email method (system default or SMTP). Supports HTML content, subject line, recipients (to, cc, bcc), and attachments. |
| chart_generate | Generate data visualizations as PNG images. Supports bar, line, pie, scatter, histogram, and area chart types with a dark theme. Ideal for creating reports and visual summaries. |
| generate_image | Generate images using AI image generation APIs. The agent provides a text prompt and receives a generated image. |
| generate_video | Generate video content using AI video generation APIs. Supports various durations and styles depending on the configured provider. |
| generate_music | Generate audio and music using AI music generation APIs. Supports various genres, moods, and durations. |
| run_chain | Execute a chain script, which is a predefined sequence of tool calls defined in a skill file. This meta-tool enables complex multi-step workflows to be packaged as reusable recipes. |
Docker Tools
| Tool | Description |
|---|---|
| docker_run | Run a command inside a Docker container. Supports image selection, volume mounts, port mapping, and environment variables. Useful for running tasks in isolated environments. |
| docker_list | List all running Docker containers on the host, showing container ID, image, status, and port mappings. |
| docker_images | List all Docker images available on the host, showing repository, tag, image ID, and size. |
Platform Tools
| Tool | Description |
|---|---|
| system_notify | Send a native desktop notification with a title and message body. Works on macOS, Windows, and Linux. |
| screen_capture | Capture a screenshot of the current display. On macOS, uses the built-in screencapture utility. Returns the image as a file path or base64 data. |
| camera_capture | Capture an image from the device's webcam. Currently supported on macOS. Returns the captured image for analysis or inclusion in responses. |
| create_listener | Programmatically create a new event listener for any of the 31 supported messaging platforms. The agent can set up real-time monitoring without manual configuration. |
| create_schedule | Programmatically create a new scheduled task with a specified time, frequency, and prompt. |
| create_webhook | Programmatically create a new webhook endpoint that triggers agent tasks on incoming HTTP requests. |
| send_slack | Send a message to a Slack channel or user. Requires a Slack bot token. |
| send_email | Send an email message (distinct from email_send in that this tool is specifically for platform-level messaging). |
| ocr | Extract text from images using optical character recognition. Combines a vision model with text extraction for high-accuracy results on documents, screenshots, and photos. |
| merge_pdfs | Merge multiple PDF files into a single document. Accepts a list of file paths and produces a combined PDF. |
| security_scan | Run a security vulnerability scan on code or configurations. Reports potential issues with severity levels and remediation suggestions. |
| role_complete | Signal that the current role has finished its work in a multi-agent team workflow. Includes structured handoff data: summary, files created, key decisions, and constraints for the next role. This tool is only available within team workflow contexts. |
MCP Tools
Any tools provided by connected MCP servers are automatically available to agents. MCP tools are named using the pattern mcp_{server_id}_{tool_name}, making them easy to identify. MCP tool results are automatically truncated to 8000 characters to prevent context overflow. MCP tools appear in the Tools picker in the task composer, grouped by their server name.
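The naming pattern and the truncation limit above can be sketched in a few lines. The helper names and the example server and tool IDs are hypothetical; only the mcp_{server_id}_{tool_name} pattern and the 8000-character limit come from the text.

```python
MAX_RESULT_CHARS = 8000  # truncation limit stated above


def mcp_tool_name(server_id: str, tool_name: str) -> str:
    """Build the agent-facing name for a tool exposed by an MCP server."""
    return f"mcp_{server_id}_{tool_name}"


def truncate_result(text: str, limit: int = MAX_RESULT_CHARS) -> str:
    """Cut an oversized tool result down to the character limit."""
    return text if len(text) <= limit else text[:limit]


name = mcp_tool_name("github", "create_issue")  # hypothetical server and tool
short = truncate_result("x" * 20_000)
```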
Command Safety
The agent middleware includes non-bypassable safety guardrails that block dangerous shell commands. These guardrails are always active regardless of settings.
Blocked Patterns
- Recursive deletion of the root filesystem: rm -rf /
- Fork bombs: :(){ :|:& };:
- Disk formatting: mkfs, dd targeting block devices
- Other destructive patterns targeting system-critical paths or operations
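To make the idea of pattern-based blocking concrete, here is a minimal sketch of a command check built on regular expressions. The patterns are simplified assumptions written for this example and do not reproduce the guardrails in the agent middleware; a production filter would need to handle quoting, aliases, and many more variants.

```python
import re

# Illustrative patterns only; the real guardrail list is not documented here.
BLOCKED_PATTERNS = [
    re.compile(r"\brm\s+-(rf|fr)\s+/(\s|$)"),      # recursive delete of the root
    re.compile(r":\(\)\s*\{\s*:\|:&\s*\};:"),      # classic shell fork bomb
    re.compile(r"\bmkfs(\.\w+)?\b"),               # filesystem formatting
    re.compile(r"\bdd\b.*\bof=/dev/"),             # dd writing to a block device
]


def is_blocked(command: str) -> bool:
    """Return True if the command matches any blocked pattern."""
    return any(pattern.search(command) for pattern in BLOCKED_PATTERNS)
```

With these patterns, `rm -rf /` is rejected while `rm -rf ./build` passes, because only a bare root path matches.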
Command Timeout
Shell commands have a default timeout of 60 seconds, with a maximum configurable timeout of 300 seconds. Commands exceeding the timeout are terminated automatically.
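The timeout behavior can be approximated as follows. The function names are hypothetical, and how Aura Workshop terminates a process internally is not described here; the sketch relies on the standard library's subprocess timeout, with the 60-second default and 300-second cap taken from the text.

```python
import subprocess
import sys

DEFAULT_TIMEOUT = 60   # seconds, default stated above
MAX_TIMEOUT = 300      # seconds, maximum stated above


def effective_timeout(requested=None):
    """Use the default when nothing is requested; never exceed the cap."""
    if requested is None:
        return DEFAULT_TIMEOUT
    return min(requested, MAX_TIMEOUT)


def run_with_timeout(argv, requested=None):
    """Run a command; return (exit_code, stdout), or (None, "") on timeout."""
    try:
        done = subprocess.run(
            argv, capture_output=True, text=True,
            timeout=effective_timeout(requested),
        )
        return done.returncode, done.stdout
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return None, ""


code, out = run_with_timeout([sys.executable, "-c", "print('ok')"])
```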
Elevated Bash
The elevatedBash setting in General Settings allows agent commands to use sudo. This does not bypass the safety guardrails; dangerous patterns are still blocked even with elevated privileges.
Concurrency
The max_concurrent_tools setting (default: 4) controls how many tools can execute simultaneously within a single task. This prevents resource exhaustion when the agent attempts many parallel tool calls.
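A common way to enforce a limit like this is a semaphore around each tool call. The sketch below shows that technique with asyncio; it is an assumption about how such a limit could work, not a description of the actual scheduler, and fake_tool is a placeholder for real tool execution.

```python
import asyncio

MAX_CONCURRENT_TOOLS = 4  # default stated above


async def run_tools(calls, limit=MAX_CONCURRENT_TOOLS):
    """Run all calls, allowing at most `limit` to execute at the same time."""
    semaphore = asyncio.Semaphore(limit)
    running = 0
    peak = 0  # highest number of calls observed running at once

    async def guarded(call):
        nonlocal running, peak
        async with semaphore:
            running += 1
            peak = max(peak, running)
            try:
                return await call()
            finally:
                running -= 1

    results = await asyncio.gather(*(guarded(c) for c in calls))
    return results, peak


async def fake_tool():
    await asyncio.sleep(0.01)  # placeholder for real tool work
    return "done"


results, peak = asyncio.run(run_tools([fake_tool] * 10))
```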
Skills Overview
Skills are specialized instruction sets and automation recipes that extend the agent's capabilities. Aura Workshop ships with a comprehensive skills library organized into multiple categories. Skills are managed in Settings > Skills, where you can view the list of installed skills (each showing name, category, and a prompt preview), edit existing skills, delete skills, or create new ones with the New Skill button.
Document Skills
Core document generation skills are bundled with every installation. These enable the agent to create professional documents programmatically:
| Skill | Description |
|---|---|
| pdf | Create PDF reports with headers, paragraphs, tables, charts, images, and custom styling. |
| docx | Create Microsoft Word documents using python-docx with full formatting, styles, headers, footers, and tables. |
| xlsx | Create Excel spreadsheets using openpyxl with formulas, charts, conditional formatting, and multiple sheets. |
| pptx | Create PowerPoint presentations using python-pptx with slide layouts, charts, images, and transitions. |
Anthropic Skills (16)
Advanced skills covering design, development, content creation, and tooling:
| Skill | Description |
|---|---|
| pdf | Advanced PDF generation with complex layouts |
| docx | Advanced Word document workflows |
| xlsx | Advanced spreadsheet operations |
| pptx | Advanced presentation creation |
| algorithmic-art | Generate algorithmic and generative art using code |
| brand-guidelines | Create comprehensive brand guideline documents with color palettes, typography, and usage rules |
| canvas-design | Design interactive canvas-based visualizations |
| doc-coauthoring | Collaborative document co-authoring workflows |
| frontend-design | Design and build frontend interfaces with modern frameworks |
| internal-comms | Draft internal communications, memos, and announcements |
| mcp-builder | Build custom MCP servers from scratch |
| skill-creator | Create new skills with proper structure and metadata |
| slack-gif-creator | Create animated GIFs optimized for Slack |
| theme-factory | Design and generate UI themes with consistent design tokens |
| web-artifacts-builder | Build interactive web artifacts (mini-apps, widgets, demos) |
| webapp-testing | Test web applications end-to-end with automated test suites |
Superpowers (14)
Meta-skills that enhance how the agent approaches complex tasks. These do not perform actions themselves but guide the agent's reasoning and workflow strategy:
| Skill | Description |
|---|---|
| brainstorming | Structured brainstorming and ideation frameworks |
| dispatching-parallel-agents | Coordinate multiple agents working in parallel on independent sub-tasks |
| executing-plans | Execute multi-step plans methodically, tracking progress and adapting to issues |
| finishing-a-development-branch | Complete and polish a feature branch: tests, linting, commit messages, PR preparation |
| receiving-code-review | Process and apply code review feedback systematically |
| requesting-code-review | Prepare and submit code for review with clear context and description |
| subagent-driven-development | Break complex tasks into sub-agent work units for parallel execution |
| systematic-debugging | Methodical debugging with hypothesis generation, testing, and root cause analysis |
| test-driven-development | Write tests first, then implement code to pass those tests |
| using-git-worktrees | Manage parallel git worktrees for concurrent feature development |
| using-superpowers | Meta-skill for combining multiple superpowers in a single workflow |
| verification-before-completion | Verify all work meets requirements before marking a task as complete |
| writing-plans | Create detailed, structured execution plans before beginning work |
| writing-skills | Author new skill definitions with proper structure and documentation |
Desktop App Skills (20)
Using the Accessibility API on macOS, the agent can interact with native desktop applications directly. Each application has a dedicated skill covering its specific UI elements, menus, and workflows:
| Application | Capabilities |
|---|---|
| Excel | Create/edit spreadsheets, formulas, formatting, charts |
| Word | Create/edit documents, styles, tables, images |
| PowerPoint | Create/edit presentations, slides, animations |
| Chrome | Navigate pages, interact with web content, capture screenshots |
| Finder | Navigate folders, manage files, organize documents |
| Outlook | Compose/read emails, manage calendar, contacts |
| Numbers | Create/edit Apple Numbers spreadsheets |
| Pages | Create/edit Apple Pages documents |
| Keynote | Create/edit Apple Keynote presentations |
| Slack | Send messages, navigate channels, manage workspace |
| Teams | Send messages, join meetings, manage teams |
| Zoom | Start/join meetings, manage settings |
| Mail | Compose/read emails in Apple Mail |
| Notion | Create/edit pages, databases, and blocks |
| Terminal | Execute commands, manage terminal sessions |
| VS Code | Edit files, navigate projects, run tasks |
| Figma | Create/edit designs, manage components |
| Acrobat | View/edit PDFs, annotations, form filling |
| Salesforce | Navigate records, create/edit objects, run reports |
| SAP | Navigate transactions, enter data, run reports |
Design Skills (5)
| Skill | Description |
|---|---|
| taste-skill | Evaluate and refine visual design quality and taste |
| impeccable | Produce pixel-perfect, polished design output |
| ui-ux-pro-max | Advanced UI/UX design guidance and best practices |
| design-audit | Audit designs for consistency, accessibility, and best practices |
| typography | Typography selection, pairing, and hierarchy guidance |
Platform Skills
| Skill | Description |
|---|---|
| credential-store | Manage encrypted credentials programmatically within agent workflows |
| document-analyzer | Analyze documents and extract structured information |
| browser-automation | Automate browser interactions via Playwright for testing and scraping |
| browser-act | Direct browser action control for real-time web interaction |
| browser-act-skill-forge | Create new browser automation skills from recorded actions |
| prd | Generate comprehensive product requirements documents |
| orchestration | Multi-agent orchestration patterns and coordination strategies |
| github | GitHub repository operations: PRs, issues, reviews, workflows |
| weasyprint | Generate high-fidelity PDFs from HTML using the WeasyPrint engine |
| excalidraw | Create Excalidraw diagrams and sketches programmatically |
| pandoc | Document format conversion between Markdown, HTML, DOCX, PDF, and more |
Media and Research Skills (Printing Press Collection)
Over 100 API tools for media generation, research, and data enrichment are available through the Printing Press skill collection. These cover a wide range of capabilities:
| Category | Examples |
|---|---|
| Image Generation | DALL-E, Stable Diffusion, Midjourney-compatible APIs, flux |
| Video Creation | Video generation from text prompts, video editing APIs |
| Music Synthesis | AI music generation, sound effects, audio processing |
| Web Scraping | Structured web data extraction, site crawling |
| Data Analysis | Statistical analysis tools, data transformation, visualization |
| Social Media | Twitter/X, LinkedIn, Instagram APIs for posting and analytics |
| Financial Data | Stock prices, market data, financial news, crypto markets |
| Weather | Current weather, forecasts, historical weather data |
| Translation | Multi-language translation APIs |
| Communication | SMS, push notifications, messaging APIs |
Role Skills
Role skills are specialized instruction sets that can be attached to specific roles. They extend a role's capabilities with domain-specific knowledge and behavioral guidelines. You can manage them via Settings > Skills, the REST API (/api/role-skills), or the agent's own tools (platform_create_skill, platform_list_skills). Each role skill contains:
- A name and category for organization
- A detailed prompt that provides domain expertise
- Optional chain scripts for automated multi-step workflows
- Metadata for search and discovery
Creating Custom Skills
You can create new skills in two ways:
- Through the UI: Go to Settings > Skills, click "New Skill", fill in the name, category, description, and prompt template.
- Through the skill-creator skill: Ask the agent to "create a skill for [your use case]" and it will use the skill-creator meta-skill to generate a properly structured skill definition.
Skills can include chain scripts, which are sequences of tool calls that run in order. The run_chain tool executes these scripts, enabling complex multi-step workflows to be packaged as reusable one-click recipes.
MCP Servers
Model Context Protocol (MCP) servers extend the agent's tool capabilities by connecting to external services and data sources.
Transport Types
| Transport | Description |
|---|---|
| stdio | Launches a local process and communicates via stdin/stdout. Used for most MCP servers that run as command-line programs (Node.js, Python, etc.). |
| HTTP | Connects to a remote MCP server via HTTP with Server-Sent Events for streaming. Used for hosted or cloud-based MCP services. |
Configuration (Settings > MCPs)
- Go to Settings > MCPs.
- Click Add MCP Server.
- Enter a descriptive name for the server.
- Select the transport type (stdio or HTTP).
- For stdio: enter the command to launch the server (for example, npx or python) and its arguments. You can also set environment variables that will be passed to the child process.
- For HTTP: enter the server URL. Optionally configure OAuth credentials (client_id and client_secret) or custom headers (Bearer token, API key, or other authentication headers).
- Select the isolation mode:
- shared: A single connection is used across all tasks. This is efficient but means state persists between tasks.
- per_task: Each task gets its own MCP server instance, providing complete state isolation between tasks.
- Click Connect to establish the connection. A status indicator shows whether the server is connected (green) or disconnected (red).
Tool Definition Caching
When an MCP server connects, Aura Workshop caches its tool definitions so that they appear instantly in the tool picker without re-querying the server on each task.
Import from JSON
You can import MCP server configurations from a JSON file. This is useful for sharing configurations across team members or machines. The import format follows the standard MCP configuration schema.
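As an illustration, the sketch below builds and sanity-checks a configuration in the mcpServers layout used by many MCP clients. Whether Aura Workshop accepts exactly these keys is an assumption; the server entry, the /tmp/projects path, and the validate helper are examples made up for this sketch.

```python
import json

# Assumed layout: a top-level "mcpServers" object keyed by server name.
config = {
    "mcpServers": {
        "filesystem": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp/projects"],
            "env": {},
        }
    }
}


def validate(cfg: dict) -> list[str]:
    """Return a list of problems; an empty list means the config looks usable."""
    problems = []
    for name, server in cfg.get("mcpServers", {}).items():
        if "command" not in server and "url" not in server:
            problems.append(f"{name}: needs 'command' (stdio) or 'url' (HTTP)")
    return problems


text = json.dumps(config, indent=2)  # contents of the file to share and import
```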
Auto-Seeded MCP Servers
Aura Workshop automatically configures certain MCP servers based on your installed tools and API keys:
| Server | Trigger | Description |
|---|---|---|
| Playwright Browser | Always available | Browser automation: navigate, click, type, screenshot, extract elements |
| Z.AI Vision | z.ai API key configured | 8 vision tools for image analysis, OCR, and visual understanding |
| Z.AI Zread | z.ai API key configured | Read and extract content from GitHub repos and documentation |
| MiniMax Coding Plan | MiniMax API key configured | Web search and image understanding for structured coding plans |
Schedules
Schedules let you define tasks that run automatically at specified times. Access the Schedules tab from the Schedulers icon in the left sidebar navigation.
Schedule List
The list view shows all configured schedules. Each row displays:
- Title of the schedule
- A preview of the prompt text
- Schedule type (daily, weekly, cron, or once)
- Next run time
- An enabled/disabled toggle switch
- Edit and Delete action buttons
Create / Edit Schedule Form
- Title: A descriptive name for the schedule.
- Prompt textarea: The task description that will be sent to the agent on each execution.
- Schedule type selector:
- Once: Runs a single time at the specified date and time.
- Daily: Runs every day at the specified time (HH:MM picker).
- Weekly: Runs on selected days of the week at the specified time. A day selector lets you check one or more days (Mon through Sun).
- Cron: Runs on a custom cron expression for advanced scheduling.
- Time (HH:MM): The time of day to execute, in 24-hour format.
- Date (for "once" type): A date picker for one-time schedules.
- Duration type:
- once: Execute a single time and then deactivate.
- repeat_until: Repeat until a specified end date.
- forever: Repeat indefinitely until manually disabled.
- Project path: Optionally associate the schedule with a project directory.
- Model selector: Choose a specific model for scheduled task execution.
- Role selector: Assign a role to the scheduled task.
- Target agent selector: Route the scheduled task to a specific remote agent deployment.
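To show how the daily and weekly types translate into a next run time, here is a small sketch. The function names are hypothetical and time zones are ignored; it only illustrates the scheduling arithmetic, not the scheduler's actual code.

```python
from datetime import datetime, time, timedelta


def next_daily_run(now: datetime, at: time) -> datetime:
    """Today at `at` if that is still ahead, otherwise tomorrow."""
    candidate = datetime.combine(now.date(), at)
    return candidate if candidate > now else candidate + timedelta(days=1)


def next_weekly_run(now: datetime, at: time, weekdays: set[int]) -> datetime:
    """weekdays uses Python's convention: Monday is 0, Sunday is 6."""
    for offset in range(8):  # today plus the following seven days
        candidate = datetime.combine(now.date() + timedelta(days=offset), at)
        if candidate > now and candidate.weekday() in weekdays:
            return candidate
    raise ValueError("weekdays must contain at least one day")


now = datetime(2025, 1, 6, 10, 0)  # a Monday, 10:00
daily = next_daily_run(now, time(9, 30))
weekly = next_weekly_run(now, time(9, 30), {0, 4})  # Mondays and Fridays
```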
Listeners (31 Platforms)
Listeners monitor external messaging platforms and trigger agent tasks based on incoming messages and events. Access the Listeners tab from the Listeners icon in the left sidebar.
Listener List
Each row in the list shows:
- Platform icon and name
- Workspace or channel name
- Status indicator (online with green dot, offline with gray dot)
- Last received message timestamp
- Enabled/disabled toggle
- Action buttons: View Logs, Edit, Delete
Supported Platforms (31)
| Category | Platforms |
|---|---|
| Messaging | WhatsApp, Telegram, Discord, Slack, Signal, Matrix, IRC, XMPP, Microsoft Teams, LINE, Facebook Messenger, WeChat, iMessage (macOS), Google Chat, Feishu/Lark |
| Email | Email (IMAP/SMTP), Gmail (API) |
| Social | Twitter/X, Mastodon, Bluesky, Reddit, Twitch, Nostr, Zalo |
| Collaboration | Zulip, Rocket.Chat, Mattermost, Nextcloud Talk, Synology Chat |
| Built-in | Webchat, Chatbot widget |
Create / Edit Listener Form
- Platform selector: Choose from the 31 supported platforms.
- Token input: Enter the authentication token or credentials for the selected platform.
- Trigger type:
- Mentions: Only respond when the bot is mentioned by name or tag.
- Keywords: Only respond to messages containing specific keywords.
- All: Respond to every incoming message.
- Keywords: A comma-separated list of trigger keywords (when using keyword trigger type).
- Custom prompt: Define what the agent should do when a message matches the trigger rules.
- Model selector: Choose a specific model for listener-triggered tasks.
Listener Detail View
Clicking "View Logs" on a listener opens the detail view, which shows:
- An event log table with columns: timestamp, incoming message, sender/user, agent response
- Memory association settings for the listener
- Connection status and diagnostics
Rules Engine
Each listener has a configurable rules engine that filters incoming messages:
- Sender blocklists: Ignore messages from specific senders or bots.
- Channel blocklists: Ignore messages from specific channels.
- Keyword triggers: Only process messages containing specific words or phrases.
- Mention requirements: Only process messages that mention the bot.
- Allowlists: Only process messages from specific approved senders.
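The filters above could combine along the lines of the following sketch. The Rules fields, the evaluation order (blocklists first, then allowlist, mention, and keywords), and the case-insensitive keyword match are all assumptions for illustration; the actual engine may evaluate rules differently.

```python
from dataclasses import dataclass, field


@dataclass
class Rules:
    sender_blocklist: set[str] = field(default_factory=set)
    channel_blocklist: set[str] = field(default_factory=set)
    sender_allowlist: set[str] = field(default_factory=set)  # empty: allow everyone
    keywords: list[str] = field(default_factory=list)        # empty: no keyword filter
    require_mention: bool = False


def should_process(rules: Rules, sender: str, channel: str,
                   text: str, mentions_bot: bool) -> bool:
    """Return True if the message passes every configured filter."""
    if sender in rules.sender_blocklist or channel in rules.channel_blocklist:
        return False
    if rules.sender_allowlist and sender not in rules.sender_allowlist:
        return False
    if rules.require_mention and not mentions_bot:
        return False
    if rules.keywords and not any(k.lower() in text.lower() for k in rules.keywords):
        return False
    return True


rules = Rules(sender_blocklist={"spam-bot"}, keywords=["deploy"], require_mention=True)
```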
Webhooks
Webhooks let you receive HTTP requests from external services and trigger agent tasks. Access the Webhooks tab from the Webhooks icon in the sidebar.
Webhook List
Each row shows:
- Webhook name
- Auto-generated endpoint URL (with a copy button)
- Last received timestamp
- Enabled/disabled toggle
- Action buttons: View Logs, Edit, Delete
Create / Edit Webhook Form
- Name: A descriptive name for the webhook.
- Custom prompt: A prompt template that defines what the agent should do when the webhook is triggered. Use {{payload}} to insert the request body.
- HMAC-SHA256 secret: An optional shared secret for request validation. A visibility toggle lets you show/hide the secret value.
- Model selector: Choose a specific model for webhook-triggered tasks.
- Role selector: Assign a role for the triggered task.
- Target agent: Route to a specific remote deployment.
- Project path: Associate with a project directory.
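The sketch below shows how a sender might sign a request body with HMAC-SHA256, how a receiver can verify it, and how {{payload}} substitution could work. The hex encoding of the signature, the helper names, and the sample secret are assumptions; this section does not specify which header carries the signature, so none is shown.

```python
import hashlib
import hmac
import json


def sign(secret: str, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()


def verify(secret: str, body: bytes, signature: str) -> bool:
    """Constant-time comparison avoids leaking where the signatures differ."""
    return hmac.compare_digest(sign(secret, body), signature)


def render_prompt(template: str, body: bytes) -> str:
    """Insert the request body wherever the template contains {{payload}}."""
    return template.replace("{{payload}}", body.decode("utf-8"))


secret = "example-shared-secret"  # placeholder value
body = json.dumps({"event": "push", "branch": "main"}).encode()
signature = sign(secret, body)
prompt = render_prompt("Summarize this event: {{payload}}", body)
```

The signature must be computed over the exact bytes that are sent; re-serializing the JSON on either side can change whitespace and break verification.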
Webhook Detail View
Shows the auto-generated webhook URL with a copy button, a table of recent invocation logs (timestamp, request body, response status), and a cURL example for testing.
Slash Commands
Custom slash commands let you define shortcut commands that trigger specific agent behaviors.
Slash Command List
Each row shows the command name (with / prefix), description, handler type, enabled/disabled toggle, and Edit/Delete buttons.
Create / Edit Slash Command Form
- Command name: The name of the command (without the / prefix).
- Description: A brief description shown when listing available commands.
- Prompt template: The prompt that is sent to the agent when the command is invoked.
- Handler type: How the command is executed (agent_task, webhook, etc.).
Workflows (DAG-Based Automation)
Workflows provide a visual, directed acyclic graph (DAG) based automation system for building complex multi-step pipelines. Available in Business and Enterprise tiers.
Visual Workflow Editor
The workflow editor is a canvas-based interface where you create and connect nodes to define execution flow. You can:
- Add nodes by clicking the "Add Node" button and selecting a node type.
- Connect nodes by drawing edges between ports.
- Configure each node by clicking it to open the node configuration sidebar.
- Delete nodes and edges with the Delete key or right-click context menu.
- Pan and zoom the canvas for large workflows.
11 Node Types
| Node Type | Description |
|---|---|
| agent_task | Run a single agent with a prompt and full tool access. Configure the prompt, model, role, and project path. |
| conditional | Branch execution based on conditions evaluated against data from previous nodes. Supports equality, comparison, and pattern matching. |
| delay | Pause execution for a specified duration (seconds, minutes, hours). |
| fan_out | Split work across multiple parallel agents. Define the split criteria and the number of parallel branches. |
| merge | Combine results from parallel branches. Configurable merge strategies: CopyOnWrite, LastWins, LLMResolve. |
| human_loop | Pause execution and wait for human approval. Displays a prompt and approve/reject buttons in the UI. |
| script | Execute a shell script or command. Useful for build steps, deployments, or data transformations. |
| team | Run a multi-agent team as a single workflow step. Select the team and provide the task prompt. |
| transform | Transform data between nodes using expressions. Map, filter, or reshape data flowing through the workflow. |
| validate | Validate data against a schema or set of rules before allowing execution to continue. |
| webhook | Send or receive an HTTP request as part of the workflow. Useful for triggering external services. |
Retry Policies
Each node can be configured with a retry policy that defines what happens when it fails:
- Exponential backoff: Retries with exponentially increasing delays.
- Linear backoff: Retries with a fixed delay between attempts.
- Static: Retries immediately with no delay.
- Maximum retry count is configurable per node.
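Following the descriptions above, the delay before each retry could be computed as in this sketch. The one-second base delay and the doubling factor are assumptions; the text only states that exponential delays grow, linear uses a fixed delay, and static retries immediately.

```python
def retry_delay(policy: str, attempt: int, base_seconds: float = 1.0) -> float:
    """Delay in seconds before retry number `attempt` (1 is the first retry)."""
    if policy == "exponential":
        return base_seconds * 2 ** (attempt - 1)  # 1s, 2s, 4s, 8s, ...
    if policy == "linear":
        return base_seconds                       # same delay every time
    if policy == "static":
        return 0.0                                # retry immediately
    raise ValueError(f"unknown retry policy: {policy}")


exponential = [retry_delay("exponential", n) for n in range(1, 5)]
linear = [retry_delay("linear", n) for n in range(1, 5)]
```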
Routing Ports
Nodes have routing ports that determine execution flow:
- pass / approve (success port): Execution continues on the success path.
- fail / reject (failure port): Execution routes to the failure handler path.
Workflow Features
- Approval gates: Human-in-the-loop approval via the human_loop node. The workflow pauses and waits until a human approves or rejects.
- Data passing: Nodes pass structured data to downstream nodes through the workflow context. Each node can read outputs from any predecessor.
- Auto-parallel detection: The workflow engine uses an LLM to automatically detect independent branches and runs them in parallel.
- Crash recovery: Workflow runs are checkpointed. If the application crashes mid-workflow, execution resumes from the last completed step.
- Pause/resume: Running workflows can be paused and resumed at any time.
- Import/Export: Workflows can be exported as JSON and imported into other Aura Workshop instances.
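To see why a DAG makes independent branches easy to run in parallel, the sketch below groups nodes into batches whose predecessors have all completed. It uses Python's graphlib purely as an illustration of the graph structure; the node names are hypothetical, and it does not reproduce the engine's LLM-based detection or its export format.

```python
from graphlib import TopologicalSorter

# Each key is a node; its value is the set of nodes it depends on.
workflow = {
    "plan": set(),
    "research_a": {"plan"},
    "research_b": {"plan"},
    "merge": {"research_a", "research_b"},
    "approve": {"merge"},
}


def parallel_batches(graph: dict[str, set[str]]) -> list[list[str]]:
    """Group nodes into batches; nodes in one batch do not depend on each other."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises CycleError if the graph is not acyclic
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = parallel_batches(workflow)
```

Here research_a and research_b land in the same batch because both depend only on plan, so they can run side by side before merge starts.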
Workflow Execution Tracking
Each workflow run is tracked with per-step execution status. The workflow run record includes:
- Run ID, workflow ID, and start time
- Overall status: running, completed, failed, paused, or waiting_approval
- Per-node status: pending, running, completed, failed, or skipped
- Data context: the accumulated data from all completed nodes
- Error details for any failed nodes
View workflow runs from Settings > Workflows or via the /api/workflow/runs/{run_id} endpoint.
Workflow Best Practices
- Start simple: begin with linear workflows (agent_task nodes in sequence) before adding branching and parallelism.
- Use human_loop nodes for critical decisions that require human judgment.
- Set appropriate retry policies: exponential backoff for API calls, static for script execution.
- Use transform nodes to reshape data between steps rather than asking agents to do data formatting.
- Test workflows with small inputs before scaling up.
- Export working workflows as JSON backups before making changes.
Cloud Providers (Direct API)
Aura Workshop connects directly to the following provider APIs. Each provider is shown as an expandable card on the Models page.
| Provider | API Base URL | Models |
|---|---|---|
| Anthropic | api.anthropic.com | Claude family (Opus, Sonnet, Haiku) for advanced reasoning, analysis, and code |
| OpenAI | api.openai.com | GPT series (GPT-4o, o3, o4-mini) for versatile general-purpose tasks |
| Google | generativelanguage.googleapis.com | Gemini multimodal models with large context windows |
| MiniMax | api.minimax.io | MiniMax models for text, voice, and video generation |
Provider Authentication Details
| Provider | Auth Type | Header | Free Tier |
|---|---|---|---|
| Anthropic | API key header | x-api-key: sk-ant-... | No |
| OpenAI | Bearer token | Authorization: Bearer sk-... | No |
| Google | Query parameter | ?key=AIza... | Yes (Flash, Flash-Lite) |
| MiniMax | Bearer token | Authorization: Bearer ... | No |
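The three authentication styles in the table differ only in where the key goes. The sketch below shows that placement for each provider; the auth_for helper and the placeholder keys are hypothetical, and providers may require additional headers that are not covered here.

```python
from urllib.parse import urlencode


def auth_for(provider: str, api_key: str, url: str) -> tuple[str, dict[str, str]]:
    """Return (url, headers) with the key placed where the provider expects it."""
    if provider == "anthropic":
        return url, {"x-api-key": api_key}                  # API key header
    if provider in ("openai", "minimax"):
        return url, {"Authorization": f"Bearer {api_key}"}  # Bearer token
    if provider == "google":
        separator = "&" if "?" in url else "?"              # query parameter
        return f"{url}{separator}{urlencode({'key': api_key})}", {}
    raise ValueError(f"unknown provider: {provider}")


url, headers = auth_for(
    "google", "AIzaPLACEHOLDER",
    "https://generativelanguage.googleapis.com/v1beta/models",
)
```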
Configuring a Cloud Provider
- Navigate to the Models page by clicking the Models icon in the sidebar.
- Click a provider card to expand it.
- Enter your API key in the key field.
- Select a model from the dropdown (pre-populated with available models for that provider).
- Adjust sampling parameters if needed (temperature, top_p, top_k, max_tokens).
- Click Save or Apply. The model immediately becomes available in the quick model switcher.
Aggregator Services
Aggregator services provide access to many models through a single API key. All use OpenAI-compatible APIs with Bearer token authentication.
| Aggregator | Highlights |
|---|---|
| OpenRouter | Unified API for 200+ models with automatic fallback. Many free model options available. |
| Together AI | Fast inference for open-source models: Llama, Qwen, DeepSeek, Mistral, Gemma. |
| Groq | Ultra-fast LPU inference with sub-second latency. Free tier with rate limits. |
| DeepSeek | High-performance reasoning and coding at low cost. |
| SiliconFlow | Cost-effective GPU cloud inference for Qwen and DeepSeek models. |
| Zhipu AI (z.ai) | GLM models with free tiers available. Auto-seeds MCP servers for vision and code reading. |
| Xiaomi / MiMo | Xiaomi's MiMo models for general tasks. |
| Moonshot / Kimi | Multilingual, long-context models optimized for Asian languages. |
| Mistral AI | European models strong in coding and multilingual tasks. |
Local Providers
Run models on your own hardware with zero API cost. Aura Workshop auto-detects local inference servers on standard ports.
| Provider | Default Port | Description |
|---|---|---|
| Aura AI | 8080 | Bundled inference engine (Go + llama.cpp) with Metal/CUDA/CPU support. |
| Ollama | 11434 | Popular local model runner with a simple pull-and-run workflow. |
| LM Studio | 1234 | Desktop app for running local models with an OpenAI-compatible API. |
| LocalAI | varies | Self-hosted AI inference with OpenAI-compatible endpoints. |
| vLLM | varies | High-throughput LLM serving with PagedAttention for production workloads. |
| TGI | varies | HuggingFace Text Generation Inference server. |
| SGLang | varies | Fast serving framework for large language models. |
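Detection on standard ports can be approximated with a plain TCP connection attempt, as in this sketch. It is an assumption about the approach, not the actual detection code, and an open port only shows that something is listening there, not which server it is.

```python
import socket

# Default ports from the table above; providers listed as "varies" are omitted.
KNOWN_PORTS = {"Aura AI": 8080, "Ollama": 11434, "LM Studio": 1234}


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 0.25) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or unreachable
        return False


def detect_local_providers() -> list[str]:
    """Names of providers whose default port currently accepts connections."""
    return [name for name, port in KNOWN_PORTS.items() if is_listening(port)]
```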
Custom Providers
Add any OpenAI-compatible endpoint as a custom provider. This allows you to connect to any inference service that speaks the OpenAI API format.
- Name: A human-readable label for the provider.
- Base URL: The API base URL (for example, https://my-vllm-server.com/v1).
- API key: Authentication key if required.
- API format: Select from OpenAI, Anthropic, Google, or OpenAI-compatible.
- Auth type: Bearer token, API key header, query parameter, or none.
- Test connection: Verify the endpoint is reachable before saving.
Aura AI (Built-in Local Inference)
Aura Workshop bundles the aura-inference engine (Go + llama.cpp) for running GGUF models locally. It supports Metal acceleration on macOS, CUDA on Linux and Windows, and CPU fallback. Zero API cost: everything runs entirely on your hardware.
Start / Stop Controls
On the Models page, the Aura AI section provides a Launch button to start the local inference server and a Stop button to shut it down. When running, the status indicator turns green.
Port Configuration
The default port is 8080. You can change it in the Aura AI settings panel. Make sure the chosen port is not in use by another application.
Model Selector
Click Scan to detect GGUF models in ~/.cache/huggingface/hub/ and ~/.cache/aura-inference/models/. Select a model from the dropdown. Both HuggingFace-cached and directly downloaded GGUF files are discovered.
Advanced Parameters
| Parameter | Description | Default |
|---|---|---|
| GPU Layers | Number of model layers offloaded to GPU. Set to -1 to offload all layers. | -1 |
| Context Size | Token context window size. | 4096 |
| Batch Size | Inference batch size for throughput optimization. | 512 |
| Flash Attention | Enable flash attention for faster inference on supported hardware. | Off |
| KV Cache Key Type | Data type for the key cache: q8_0 (quantized) or f16 (full precision). | q8_0 |
| KV Cache Value Type | Data type for the value cache: q8_0 or f16. | q8_0 |
| Thinking Mode | Enable extended reasoning/chain-of-thought for local models that support it. | Off |
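To get a feel for how Context Size and the KV cache types interact, the sketch below estimates KV cache memory. It assumes the llama.cpp block layout for q8_0 (32 8-bit values plus one 16-bit scale, so 34 bytes per 32 elements) and 2 bytes per element for f16. The model dimensions in the usage note are hypothetical.

```python
# Approximate storage cost per cached element for each cache type.
BYTES_PER_ELEMENT = {"f16": 2.0, "q8_0": 34 / 32}


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_size,
                   key_type="q8_0", value_type="q8_0"):
    """Estimate the memory used by the key and value caches together."""
    # Each of the two caches stores one vector per layer, KV head, and position.
    elements = n_layers * n_kv_heads * head_dim * context_size
    return elements * (BYTES_PER_ELEMENT[key_type] + BYTES_PER_ELEMENT[value_type])
```

For a hypothetical model with 32 layers, 8 KV heads, and a head dimension of 128 at the default context size of 4096, this gives 512 MiB with f16 caches and 272 MiB with q8_0 caches.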
HuggingFace Model Downloader
Download GGUF models directly from HuggingFace without leaving the app. Nine curated models are available with one-click download:
| Model | Size | Best For |
|---|---|---|
| Qwen3.5 9B | ~5.8 GB | Best balance of quality and speed for general tasks |
| Qwen3 Coder 8B | ~5 GB | Code generation and programming tasks |
| Qwen3 VL 8B | ~5 GB | Vision + language multimodal tasks (image analysis) |
| Llama 3.3 70B | ~42 GB | Top-tier quality for complex reasoning (requires significant RAM/VRAM) |
| Llama 3.1 8B | ~4.9 GB | Reliable tool calling and function execution |
| Mistral Small 24B | ~14 GB | Multilingual tasks with vision capabilities |
| Phi-4 14B | ~8.4 GB | Strong reasoning and math |
| Gemma 4 E4B | ~5 GB | Compact vision model for image understanding |
| Gemma 4 27B-A4B | ~16.8 GB | Mixture of Experts model with excellent efficiency |
Curated Model Table
Each model row in the table has:
- Download button: Start downloading the model.
- Cancel button: Appears during download to cancel the operation.
- Delete button: Remove a downloaded model from disk.
- Progress bar: Shows download progress with percentage.
- ETA and speed: Estimated time remaining and current download speed.
Custom Model Download
Enter any HuggingFace repo ID (for example, Owner/ModelName-GGUF) in the custom download input. The app auto-resolves GGUF repos and downloads the Q4_K_M quantization by default. Provide a HuggingFace token for gated models. Downloads are verified against SHA-2 checksums.
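If you want to check a downloaded file by hand, a checksum comparison can be sketched as follows. This assumes the published checksum is a SHA-256 hex digest; the function names are illustrative and not part of Aura Workshop.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so multi-gigabyte models never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Compare the file's digest with the expected hex string, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```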
Ollama Integration
If Ollama is running locally on http://localhost:11434, Aura Workshop auto-detects it and lists available models. No API key is required.
Pull Input
On the Models page under the Ollama section, enter a model name (for example, llama3.1) in the pull input field and click the Pull button. Ollama downloads the model and makes it available immediately.
List and Delete
The Ollama section shows a list of all locally available Ollama models. Each model has a Delete button to remove it.
Inference Cluster (LAN GPU Sharing)
The inference cluster feature enables distributed GPU inference across multiple machines on your local network. Combine the GPU resources of several computers to run larger models than any single machine could handle.
Discovery
Aura Workshop nodes broadcast their presence via UDP on port 18801 every 30 seconds. Nodes that have not broadcast for 90 seconds are considered expired and removed from the cluster view. LAN discovery can be toggled with the lan_discovery_enabled setting.
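The expiry rule can be pictured with a small registry that records the time of each broadcast and drops nodes that have been silent for 90 seconds. This is a minimal sketch of the bookkeeping only; the class and method names are hypothetical and the UDP transport is left out.

```python
BROADCAST_INTERVAL_S = 30  # how often a node announces itself
EXPIRY_S = 90              # silence after which a node is considered expired


class NodeRegistry:
    def __init__(self):
        self._last_seen = {}

    def record_broadcast(self, node_id, now):
        """Remember when a node was last heard from."""
        self._last_seen[node_id] = now

    def active_nodes(self, now):
        """Drop nodes silent for EXPIRY_S or longer and return the rest, sorted."""
        self._last_seen = {
            node: seen for node, seen in self._last_seen.items()
            if now - seen < EXPIRY_S
        }
        return sorted(self._last_seen)
```

With a 30-second interval and a 90-second expiry, a node has to miss three consecutive broadcasts before it disappears from the cluster view.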
Roles
- Master: Manages the cluster, coordinates model layer distribution across workers, and serves the unified inference API.
- Worker: Contributes GPU resources to the cluster via RPC on port 50052.
Worker Claiming
The master discovers workers via UDP broadcast. To claim a worker, the master sends an HTTP POST to the worker's /api/cluster/join endpoint. The worker stores the master's information and starts its RPC server. The claiming process works as follows:
- The master machine's Models page shows a "LAN Nodes" section listing all discovered workers on the network.
- Each discovered worker shows its hostname, IP address, available GPU resources, and current state.
- Click the Add button next to a discovered worker to claim it.
- The master sends a claim request to the worker. The worker accepts and starts its RPC server on port 50052.
- Once claimed, the worker appears in the "Claimed Workers" list.
- When you start inference, the master distributes model layers across all claimed workers based on their GPU capabilities.
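The documentation does not spell out how layers are divided. One plausible scheme, shown here purely as an illustration, splits layers in proportion to each worker's GPU memory and hands out leftover layers by largest remainder. The function name and the use of VRAM as the capability measure are assumptions.

```python
def distribute_layers(total_layers, vram_gb_by_worker):
    """Split layers across workers in proportion to their GPU memory."""
    total_vram = sum(vram_gb_by_worker.values())
    if total_vram <= 0:
        raise ValueError("at least one worker must report GPU memory")
    # Ideal fractional share for each worker.
    exact = {w: total_layers * v / total_vram for w, v in vram_gb_by_worker.items()}
    # Start from the rounded-down share.
    assigned = {w: int(share) for w, share in exact.items()}
    leftover = total_layers - sum(assigned.values())
    # Give the remaining layers to the workers with the largest remainders.
    by_remainder = sorted(exact, key=lambda w: exact[w] - assigned[w], reverse=True)
    for worker in by_remainder[:leftover]:
        assigned[worker] += 1
    return assigned
```

For example, 10 layers over workers with 2 GB and 1 GB would be split 7 and 3.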
Workshop Modes
Each Aura Workshop instance can operate in one of three cluster modes, configured in the Models page:
- Standalone (default): Single-machine inference. No cluster features. All model layers run on the local GPU.
- Cluster Manager: Acts as the inference master, managing multiple workers remotely. Coordinates model distribution and serves the unified inference API.
- GPU Contributor: Acts as an inference worker, contributing GPU resources to a master. The worker does not serve its own inference API.
Parameter Control
Cluster inference supports the same parameters as standalone Aura AI: quantization, context size, batch size, GPU layers, and flash attention. Parameters are set on the master and applied across the cluster.
Docker Cluster Deployment
# Master node
docker run -d --net=host --gpus all \
-e AURA_DAEMON_MODE=inference-master \
coolkoo/aura-workshop:daemon-latest \
--mode inference-master
# Worker node
docker run -d --net=host --gpus all \
-e AURA_DAEMON_MODE=inference-worker \
coolkoo/aura-workshop:daemon-latest \
--mode inference-worker
Use --net=host for UDP broadcast discovery and --gpus all for NVIDIA GPU access.
Quick Model Switching
Click the model name displayed in the top bar at any time to open a dropdown of all configured and available models. The dropdown groups models by provider for easy navigation. Selecting a model instantly switches the active model for all subsequent new tasks. Models from cloud providers, aggregators, local inference, and custom providers are all listed together. A search field at the top of the dropdown lets you filter models by name.
Model Auto-Detection
When you configure a provider's API key, Aura Workshop automatically queries the provider to discover available models. This means the model selector always shows current, accurate model options without manual configuration. For local providers (Ollama, Aura AI), clicking Scan refreshes the local model list.
Model Parameters
Each model can be configured with sampling parameters that affect output quality and style:
| Parameter | Default | Description |
|---|---|---|
| Temperature | 0.7 | Controls randomness. Lower values produce more focused output; higher values are more creative. |
| Top P | 0.8 | Nucleus sampling threshold. Sampling is restricted to the smallest set of most likely tokens whose cumulative probability reaches this threshold. |
| Top K | 20 | Only the top K most likely tokens are considered at each step. 0 disables Top K filtering. |
| Min P | 0.0 | Minimum probability threshold. Tokens below this probability are discarded. |
| Repeat Penalty | 1.0 | Penalty applied to repeated tokens. Values above 1.0 discourage repetition. |
| Max Tokens | 4096 | Maximum number of tokens the model can generate in a single response. |
| Thinking Level | Off | Extended reasoning depth: Off, Low, Medium, High. Only available on models that support thinking. |
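The following sketch shows how Top K, Top P, and Min P narrow the candidate tokens before one is drawn. It is an illustration only: the order in which the filters run is an assumption, and Min P is applied as the table words it, as an absolute cutoff (some engines, llama.cpp among them, scale it by the top token's probability instead).

```python
def filter_candidates(probs, top_k=20, top_p=0.8, min_p=0.0):
    """Return the renormalized probabilities of tokens that survive filtering."""
    if not probs:
        raise ValueError("probs must not be empty")
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    # Top K: keep only the K most likely tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]
    # Top P: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Min P: drop tokens below the cutoff, but always keep the top token.
    kept = [(t, p) for t, p in kept if p >= min_p] or kept[:1]
    total = sum(p for _, p in kept)
    return {t: p / total for t, p in kept}
```

Temperature and Repeat Penalty act on the raw scores before this stage and are not modeled here.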
Smart Routing
Smart routing automatically selects the most cost-effective model for each task based on its complexity.
Enable / Disable
Toggle smart routing in Settings > Routing. When disabled, all tasks use the globally selected model.
Tier Configuration
Four routing tiers are available, each with its own model selector and boundary score:
| Tier | Intended Use |
|---|---|
| Simple | Quick questions, translations, simple formatting. Routed to the cheapest model. |
| Standard | Moderate tasks: writing, analysis, single-step coding. |
| Complex | Multi-step tasks: architecture, debugging, research. Routed to a capable model. |
| Reasoning | Tasks requiring deep reasoning, math, or extended chain-of-thought. Routed to the most powerful model. |
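How a complexity score maps onto a tier is not documented here. One simple reading of "boundary score" is a set of ascending cutoffs, which the sketch below assumes. The score range and the default boundary values are hypothetical.

```python
import bisect

TIERS = ["Simple", "Standard", "Complex", "Reasoning"]


def classify(score, boundaries=(0.25, 0.5, 0.75)):
    """Map a complexity score to a tier using ascending boundary scores.

    A score equal to a boundary falls into the higher tier.
    """
    return TIERS[bisect.bisect_right(boundaries, score)]
```

A router built on this would then look up the model configured for the returned tier.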
Free Models Only Toggle
A toggle restricts routing to free models only (such as those available on OpenRouter or Groq free tier).
Analytics View
The routing analytics section shows detailed statistics about smart routing performance:
- Tasks per tier: Bar chart showing how many tasks were classified into each tier (Simple, Standard, Complex, Reasoning).
- Actual cost: The real cost of running all tasks with smart routing enabled.
- Hypothetical cost: What the cost would have been if all tasks used the most capable (most expensive) model.
- Total savings: The dollar amount saved by using smart routing. Calculated as hypothetical minus actual.
- Savings percentage: The percentage of costs saved relative to the hypothetical total.
The analytics update in real time as new tasks are routed. To clear the analytics data, use the billing reset button.
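The savings figures follow directly from the two cost totals. A minimal sketch of the arithmetic, with an illustrative function name:

```python
def routing_savings(actual_cost, hypothetical_cost):
    """Return (total savings, savings percentage) for smart routing."""
    savings = hypothetical_cost - actual_cost
    # Avoid dividing by zero when no tasks have been routed yet.
    percentage = savings / hypothetical_cost * 100 if hypothetical_cost > 0 else 0.0
    return savings, percentage
```

For instance, an actual cost of 2.50 against a hypothetical cost of 10.00 gives savings of 7.50, or 75%.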
Settings: General
- Native mode toggle: Switch between native execution (commands run directly on your OS) and Docker execution (commands run in isolated containers).
- Sampling parameters: Set default values for Temperature (0.7), Top P (0.8), Top K (20), and Max Tokens (4096) that apply to all new tasks.
- Thinking level: Set the default reasoning depth for models that support it: Off, Low, Medium, High.
- Execution isolation: Choose the isolation level for agent commands:
  - none: Commands run directly on the host (default for native mode).
  - sandbox: Commands run in a restricted process sandbox.
  - container: Commands run inside Docker containers.
- Execution backend: Where commands run: local, docker, ssh, or singularity (for HPC environments).
- Max concurrent tools: Maximum number of tools that can run simultaneously within a single task (default: 4).
- Elevated bash toggle: Allow agent bash commands to run with elevated privileges (sudo).
- Device capabilities: Toggle screen capture, camera capture, and system notifications on or off.
- Agent context: Toggle automatic injection of git status and recent commits into the system prompt.
Context Compression
Aura Workshop automatically compacts conversation history so that long-running tasks do not fail by exceeding the model's context window:
- Threshold: Compaction triggers when context usage reaches 70% of the active model's context window.
- LLM summarization: Older conversation history is summarized by the LLM into a condensed form.
- Pair preservation: tool_use and tool_result message pairs are always kept intact during compaction.
- Truncation fallback: If summarization fails, the system falls back to mechanical truncation of the oldest messages.
- Circuit breaker: After 3 consecutive compaction failures, the system switches to permanent truncation mode to ensure the task can continue.
- Persistent state: The compressed state is saved to the task_memory database table, so progress survives crashes and can be resumed.
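The threshold, fallback, and circuit breaker rules can be sketched as a small state machine. This is an illustration under stated assumptions: the class name, the `summarize` callback, and the `keep_recent` parameter are hypothetical, and pair preservation and database persistence are not modeled.

```python
COMPACTION_THRESHOLD = 0.70  # fraction of the context window
MAX_FAILURES = 3             # consecutive failures before truncation-only mode


class Compactor:
    def __init__(self, summarize):
        self.summarize = summarize  # callable that condenses old messages; may raise
        self.consecutive_failures = 0
        self.truncation_only = False

    def should_compact(self, used_tokens, context_window):
        return used_tokens >= COMPACTION_THRESHOLD * context_window

    def compact(self, messages, keep_recent=4):
        """Summarize older messages, or drop them if summarization is unavailable."""
        older, recent = messages[:-keep_recent], messages[-keep_recent:]
        if not older:
            return messages
        if not self.truncation_only:
            try:
                summary = self.summarize(older)
                self.consecutive_failures = 0
                return [summary] + recent
            except Exception:
                self.consecutive_failures += 1
                if self.consecutive_failures >= MAX_FAILURES:
                    self.truncation_only = True
        # Truncation fallback: keep only the most recent messages.
        return recent
```

With a 4096-token window, compaction in this sketch starts once usage reaches 2868 tokens.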
Crash Safety and Task Resume
The full conversation transcript is persisted to the SQLite database before every API call. If the application crashes or is force-quit, no conversation data is lost. Key features:
- Task checkpoints are saved at key milestones during execution.
- Interrupted or crashed tasks can be resumed from the Dashboard by clicking the Resume button.
- Bulk resume: Select multiple interrupted tasks and resume them all at once.
- Workflow checkpoints: Multi-agent workflows save progress per role, so a crash mid-workflow resumes from the last completed role handoff.
- Context compression state is preserved, so resumed tasks maintain their compressed context.
Provider Health and Circuit Breaker
Provider health is tracked passively from the outcome of each API request, so no separate health-check traffic is generated. If a provider fails 5 consecutive times, the circuit breaker opens and the agent automatically switches to the next provider in the fallback order. After 60 seconds, a single probe request tests whether the provider has recovered. You can view provider health status in Settings > Billing.
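A circuit breaker with these numbers can be sketched as follows. The class is a generic illustration of the pattern, not Aura Workshop's implementation; in particular, reopening the circuit after a failed probe is an assumption.

```python
class CircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_s=60.0):
        self.failure_threshold = failure_threshold
        self.recovery_s = recovery_s
        self.consecutive_failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow_request(self, now):
        """Return True if a request may be sent to this provider."""
        if self.opened_at is None:
            return True
        if now - self.opened_at >= self.recovery_s:
            # Let one probe through and restart the timer so only one is in flight.
            self.opened_at = now
            return True
        return False

    def record_success(self):
        self.consecutive_failures = 0
        self.opened_at = None

    def record_failure(self, now):
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            self.opened_at = now
```

A caller checks `allow_request` before each call and reports the outcome afterwards, which keeps the tracking passive.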
Execution Modes
| Mode | Description |
|---|---|
| Native | Commands run directly on your host OS via sh -c (macOS/Linux) or cmd /C (Windows). Default when Docker is not available. No container overhead. |
| Docker | Commands run in isolated Docker containers. Provides sandboxing and dependency isolation at the cost of startup latency. |
Execution Backend
The execution_backend setting determines where agent commands physically run:
- local: Commands run on the local machine (default).
- docker: Commands run in Docker containers.
- ssh: Commands run on a remote machine via SSH.
- singularity: Commands run in Singularity/Apptainer containers (for HPC environments).
Settings: Security
- Biometric authentication toggle: Enable Touch ID (macOS) or Windows Hello for accessing credentials and sensitive operations.
- Password lock: Set a password that must be entered to access the Settings panel. Provides an additional layer of protection beyond biometrics.
- License key entry: Enter and activate your license key.
- License status display: View the current license tier, expiration date, and feature entitlements.
- Session management: View and revoke active JWT sessions for the web UI.
Settings: Connectivity
| Setting | Description | Default |
|---|---|---|
| Web Server Enabled | Toggle the embedded HTTP server on or off | On |
| Port | HTTP port for the web server | 18800 |
| Auth Token | Optional Bearer token for remote access authentication. When set, all API requests must include this token. | Empty (no auth) |
Environment variable overrides: AURA_WEB_ENABLED, AURA_WEB_PORT, AURA_WEB_TOKEN.
Settings: Integrations
- Web Search provider: Select from DuckDuckGo (no key required), Google Custom Search (requires API key + search engine ID), Brave Search (requires API key), Serper (requires API key), or Bing Search (requires API key).
- Email configuration: Choose between Auto (system default mail client) or SMTP (manual server configuration with host, port, username, password, from name, and from address). Includes a Test Email button.
- Voice / TTS provider: Enable or disable text-to-speech. Select provider: System (OS built-in), OpenAI (requires API key), or ElevenLabs (requires API key). Choose voice and speech rate.
- Speech-to-text provider: Select from Whisper, System, Groq, OpenAI, or xAI for voice input transcription.
Settings: Skills
- Skill list: All installed skills with columns for name, category, and a preview of the prompt text.
- Edit button: Opens the skill editor where you can modify the skill's prompt, metadata, and chain configuration.
- Delete button: Remove a skill from the installation.
- New Skill button: Create a brand new skill with a name, category, description, and prompt template.
- Skill settings: Per-skill configuration overrides for advanced customization.
Settings: Plugins
Plugins extend Aura Workshop with additional capabilities beyond the built-in features.
- Plugin list: View all installed plugins with their name, version, status (enabled/disabled), and description.
- Enable/Disable toggle: Toggle individual plugins on or off. Disabled plugins are not loaded and consume no resources.
- Plugin configuration: Each plugin may expose its own configuration fields. Click the settings icon next to a plugin to access its specific settings.
- Plugin installation: Install new plugins from the plugin registry or by uploading a plugin package.
- Plugin updates: Check for and apply updates to installed plugins.
Plugins are managed via the REST API at /api/plugins for programmatic control.
Settings: Design
The Design settings control visual aspects used by the design skills and generated artifacts.
- Design system tokens: View and manage design system tokens including:
  - Color palettes: primary, secondary, accent, background, surface, text colors
  - Typography: font families, sizes, weights, line heights
  - Spacing: margin and padding scales
  - Border radii: corner radius values for UI components
  - Shadow definitions: elevation levels for depth effects
- Theme selection: Choose from available UI themes (dark, light, or custom) for the application interface.
- Custom themes: Create and save custom themes by adjusting the design token values. Custom themes are persisted to the database.
Design tokens are available to agents via the /api/design-systems endpoint and are used by the design skills (taste-skill, impeccable, ui-ux-pro-max, etc.) when generating UI artifacts.
Settings: MCPs
Manage MCP server connections (see the MCP Servers section for full details). This tab provides the same interface described there: add, edit, delete, connect, disconnect, configure isolation mode, set OAuth credentials, and import from JSON.
Settings: Data Management
| Action | Description |
|---|---|
| Clear conversation history | Delete all task messages and conversation data from the database. Tasks themselves are preserved. |
| Reset API keys | Remove all stored provider API keys from the encrypted credential store. |
| Clear model cache | Remove cached model metadata and force re-fetching from providers. |
| Reset database | Full database reset to factory defaults. All settings, tasks, conversations, and configurations are deleted. |
| Reset app data | Complete application reset including all settings, database, downloaded models, and cached files. |
| Diagnostics | View system information (OS, architecture, memory, disk), database statistics (table counts, sizes), and dependency status. |
Each destructive action requires confirmation before execution.
Settings: Teams
- Team list: View all configured teams with their names, member roles, and workflow type (sequence/parallel/fanout).
- New Team button: Opens the Team Builder form to create a new team.
- Edit: Modify team composition, roles, or workflow type.
- Delete: Remove a team configuration.
- Quick-start templates: Pre-configured team templates for common configurations (Software Dev Team, Content Writing Team, Research Team).
Settings: Roles & Prompts
- Role list: All 20 built-in roles plus any custom roles, grouped by category (Software, Content, Business, Data, Design, Operations).
- Role editor: Create or edit roles with fields for name, description, category, system prompt (multi-line text), and tool permissions (checkboxes for each available tool).
- Duplicate button: Clone an existing role as a starting point for a new custom role.
- Delete button: Remove custom roles. Built-in roles cannot be deleted.
- Per-role model selection: Optionally assign a specific model to a role.
Settings: Workflows
The visual workflow editor provides the canvas-based interface described in the Workflows section. From this settings tab you can:
- Create new workflows
- Edit existing workflows in the visual editor
- Import workflows from JSON files
- Export workflows as JSON for sharing
- Delete workflows
- View workflow run history
Settings: Credentials
- Credential list: Shows all stored credentials with name, type, and created/updated timestamps. Values are never displayed in the list.
- Add credential: Create a new credential entry with a name, type (API key, token, password, SSH key), and value.
- View credential: Retrieve a credential's decrypted value. Requires biometric authentication (Touch ID or Windows Hello).
- Delete credential: Remove a credential permanently.
- Encryption: All values are encrypted with AES-256-GCM before writing to the database. The encryption key is stored in the system keychain.
- Usage: Credentials are referenced by ID in listeners, webhooks, MCP server connections, and remote deployments, so tokens are never exposed in configuration files.
Settings: Cloud Storage
- AWS S3: Connect with access key and secret key. Configure bucket name and region.
- Google Cloud Storage: OAuth-based authentication. Configure bucket name.
- Azure Blob Storage: Connection string or managed identity. Configure container name.
Connected cloud storage is used by agents for storing generated files, backups, and shared artifacts.
Settings: Memory
- View memories: List all saved agent memories organized by type (user, feedback, project, reference). Each entry shows a title, content preview, and timestamps.
- Add memory: Manually create a new memory entry.
- Edit memory: Modify an existing memory entry.
- Delete memory: Remove individual memory entries.
- Search and filter: Filter memories by type, scope (user vs. project), or search by content.
- Memory locations: ~/.aura/memory/ for user-scope memories, .aura/memory/ in the project root for project-scope memories.
- AURA.md editor: Create and edit project instruction files (see Cross-Session Memory for details).
- Learned facts viewer: Browse, search, and delete LLM-extracted memory facts. Facts are color-coded by category with confidence percentages.
Settings: Routing
- Smart routing toggle: Enable or disable automatic cost-based model routing.
- Tier configuration: Set the model and boundary score for each tier (Simple, Standard, Complex, Reasoning).
- Free models only toggle: Restrict routing to free models.
- Routing analytics: View tasks per tier, actual vs. hypothetical cost, and total savings.
Settings: Billing & Usage
- Usage dashboard: Summary cards for today's cost, this month's cost, total cost, and today's tokens.
- Spend limits: Set maximum monthly spend per provider. When reached, the system switches to the next provider in the fallback order.
- Provider fallback order: Define the priority sequence for automatic provider switching.
- Model pricing table: Editable pricing with input/output cost per million tokens. Pre-seeded with current market rates. Reset to defaults button.
- Daily usage charts: Interactive bar chart showing daily spend over the last 14 days, plus area charts for per-model daily breakdown.
- Provider summary table: Per-provider totals with input/output token counts and costs.
Settings: Commands
- Slash command list: View all custom slash commands with name, description, handler type, and enabled status.
- Create / edit / delete slash commands (see Slash Commands).
Settings: Models
- Provider API key management: Enter and manage API keys for all supported providers.
- Provider URL overrides: Override the default base URL for any provider (useful for proxies or enterprise endpoints).
- OpenAI Organization / Project IDs: Set optional Organization ID and Project ID for OpenAI API calls.
- Model parameter defaults: Set global default sampling parameters.
Settings: Updates
- Check for updates: Query the update server for new versions.
- Version comparison: See a changelog of what has changed between your current version and the latest.
- Download and install: One-click update with automatic restart. The application downloads the update, verifies integrity, and restarts.
- Update toast: A notification appears automatically when an update is available, with a button to apply it.
Spend Tracking & Billing
Spend Summary Cards
At the top of the Billing tab, four summary cards provide at-a-glance metrics:
- Today's cost: Total spend across all providers for the current day.
- This month's cost: Cumulative monthly spend.
- Total cost (all time): Lifetime spend since first use.
- Per-provider breakdown: A card for each active provider showing its individual monthly spend.
Daily Usage Chart
An interactive bar chart visualizes daily spend over the last 14 days. Hover over any bar to see the exact cost for that day. Below the bar chart, area charts show per-model daily usage so you can identify which models are driving costs.
Model Usage Breakdown Table
A detailed table lists every model used during the billing period, with columns for model name, provider, input tokens, output tokens, number of requests, and total cost.
Spend Limits
Set a maximum monthly spend for each provider. The system enforces the limit by automatically switching to the next provider in the fallback order when a limit is reached. You can also set alert thresholds that trigger a notification before the hard limit is hit.
Provider Fallback Order
Define the priority sequence for provider switching. Drag and drop providers to reorder them. When the primary provider hits its spend limit, encounters rate limits (HTTP 429), or returns server errors (HTTP 503), the system transparently switches to the next provider in the list.
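The selection logic can be illustrated with a short sketch that walks the fallback order and skips providers that are over their spend limit or currently failing. The function names and data shapes are assumptions made for this example.

```python
FALLBACK_STATUS_CODES = {429, 503}  # rate limited, service unavailable


def should_fall_back(status_code):
    """Return True for HTTP statuses that trigger a provider switch."""
    return status_code in FALLBACK_STATUS_CODES


def pick_provider(order, month_spend, limits, unavailable=()):
    """Return the first provider in the fallback order that can take a request."""
    blocked = set(unavailable)
    for provider in order:
        limit = limits.get(provider)  # a missing limit means no spend cap
        over_limit = limit is not None and month_spend.get(provider, 0.0) >= limit
        if provider not in blocked and not over_limit:
            return provider
    return None
```

Returning `None` when every provider is exhausted lets the caller surface an error instead of silently retrying.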
Provider Pricing Override
The model pricing table shows input and output cost per million tokens for every model. These values are pre-seeded with current market rates but can be edited manually. A "Reset to Defaults" button restores the original pricing data.
Usage Stats in Context Panel
During task execution, the Context Panel (right sidebar) shows real-time usage: Context Usage percentage (with color-coded bar), Input Tokens, Output Tokens, Cache Read tokens, Total Cost, Latency, Model name, and Provider name.
Web UI (Browser Access)
Aura Workshop includes an embedded axum HTTP server that serves the full SolidJS frontend through any web browser. Port 18800 is the default.
Enabling the Web Server
- Go to Settings > Connectivity.
- Toggle Web UI Server on.
- Set the port (default: 18800).
- Optionally set a Bearer token for authentication.
- Open http://localhost:18800 in a browser on the same machine, or replace localhost with the host's address to connect from another device on the network.
Authentication
When a Bearer token is configured, all API requests must include Authorization: Bearer your-secret-token in the headers. The /api/health and /api/heartbeat/incoming endpoints are always public and do not require authentication.
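As a minimal sketch of an authenticated client, the helper below builds a request with the standard library and attaches the Bearer header only when a token is supplied. The helper name is illustrative; the base URL uses the default port.

```python
import urllib.request


def api_request(path, token=None, base_url="http://localhost:18800"):
    """Build a request for the REST API, adding the Bearer header if a token is set."""
    request = urllib.request.Request(base_url.rstrip("/") + path)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    return request
```

Passing the result to `urllib.request.urlopen` sends it. A call such as `api_request("/api/health")` needs no token because that endpoint is public.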
Environment Variable Configuration
export AURA_WEB_ENABLED=true
export AURA_WEB_PORT=18800
export AURA_WEB_TOKEN=your-secret-token
Feature Parity
The browser-based UI has full feature parity with the desktop app: all navigation views, real-time streaming via SSE, the complete REST API, file upload, folder selection, voice input, and all settings tabs.
Remote Agent Deployment
Deploy AI agents to remote machines via SSH for always-on, headless operation.
How Deployment Works
- Go to Settings > Agents or use the deploy_remote tool.
- Provide SSH connection details: hostname, username, and authentication method (key file, password, or saved credential).
- Select the deployment mode: Full Agent, Inference Master, Inference Worker, or Worker.
- Aura Workshop connects via SSH, detects the remote OS, auto-installs Docker if needed, pulls the Docker image, and starts the daemon container with --net=host.
SSH Authentication Methods
| Method | Description |
|---|---|
| Key file | Path to an SSH private key file (RSA, Ed25519, etc.). The most secure and recommended method. |
| Password | Username/password authentication. Uses sshpass for non-interactive authentication. |
| Saved credential | References a credential from the encrypted credential store by ID. Keys are decrypted at deployment time. |
Deployment Modes
| Mode | Description |
|---|---|
| Full Agent | Complete daemon with tasks, inference, listeners, webhooks, and workflows. |
| Inference Master | Manages an inference cluster: models, workers, serves the inference API. |
| Inference Worker | Broadcasts on LAN, joins a master's cluster, contributes GPU resources via RPC. |
| Worker | Task execution worker that uses a master's inference endpoint for model access. |
Monitoring
- Heartbeat: Deployed agents send periodic heartbeats to the master for status monitoring. If heartbeats stop, the deployment is marked as offline.
- Pairing codes: Remote deployments can be paired with the desktop app using a one-time pairing code for secure initial connection.
- Status indicators: Each remote deployment shows its connection status (online/offline), uptime, daemon mode, and resource usage.
- Viewer access: The remote deployment serves the viewer SPA for browser-based monitoring. Access it at http://remote-host:18800.
Deploying Automation to Remote Agents
Schedules, listeners, and webhooks can be deployed to remote machines. When you create an automation item and select a target agent, the configuration is synced to the remote deployment. Deleting an automation item also cleans up its remote deployment automatically.
Docker Daemon
The aura-daemon binary provides headless operation for Docker deployments.
Repository and Tags
- Repository: coolkoo/aura-workshop
- Tags: daemon-latest, daemon-{version}, daemon-arm64
- Platforms: linux/amd64 (with CUDA runtime), linux/arm64 (with Vulkan drivers)
4 Daemon Modes
| Mode | Flag | Purpose |
|---|---|---|
| full | --mode full | Complete daemon: tasks, inference, listeners, webhooks, workflows, and full REST API |
| inference-master | --mode inference-master | Manages the inference cluster: model distribution, worker coordination, serves the inference API |
| inference-worker | --mode inference-worker | Broadcasts on LAN, joins a master's cluster, contributes GPU resources via RPC |
| worker | --mode worker | Task execution worker that uses a master's inference endpoint for model access |
Running the Daemon
docker run -d \
--name aura-daemon \
--net=host \
-v ~/.aura:/root/.aura \
-v $(pwd)/data:/data \
-e AURA_WEB_ENABLED=true \
-e AURA_WEB_PORT=18800 \
-e AURA_API_KEY=your-api-key \
-e AURA_MODEL=deepseek-chat \
-e AURA_BASE_URL=https://api.deepseek.com \
coolkoo/aura-workshop:daemon-latest \
--mode full
Environment Variables
| Variable | Description | Default |
|---|---|---|
| AURA_WEB_ENABLED | Enable the embedded web server | true |
| AURA_WEB_PORT | Web server port | 18800 |
| AURA_WEB_TOKEN | Bearer token for API authentication | (none) |
| AURA_DB_PATH | SQLite database file path | /data/aura-workshop.db |
| AURA_API_KEY | API key for the LLM provider | (none) |
| AURA_MODEL | Model identifier to use | (none) |
| AURA_BASE_URL | Base URL for the LLM provider API | (none) |
| AURA_NATIVE_MODE | Use native mode instead of Docker-in-Docker | false |
| AURA_REMOTE_DEPLOYMENT | Mark as a remote deployment | false |
| AURA_VIEWER_MODE | Serve the viewer SPA instead of the full UI | false |
| AURA_PAIRING_CODE | One-time pairing code for desktop app connection | (none) |
| AURA_HEARTBEAT_URL | URL to send heartbeats to the master instance | (none) |
| AURA_DEPLOYMENT_ID | Unique identifier for this deployment | (none) |
| AURA_DAEMON_MODE | Daemon operating mode | full |
| AURA_RPC_PORT | RPC port for distributed inference | 50052 |
| AURA_MASTER_INFERENCE_URL | URL of the master's inference API (worker mode) | (none) |
Health Check
The Docker image includes a built-in health check:
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD curl -f http://localhost:18800/api/health || exit 1
What Is Included in the Image
- aura-daemon binary (Rust) -- the headless core of Aura Workshop
- aura-inference binary (Go + llama.cpp) -- architecture-appropriate version for local model inference
- Viewer frontend SPA for browser-based monitoring and interaction
- All bundled skills across all categories (documents, Anthropic, superpowers, desktop apps, design, platform, media)
- Tesseract OCR engine for image text extraction via the ocr tool
- CUDA runtime libraries (amd64 image) or patched Mesa Vulkan drivers (arm64 image) for GPU acceleration
Volume Mounts
The recommended volume mounts for a full daemon deployment:
# ~/.aura: user memory, roles, CLI config; ./data: database storage; ./models: GGUF models
docker run -d \
--name aura-daemon \
--net=host \
-v ~/.aura:/root/.aura \
-v $(pwd)/data:/data \
-v $(pwd)/models:/root/.cache/aura-inference/models \
coolkoo/aura-workshop:daemon-latest \
--mode full
| Host Path | Container Path | Purpose |
|---|---|---|
| ~/.aura | /root/.aura | User memory files, custom roles, CLI configuration |
| ./data | /data | SQLite database, charts, generated files |
| ./models | /root/.cache/aura-inference/models | Downloaded GGUF model files |
GPU Passthrough
For NVIDIA GPUs on Linux, use the --gpus all flag to pass through GPU resources:
docker run -d --net=host --gpus all \
-v $(pwd)/models:/root/.cache/aura-inference/models \
-v $(pwd)/data:/data \
coolkoo/aura-workshop:daemon-latest \
--mode inference-master
Ensure the NVIDIA Container Toolkit is installed on the host. On macOS with Colima, GPU passthrough requires krunkit.
Auto Docker Installation for Remote Deployment
When deploying to remote machines via SSH, Aura Workshop can auto-install Docker:
- Linux: Uses the official get.docker.com installation script.
- macOS: Installs Colima via Homebrew (with optional krunkit for GPU passthrough).
- Windows: Provides manual installation instructions with links to Docker Desktop documentation.
Headless Mode
For environments without a display server (headless Linux servers), you can run Aura Workshop without a GUI:
xvfb-run aura-workshop
The application starts without rendering a window but serves the full web UI for browser-based access. For production headless deployments, the Docker daemon is the recommended approach.
Headless mode is configurable via settings: when the headless flag is enabled, the application skips all GUI initialization and runs purely as a server process. All features remain available through the REST API and web UI.
Security Architecture
Encryption at Rest
- AES-256-GCM: All credentials and API keys stored in the database are encrypted using AES-256-GCM (via the aes-gcm Rust crate). Each value has a unique nonce.
- System keychain: The master encryption key is stored in the operating system's secure keychain (macOS Keychain, Windows Credential Manager, Linux secret service). It never touches the filesystem.
Authentication
- Biometric auth: Touch ID (macOS) and Windows Hello gate access to credential retrieval and sensitive settings.
- JWT sessions: The web UI uses JSON Web Tokens (via the jsonwebtoken Rust crate) for session management.
- Bearer token: The REST API supports optional Bearer token authentication for remote access.
Webhook Security
Incoming webhook requests can be validated using HMAC-SHA256 signatures against a shared secret. Invalid signatures are rejected before processing.
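As a rough illustration of this kind of check, the sketch below signs and verifies a raw request body with HMAC-SHA256 using Python's standard library. It assumes the signature travels as `sha256=<hex digest>`, the format suggested by the `X-Hub-Signature-256` header in the webhook example later in this reference; the function names are hypothetical and this is not Aura Workshop's actual implementation.

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Return a 'sha256=<hex>' signature for the raw request body (assumed format)."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_signature(secret: str, body: bytes, header_value: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_payload(secret, body)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))
```

The signature must be computed over the exact bytes that are sent; re-serializing the JSON on either side can change whitespace and invalidate the signature.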
Download Verification
Model downloads from HuggingFace are verified using SHA2 checksums to prevent tampering or corruption.
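To verify a downloaded file by hand, a checksum can be computed in chunks so that multi-gigabyte GGUF files never need to fit in memory. The sketch below assumes SHA-256 as the specific SHA-2 variant, which the text above does not state; the helper names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Stream the file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: Path, expected_hex: str) -> bool:
    """Compare against a published checksum, ignoring case and surrounding whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```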
Command Safety Guardrails
The agent middleware blocks dangerous shell commands automatically and unconditionally. Fork bombs, recursive root deletion, disk formatting, and other destructive patterns are always blocked regardless of configuration. These guardrails cannot be disabled.
Zero Telemetry
Aura Workshop sends zero telemetry and includes no analytics tracking. All data remains on your machine. The application is fully functional in air-gapped environments when paired with local models.
Local-First Architecture
All data is stored in a local SQLite database. There is no cloud sync, no remote storage of your conversations, and no data leaving your machine except the LLM API calls you explicitly configure. When using local inference (Aura AI or Ollama), no data leaves your network at all.
Execution Isolation
Three isolation levels are available for agent commands:
- none: Commands run directly on the host.
- sandbox: Commands run in a restricted process sandbox.
- container: Commands run inside Docker containers with limited host access.
Chrome Extension
Aura Workshop includes a Chrome extension that provides a side panel chat interface and browser automation capabilities.
Side Panel
- Opens as a Chrome side panel with a full chat interface.
- Renders Markdown, code blocks, and inline formatting in responses.
- Connects to Aura Workshop via WebSocket at /ws/sidepanel.
- Supports streaming responses, tool usage display, and real-time updates.
- Tab-aware: the agent can see the current URL and page context of the active tab.
- File attachment support for including documents in conversations.
Context Menus
Right-click on any text, link, or page element to access context menu options for quick agent invocation. For example, you can right-click selected text and choose "Ask Aura about this" to send it directly to the agent with the page context.
Screenshot Capture
The extension can capture screenshots of the current browser tab and send them to the agent for visual analysis.
Browser Automation via Agents
Through the /ws/browser WebSocket connection, agents can control the browser programmatically using the browser_action tool. Supported actions include:
- Navigate: Open a URL in the browser
- Click: Click on a page element by selector or coordinates
- Type: Enter text into form fields
- Take screenshot: Capture the current page state as an image
- Get page content: Extract the text content of the current page
- Extract elements: Find and extract specific DOM elements by CSS selector
- Scroll: Scroll the page up, down, or to a specific element
- Wait: Wait for a specific element to appear or a condition to be met
Setup
- Load the extension from the extension/ directory in Chrome's developer mode (navigate to chrome://extensions, enable Developer mode, and click "Load unpacked").
- Click the extension icon and configure it to point to your Aura Workshop web server URL (for example, http://localhost:18800).
- If authentication is enabled on your server, enter the Bearer token in the extension settings.
- The extension icon turns green when successfully connected to the server.
Embeddable Chat Widget
Embed an Aura Workshop-powered chat interface on any website with a simple JavaScript snippet.
<script src="http://your-server:18800/widget.js"></script>
<script>
AuraWidget.init({
serverUrl: "http://your-server:18800",
token: "your-auth-token", // optional
position: "bottom-right", // bottom-right or bottom-left
title: "AI Assistant", // widget title
greeting: "How can I help?", // initial greeting message
theme: "dark" // dark or light
});
</script>
Features
- WebSocket real-time communication: Messages stream in real-time via WebSocket.
- Custom styling: Configure color scheme, position (bottom-right or bottom-left), title text, and greeting message.
- Persistent history: Conversation history persists per user across page loads.
- Markdown rendering: Responses render with full Markdown formatting including code blocks.
- Configuration: Set up the widget by creating a "chatbot" type listener in the Listeners tab.
Voice & TTS
Speech-to-Text Providers
| Provider | Description | Requirements |
|---|---|---|
| Whisper | OpenAI's Whisper model for transcription | OpenAI API key |
| System | Operating system's built-in speech recognition | None |
| Groq | Ultra-fast transcription via Groq LPU | Groq API key |
| OpenAI | OpenAI's transcription API | OpenAI API key |
| xAI | xAI's transcription service | xAI API key |
Text-to-Speech Providers
| Provider | Description | Requirements |
|---|---|---|
| System | OS built-in TTS (macOS say command, Windows SAPI) | None |
| OpenAI TTS | High-quality neural text-to-speech via OpenAI API | OpenAI API key |
| ElevenLabs | Premium voice synthesis with a wide selection of voices | ElevenLabs API key |
Voice Input Button
The Voice button in the task composer activates the microphone. A pulsing red dot appears when recording. Speak your task description and the audio is transcribed using the configured speech-to-text provider and inserted into the textarea.
Configurable Voices and Speech Rate
In Settings > Integrations, select the desired voice from the provider's available options and adjust the speech rate to your preference.
Voice Configuration Details
| Setting | Description | Options |
|---|---|---|
| TTS Enabled | Master toggle for text-to-speech on agent responses | On/Off |
| TTS Provider | Which service generates the speech audio | System, OpenAI, ElevenLabs |
| Voice | Specific voice to use (depends on provider) | Provider-specific list |
| Speech Rate | Speed of the generated speech | 0.5x to 2.0x |
| STT Provider | Which service transcribes voice input | Whisper, System, Groq, OpenAI, xAI |
When TTS is enabled, agent responses are automatically converted to audio and played through your speakers. You can also trigger speech manually for any specific message.
Cross-Session Memory
Aura Workshop maintains a persistent memory system that learns from every task and adapts to your preferences over time. Memory persists across sessions and even across application restarts.
Memory Types
| Type | Location | Scope | How Created |
|---|---|---|---|
| User memory | ~/.aura/memory/*.md | Global (all projects) | Agent saves via tool or manual creation |
| Feedback memory | memory_facts table | Global or project-scoped | LLM-extracted corrections and reinforcements |
| Project memory | .aura/memory/*.md | Per-project | Auto-extracted from file operations after task completion |
| Reference memory | task_memory table | Per-task | Handoff context between team roles and compaction summaries |
Memory Fact Categories
Facts extracted from conversations are categorized with confidence scores:
| Category | Scope | Confidence | Example |
|---|---|---|---|
| preference | Global | 0.9 | "User prefers Python over JavaScript for backend" |
| correction | Global | 0.95 | "User correction: use FastAPI not Flask" |
| reinforcement | Global | 0.90 | "User confirmed: single bundled PR is correct" |
| knowledge | Project | 0.6-0.9 | "Project uses PostgreSQL on port 5432" |
| file_operation | Project | 0.7 | "File modified: src/models.py" |
| context | Project | 0.6-0.8 | "API routes defined in src/routes/" |
| behavior | Project | 0.7 | "Always run tests after code changes" |
| goal | Project | 0.7 | "Goal is to migrate from Express to FastAPI" |
Auto-Extraction
After every task, the system runs an LLM call to extract structured facts and applies pattern detection to the conversation for corrections ("no", "wrong", "instead") and reinforcements ("yes", "perfect", "great"). Corrections are saved at 0.95 confidence; reinforcements at 0.90.
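The pattern-detection half of this step can be pictured with a small keyword classifier. This is only a sketch built from the trigger words and confidence values listed above: whole-word matching and checking corrections before reinforcements are assumptions, and `classify_feedback` is a hypothetical name.

```python
import re
from typing import Optional, Tuple

CORRECTION_WORDS = ("no", "wrong", "instead")
REINFORCEMENT_WORDS = ("yes", "perfect", "great")


def _contains_word(text: str, words: Tuple[str, ...]) -> bool:
    """Match whole words only, so 'nothing' does not count as 'no'."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return any(word in tokens for word in words)


def classify_feedback(message: str) -> Optional[Tuple[str, float]]:
    """Return (category, confidence) for a user message, or None if neither applies."""
    if _contains_word(message, CORRECTION_WORDS):
        return ("correction", 0.95)
    if _contains_word(message, REINFORCEMENT_WORDS):
        return ("reinforcement", 0.90)
    return None
```

For example, `classify_feedback("No, use FastAPI instead of Flask")` yields a correction, while a neutral request such as "Add a login page" yields `None`.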
AURA.md -- Project Instructions
Drop an AURA.md (or CLAUDE.md) file in your project root with rules the agent must follow. The system scans for: AURA.md, .aura.md, CLAUDE.md, .claude.md, .aura/INSTRUCTIONS.md, scanning from the project root up to two parent directories. Cap: 4 KB per file, 12 KB total.
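The lookup can be sketched as a walk over the project root and its two nearest parent directories. The file names and size caps come from the paragraph above; truncating an oversized file rather than skipping it, and the search order, are assumptions made for illustration.

```python
from pathlib import Path
from typing import List, Tuple

CANDIDATE_NAMES = ["AURA.md", ".aura.md", "CLAUDE.md", ".claude.md", ".aura/INSTRUCTIONS.md"]
PER_FILE_CAP = 4 * 1024    # 4 KB per file
TOTAL_CAP = 12 * 1024      # 12 KB across all files


def load_project_instructions(project_root: Path) -> List[Tuple[Path, str]]:
    """Collect instruction files from the root and up to two parent directories."""
    root = project_root.resolve()
    directories = [root, *list(root.parents)[:2]]
    loaded: List[Tuple[Path, str]] = []
    remaining = TOTAL_CAP
    for directory in directories:
        for name in CANDIDATE_NAMES:
            if remaining <= 0:
                return loaded
            candidate = directory / name
            if not candidate.is_file():
                continue
            # Assumption: oversized files are truncated to the cap, not rejected.
            data = candidate.read_bytes()[: min(PER_FILE_CAP, remaining)]
            loaded.append((candidate, data.decode("utf-8", errors="ignore")))
            remaining -= len(data)
    return loaded
```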
AURA.md Editor
Create and edit AURA.md from Settings > Memory > Project Instructions. The editor provides:
- 7 predefined rule presets: TypeScript Strict, Python Best Practices, Test First, Clean Git, API Design, Security, Documentation
- Directory path selector to specify where to save the file
- Monospace editor with syntax-appropriate font
- Load Existing button to read an existing AURA.md from any directory
- Save button that writes the file immediately
Git Context Injection
When a task's project is a git repository, the agent automatically receives current git state in its system prompt:
- git status --short: Current uncommitted changes
- git log --oneline -5: The 5 most recent commits
This helps the agent understand what you have been working on and avoid conflicting changes. Toggle this feature in Settings > General > Agent Context.
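To preview roughly what the agent sees, the two commands can be run from a script. The sketch below uses only the commands named above; the output layout and the choice to return an empty string outside a git repository are assumptions, not the format Aura Workshop injects.

```python
import subprocess
from pathlib import Path

GIT_CONTEXT_COMMANDS = [
    ["git", "status", "--short"],
    ["git", "log", "--oneline", "-5"],
]


def collect_git_context(project_path: Path) -> str:
    """Run both commands in project_path; return '' if git is missing or any command fails."""
    sections = []
    for command in GIT_CONTEXT_COMMANDS:
        try:
            result = subprocess.run(
                command, cwd=project_path, capture_output=True, text=True, timeout=10
            )
        except (OSError, subprocess.TimeoutExpired):
            return ""
        if result.returncode != 0:
            return ""
        sections.append("$ " + " ".join(command) + "\n" + result.stdout.strip())
    return "\n\n".join(sections)
```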
Memory Viewer
View, search, and delete learned facts in Settings > Memory > Learned Facts. Facts are color-coded by category:
- Red: Corrections (highest confidence, 0.95)
- Blue: Preferences (high confidence, 0.9)
- Green: Reinforcements (high confidence, 0.9)
- Gray: Knowledge, context, and other facts (variable confidence)
Each fact displays its content, confidence percentage, category, scope (global or project), and creation date. Delete individual facts by clicking the trash icon.
Prompt Cache Stats
The static portion of the system prompt (including memory, AURA.md, and role definitions) is cached by providers that support prompt caching (Anthropic, OpenAI, Google). View cache hit rates and token savings in Settings > Billing > Prompt Cache Stats. Higher cache hit rates mean faster response times and lower costs.
Memory Scope and Isolation
- Global facts (preference, correction, reinforcement) are available to all tasks across all projects.
- Project-scoped facts (knowledge, context, file_operation, behavior, goal) only appear for tasks in the same project path. A React project's facts will not leak into a Python project.
Memory Injection
Before every task, the system prompt is enriched with relevant memory through a multi-layer injection process:
- File-based memories: The top 15 entries (approximately 2000 character budget) from ~/.aura/memory/ (user scope) and .aura/memory/ (project scope) are loaded.
- Keyword-searched facts: The user's message is tokenized into keywords and searched against the memory_facts table, filtered by project isolation rules (approximately 1500 character budget).
- Recent high-confidence facts: The 5 most recent facts with confidence scores at or above 0.8 are always included regardless of keyword matching.
- Pre-compaction flush: Before context compaction occurs, file operation facts are extracted from messages that are about to be dropped, ensuring no learnings are lost during compression.
This multi-layer approach ensures that the agent always has access to the most relevant context, from explicit project rules to learned preferences and domain knowledge.
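The first three layers can be approximated in a few lines. This is a simplified sketch under stated assumptions: the `Fact` shape, the keyword tokenizer, and the way budgets are enforced are all hypothetical; only the limits (15 entries, about 2000 and 1500 characters, 5 recent facts at 0.8 or above) come from the list above.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fact:
    content: str
    category: str
    confidence: float
    created_at: float                   # Unix timestamp
    project_path: Optional[str] = None  # None means global scope


def take_within_budget(items: List[str], max_items: int, char_budget: int) -> List[str]:
    """Keep items in order until the count or character budget is exhausted."""
    chosen, used = [], 0
    for item in items[:max_items]:
        if used + len(item) > char_budget:
            break
        chosen.append(item)
        used += len(item)
    return chosen


def _tokens(text: str) -> set:
    return {w for w in re.findall(r"[a-z0-9_]+", text.lower()) if len(w) > 2}


def build_memory_context(file_memories: List[str], facts: List[Fact],
                         message: str, project_path: str) -> List[str]:
    # Project isolation: global facts plus facts from this project only.
    scoped = [f for f in facts if f.project_path is None or f.project_path == project_path]
    layer1 = take_within_budget(file_memories, 15, 2000)
    keywords = _tokens(message)
    matched = [f.content for f in scoped if keywords & _tokens(f.content)]
    layer2 = take_within_budget(matched, len(matched), 1500)
    recent = sorted((f for f in scoped if f.confidence >= 0.8),
                    key=lambda f: f.created_at, reverse=True)[:5]
    layer3 = [f.content for f in recent if f.content not in layer2]
    return layer1 + layer2 + layer3
```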
AURA.md vs Memory
Understanding when to use project instructions versus memory:
| Situation | Use | Why |
|---|---|---|
| "Every task in this project must use TypeScript strict" | AURA.md | Deterministic, version-controlled rule |
| "I prefer Python over JavaScript" | Memory | Auto-detected as preference across all projects |
| "Don't use Flask, use FastAPI" | Memory | Auto-detected as correction (0.95 confidence) |
| "All commits must follow conventional format" | AURA.md | Explicit project rule for consistency |
| "The database is PostgreSQL on port 5432" | Memory | Auto-extracted project knowledge fact |
AURA.md = rules you write explicitly. Deterministic, version-controlled, immediately effective.
Memory = facts the agent learns from interactions. Probabilistic, compounds over time, adapts to corrections.
REST API Overview
Aura Workshop exposes approximately 140 REST endpoints through its embedded HTTP server (default port 18800). Every feature available in the desktop app is also accessible via HTTP. All endpoints are prefixed with /api unless otherwise noted.
Base URL: http://localhost:18800/api
Content-Type: application/json for all POST/PUT requests
Authentication: Optional Bearer token (configured in Settings > Connectivity)
Authentication
When a token is configured, include it as a Bearer token in all requests:
curl -H "Authorization: Bearer YOUR_TOKEN" http://localhost:18800/api/tasks
If no token is configured, all API requests are allowed without authentication. The /api/health and /api/heartbeat/incoming endpoints are always public.
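The same header can be attached from a script. The following sketch builds a request with Python's standard library and adds the Bearer header only when a token is supplied; `build_request` is a hypothetical helper, and the default base URL assumes the default port.

```python
import urllib.request
from typing import Optional


def build_request(path: str, token: Optional[str] = None,
                  base_url: str = "http://localhost:18800") -> urllib.request.Request:
    """Create a GET request for an API path, adding the Bearer header if a token is given."""
    request = urllib.request.Request(base_url.rstrip("/") + path)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    return request
```

Passing the result to `urllib.request.urlopen(...)` sends the request; public endpoints such as /api/health can be called with no token at all.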
Error Handling
All API endpoints return standard HTTP status codes. Error responses include a JSON body with an error message in the error field and a short identifier in the code field:
# Error response format
{
"error": "Task not found",
"code": "NOT_FOUND"
}
# Common HTTP status codes:
# 200 OK - Request succeeded
# 201 Created - Resource created
# 400 Bad Request - Invalid request body or parameters
# 401 Unauthorized - Missing or invalid auth token
# 404 Not Found - Resource does not exist
# 429 Too Many Requests - Rate limited (forwarded from provider)
# 500 Internal Server Error - Server-side error
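A client can turn these responses into typed exceptions. The sketch below parses the error body shown above and falls back gracefully when the body is not JSON; `ApiError` and `parse_error` are illustrative names, and treating only 429 as retryable is an assumption rather than documented behavior.

```python
import json
from typing import Optional

RETRYABLE_STATUSES = {429}  # assumption: back off and retry only when rate limited


class ApiError(Exception):
    def __init__(self, status: int, message: str, code: Optional[str] = None):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message
        self.code = code

    @property
    def retryable(self) -> bool:
        return self.status in RETRYABLE_STATUSES


def parse_error(status: int, body: bytes) -> ApiError:
    """Build an ApiError from a status code and raw response body."""
    try:
        payload = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    return ApiError(status, payload.get("error", "Unknown error"), payload.get("code"))
```

With `urllib`, this would typically be called from an `except urllib.error.HTTPError as exc:` block as `parse_error(exc.code, exc.read())`.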
Pagination
List endpoints that may return large result sets support optional limit and offset query parameters:
# Get the first 10 tasks
curl -H "Authorization: Bearer $TOKEN" \
"http://localhost:18800/api/tasks?limit=10&offset=0"
# Get the next 10 tasks
curl -H "Authorization: Bearer $TOKEN" \
"http://localhost:18800/api/tasks?limit=10&offset=10"
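Walking every page follows the same limit/offset pattern. In the sketch below, `fetch_page` is a hypothetical callable that performs the HTTP request and returns the decoded list of items; stopping when a page comes back shorter than `limit` is an assumption about how the end of the list is signalled.

```python
from typing import Callable, Iterator, List
from urllib.parse import urlencode


def page_url(endpoint: str, limit: int, offset: int) -> str:
    """Build a paginated URL such as .../api/tasks?limit=10&offset=0."""
    return f"{endpoint}?{urlencode({'limit': limit, 'offset': offset})}"


def iter_pages(fetch_page: Callable[[int, int], List[dict]], limit: int = 10) -> Iterator[dict]:
    """Yield items page by page until a short (or empty) page is returned."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            return
        offset += limit
```

Keeping the HTTP call behind `fetch_page` lets the same loop serve any list endpoint and makes it easy to exercise without a running server.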
Tasks
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/tasks | List all tasks with status, title, timestamps, and model information |
| POST | /api/tasks | Create a new task with a message, optional file attachments, project path, model, and role |
| GET | /api/tasks/{id} | Get a specific task by ID |
| DELETE | /api/tasks/{id} | Delete a task and all its messages |
| GET | /api/tasks/{id}/messages | Get all messages for a task |
| POST | /api/tasks/{id}/messages | Send a follow-up message to an existing task |
| GET | /api/tasks/interrupted | List all interrupted tasks that can be resumed |
| GET | /api/tasks/{id}/files | List files created or modified by a task |
# Create a task
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"message": "Create a Python script that generates prime numbers", "project_path": "/home/user/myproject"}' \
http://localhost:18800/api/tasks
# Response
{
"id": "task_abc123",
"title": "Create a Python script that generates prime numbers",
"status": "executing",
"model": "claude-sonnet-4-20250514",
"created_at": "2026-05-21T10:30:00Z"
}
Get Task Messages
# Get all messages for a task
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/tasks/task_abc123/messages
# Response
{
"messages": [
{
"id": "msg_001",
"role": "user",
"content": "Create a Python script that generates prime numbers",
"timestamp": "2026-05-21T10:30:00Z"
},
{
"id": "msg_002",
"role": "assistant",
"content": "I'll create a Python script...",
"tool_calls": [
{
"tool": "write_file",
"input": {"path": "primes.py", "content": "..."},
"output": "File written successfully",
"status": "ok"
}
],
"timestamp": "2026-05-21T10:30:15Z"
}
]
}
Send a Follow-Up Message
# Send a follow-up message to an existing task
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"message": "Also add a function to check if a number is prime"}' \
http://localhost:18800/api/tasks/task_abc123/messages
List Interrupted Tasks
# Get all tasks that can be resumed
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/tasks/interrupted
# Response
{
"tasks": [
{
"id": "task_xyz789",
"title": "Build a REST API",
"status": "interrupted",
"interrupted_at": "2026-05-21T09:15:00Z"
}
]
}
Conversations and Messages
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/conversations | List all conversations |
| POST | /api/conversations | Create a new conversation |
| DELETE | /api/conversations/{id} | Delete a conversation and its messages |
| PUT | /api/conversations/{id}/title | Update a conversation title |
| GET | /api/conversations/{id}/messages | Get all messages in a conversation |
| POST | /api/conversations/{id}/messages | Add a message to a conversation |
Chat and Agent (SSE Streaming)
These endpoints return Server-Sent Events (SSE) streams for real-time output.
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/chat/send | Send a chat message and stream the response (no tool access) |
| POST | /api/chat/enhanced | Send a chat message with full tool access (streaming) |
| POST | /api/agent/run | Run a standalone agent task with full tool access (streaming) |
| POST | /api/tasks/{id}/run | Run an existing task (streaming) |
| POST | /api/tasks/{id}/resume | Resume an interrupted task (streaming) |
| GET | /api/events | Global SSE event stream for all task, workflow, and system events |
| POST | /api/inference/stop | Stop inference for a specific task (task ID in request body) |
# Run an agent with SSE streaming
curl -N -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"message": "What is the capital of France?"}' \
http://localhost:18800/api/agent/run
# SSE events received:
# data: {"type":"text","content":"The capital of France is Paris."}
# data: {"type":"done","status":"completed"}
The SSE stream emits events with types: text, tool_use, tool_result, thinking, plan, status, error, and done.
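A client consumes such a stream line by line. The parser below is a minimal sketch that assumes each event arrives as a single `data:` line containing a JSON object, as in the example above; the SSE format in general also allows multi-line data and other fields, which this sketch ignores.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON events from 'data:' lines and stop after a 'done' event."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if not payload:
            continue
        event = json.loads(payload)
        yield event
        if event.get("type") == "done":
            return


def collect_text(events: Iterable[dict]) -> str:
    """Concatenate the content of all 'text' events."""
    return "".join(e.get("content", "") for e in events if e.get("type") == "text")
```

When reading from `urllib.request.urlopen`, the response yields bytes, so each line needs `.decode("utf-8")` before being passed to `iter_sse_events`.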
Platform and Settings
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/platform | Returns the platform identifier (darwin, windows, linux) |
| GET | /api/settings | Get all application settings as a JSON object |
| PUT | /api/settings | Save application settings |
| POST | /api/settings/test | Test LLM connection with current settings |
| POST | /api/email/test | Send a test email to verify email configuration |
| GET | /api/auth/check | Verify that the provided auth token is valid |
Listeners
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/listeners | List all listeners |
| POST | /api/listeners | Create a new listener |
| PUT | /api/listeners/{id} | Update a listener |
| DELETE | /api/listeners/{id} | Delete a listener |
| POST | /api/listeners/{id}/start | Start a listener |
| POST | /api/listeners/{id}/stop | Stop a listener |
| POST | /api/listeners/{id}/toggle | Toggle a listener enabled/disabled |
| GET | /api/listeners/{id}/logs | Get event logs for a listener |
| GET | /api/listeners/statuses | Get running/stopped status of all listeners |
| GET | /api/listeners/platforms | Get the list of supported listener platforms |
# Create a Slack listener
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "Slack Support Bot",
"platform": "slack",
"token": "xoxb-...",
"trigger_type": "mentions",
"prompt": "You are a helpful support agent. Answer the user question based on our documentation.",
"model": "claude-sonnet-4-20250514",
"enabled": true
}' \
http://localhost:18800/api/listeners
# Response
{
"id": "listener_abc123",
"name": "Slack Support Bot",
"platform": "slack",
"status": "stopped",
"created_at": "2026-05-21T10:00:00Z"
}
# Start the listener
curl -X POST -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/listeners/listener_abc123/start
# Get event logs
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/listeners/listener_abc123/logs
# Response
{
"events": [
{
"timestamp": "2026-05-21T10:05:00Z",
"sender": "alice",
"message": "@bot How do I reset my password?",
"response": "To reset your password, go to Settings > Account...",
"channel": "#support"
}
]
}
# Get all supported platforms
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/listeners/platforms
Schedules
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/schedules | List all scheduled tasks |
| POST | /api/schedules | Create a new scheduled task |
| PUT | /api/schedules/{id} | Update a scheduled task |
| DELETE | /api/schedules/{id} | Delete a scheduled task |
| POST | /api/schedules/{id}/toggle | Toggle a scheduled task enabled/disabled |
# Create a daily schedule
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "Daily Report",
"schedule_type": "daily",
"time": "09:00",
"message": "Generate a summary of yesterday git commits",
"duration_type": "forever"
}' \
http://localhost:18800/api/schedules
Webhooks
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/webhooks | List all webhooks |
| POST | /api/webhooks | Create a new webhook |
| PUT | /api/webhooks/{id} | Update a webhook |
| DELETE | /api/webhooks/{id} | Delete a webhook |
| POST | /api/webhooks/{id}/toggle | Toggle a webhook enabled/disabled |
| GET | /api/webhooks/{id}/url | Get the auto-generated URL for a webhook |
| GET | /api/webhooks/{id}/logs | Get invocation logs for a webhook |
# Create a webhook
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "GitHub PR Webhook",
"prompt": "A GitHub pull request event was received: {{payload}}. Review the PR changes and provide feedback.",
"secret": "my-hmac-secret",
"model": "claude-sonnet-4-20250514"
}' \
http://localhost:18800/api/webhooks
# Response
{
"id": "webhook_abc123",
"name": "GitHub PR Webhook",
"url": "http://localhost:18790/webhooks/webhook_abc123",
"enabled": true,
"created_at": "2026-05-21T10:00:00Z"
}
# Get the webhook URL
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/webhooks/webhook_abc123/url
# Test the webhook with cURL
curl -X POST http://localhost:18790/webhooks/webhook_abc123 \
-H "Content-Type: application/json" \
-H "X-Hub-Signature-256: sha256=..." \
-d '{"action":"opened","pull_request":{"title":"Add user auth","number":42}}'
Workflow API Examples
# Create a workflow
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "Code Review Pipeline",
"nodes": [
{
"id": "n1",
"type": "agent_task",
"prompt": "Analyze the code for bugs and security issues",
"model": "claude-sonnet-4-20250514"
},
{
"id": "n2",
"type": "human_loop",
"prompt": "Review the analysis and approve or reject"
},
{
"id": "n3",
"type": "agent_task",
"prompt": "Generate a fix for the identified issues"
}
],
"edges": [
{"from": "n1", "to": "n2", "port": "pass"},
{"from": "n2", "to": "n3", "port": "approve"}
]
}' \
http://localhost:18800/api/workflows
# Run a workflow
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"input": {"repo": "my-org/my-repo", "branch": "feature/auth"}}' \
http://localhost:18800/api/workflows/workflow_abc123/run
# Response
{
"run_id": "run_xyz789",
"workflow_id": "workflow_abc123",
"status": "running",
"started_at": "2026-05-21T10:00:00Z"
}
# Check workflow run status
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/workflow/runs/run_xyz789
# Approve a human loop step
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"action": "approve", "comment": "Looks good, proceed with the fix"}' \
http://localhost:18800/api/workflow/approvals/approval_abc/resolve
Skills and Role Skills
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/skills | List all installed skills |
| DELETE | /api/skills/{name} | Delete a skill by name |
| GET | /api/skills/{name}/content | Get the content/prompt of a skill |
| PUT | /api/skills/{name}/content | Update a skill's content |
| GET | /api/skill-settings | Get skill settings |
| PUT | /api/skill-settings | Update skill settings |
| GET | /api/role-skills | List all role skills |
| POST | /api/role-skills | Save (create or update) a role skill |
| GET | /api/role-skills/{name} | Get a specific role skill by name |
| DELETE | /api/role-skills/{name} | Delete a role skill |
MCP Servers
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/mcp/servers | List all configured MCP servers |
| POST | /api/mcp/servers | Save (create or update) an MCP server configuration |
| DELETE | /api/mcp/servers/{id} | Delete an MCP server configuration |
| POST | /api/mcp/servers/{id}/connect | Connect to an MCP server |
| POST | /api/mcp/servers/{id}/disconnect | Disconnect from an MCP server |
| GET | /api/mcp/statuses | Get connection status for all MCP servers |
| GET | /api/mcp/tools | List all tools from connected MCP servers |
# Add an MCP server (stdio transport)
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "File System MCP",
"transport": "stdio",
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/projects"],
"env": {},
"isolation": "shared"
}' \
http://localhost:18800/api/mcp/servers
# Add an MCP server (HTTP transport with OAuth)
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "Cloud MCP",
"transport": "http",
"url": "https://mcp.example.com/sse",
"oauth": {
"client_id": "my-client-id",
"client_secret": "my-client-secret"
},
"isolation": "per_task"
}' \
http://localhost:18800/api/mcp/servers
# Connect to an MCP server
curl -X POST -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/mcp/servers/mcp_abc123/connect
# List all available MCP tools
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/mcp/tools
# Response
{
"tools": [
{
"name": "mcp_filesystem_read_file",
"server": "File System MCP",
"description": "Read a file from the filesystem",
"input_schema": {
"type": "object",
"properties": {
"path": {"type": "string", "description": "File path to read"}
},
"required": ["path"]
}
}
]
}
# Get connection statuses
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/mcp/statuses
# Response
{
"statuses": {
"mcp_abc123": {"connected": true, "tool_count": 5},
"mcp_def456": {"connected": false, "error": "Connection refused"}
}
}
Plugins
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/plugins | List all installed plugins |
| POST | /api/plugins | Install or update a plugin |
| DELETE | /api/plugins/{id} | Uninstall a plugin |
| POST | /api/plugins/{id}/toggle | Enable or disable a plugin |
Workflows
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/workflows | List all automation workflows |
| POST | /api/workflows | Create a new workflow |
| GET | /api/workflows/{id} | Get a specific workflow |
| PUT | /api/workflows/{id} | Update a workflow |
| DELETE | /api/workflows/{id} | Delete a workflow |
| POST | /api/workflows/{id}/run | Run a workflow |
| GET | /api/workflow/runs/{run_id} | Get the status of a workflow run |
| POST | /api/workflow/approvals/{id}/resolve | Resolve a human approval request (approve or reject) |
Teams
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/teams | List all teams |
| POST | /api/teams | Create a new team |
| PUT | /api/teams/{id} | Update a team |
| DELETE | /api/teams/{id} | Delete a team |
| POST | /api/teams/run | Run a team task |
# Run a team task
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"team_id": "software-dev-team",
"message": "Build a REST API for user management with authentication"
}' \
http://localhost:18800/api/teams/run
Credentials
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/credentials | List all stored credentials (metadata only, values not returned) |
| POST | /api/credentials | Save a new credential (value is encrypted before storage) |
| GET | /api/credentials/{id} | Get a credential with decrypted value (biometric auth gated) |
| DELETE | /api/credentials/{id} | Delete a credential |
# Save a credential
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "Slack Bot Token",
"type": "token",
"value": "xoxb-1234567890-abcdefghij"
}' \
http://localhost:18800/api/credentials
# Response
{
"id": "cred_abc123",
"name": "Slack Bot Token",
"type": "token",
"created_at": "2026-05-21T10:00:00Z"
}
# List credentials (values are never included in list responses)
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/credentials
# Response
{
"credentials": [
{"id": "cred_abc123", "name": "Slack Bot Token", "type": "token", "created_at": "2026-05-21T10:00:00Z"},
{"id": "cred_def456", "name": "SSH Deploy Key", "type": "ssh_key", "created_at": "2026-05-20T14:00:00Z"}
]
}
# Get decrypted value (requires biometric auth on desktop)
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/credentials/cred_abc123
Cloud Storage
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/cloud/connectors | List cloud storage connectors |
| POST | /api/cloud/connectors | Save (create or update) a cloud connector |
| DELETE | /api/cloud/connectors/{id} | Delete a cloud connector |
# Save a cloud storage connector
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "Production S3",
"type": "s3",
"config": {
"bucket": "my-aura-files",
"region": "us-east-1",
"access_key": "AKIAIOSFODNN7EXAMPLE",
"secret_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
}
}' \
http://localhost:18800/api/cloud/connectors
# List connectors
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/cloud/connectors
# Response
{
"connectors": [
{
"id": "cloud_abc123",
"name": "Production S3",
"type": "s3",
"status": "connected"
}
]
}
Aura AI and Models
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/aura/status | Get the current status of the local inference server |
| GET | /api/aura/models/hf | Scan for HuggingFace-cached models |
| GET | /api/aura/models/gguf | Scan for local GGUF model files |
| GET | /api/aura/models/curated | Get the curated model list (recommended downloads) |
| POST | /api/aura/models/download | Start downloading a model from HuggingFace |
| POST | /api/aura/models/download/cancel | Cancel an in-progress model download |
| POST | /api/aura/models/delete | Delete a downloaded local model |
| GET | /api/aura/lan-nodes | List discovered LAN inference nodes |
# Check inference server status
curl -H "Authorization: Bearer $TOKEN" http://localhost:18800/api/aura/status
# Response
{
"running": true,
"model": "qwen3-8b-q4_k_m.gguf",
"port": 8080,
"gpu_layers": -1
}
Inference Cluster
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/cluster/status | Get cluster status (loaded model, workers, running state) |
| GET | /api/cluster/workers | List discovered and claimed workers |
| POST | /api/cluster/workers/{id}/add | Claim a discovered worker (master-side) |
| POST | /api/cluster/workers/{id}/remove | Disconnect a claimed worker (master-side) |
| POST | /api/cluster/inference/start | Start distributed inference with full parameters |
| POST | /api/cluster/inference/stop | Stop distributed inference |
| POST | /api/cluster/join | Accept a master's claim request (worker-side) |
| POST | /api/cluster/leave | Leave the cluster (worker-side) |
| GET | /api/cluster/worker/status | Report worker status (worker-side) |
Billing and Spend
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/billing/summary | Get usage summary grouped by provider |
| GET | /api/billing/limits | Get spend limits for all providers |
| POST | /api/billing/limits | Save a spend limit for a provider |
| GET | /api/billing/fallback-order | Get the provider fallback order |
| POST | /api/billing/fallback-order | Save the provider fallback order |
| GET | /api/billing/pricing | Get model pricing table |
| POST | /api/billing/pricing | Save model pricing entries |
| POST | /api/billing/reset | Reset all usage tracking data |
| GET | /api/billing/daily | Get daily usage statistics |
| GET | /api/billing/daily-by-model | Get daily usage broken down by model |
| GET | /api/routing/stats | Get smart routing statistics (tasks per tier, cost savings) |
# Get billing summary
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/billing/summary
# Response
{
"today": {
"total_cost": 2.45,
"total_input_tokens": 125000,
"total_output_tokens": 45000
},
"month": {
"total_cost": 38.72,
"by_provider": {
"anthropic": {"cost": 28.50, "input_tokens": 1200000, "output_tokens": 450000},
"openai": {"cost": 8.22, "input_tokens": 500000, "output_tokens": 200000},
"groq": {"cost": 2.00, "input_tokens": 800000, "output_tokens": 300000}
}
}
}
# Set a spend limit
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"provider": "anthropic", "monthly_limit": 50.00, "alert_threshold": 40.00}' \
http://localhost:18800/api/billing/limits
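The relationship between month-to-date spend, alert_threshold, and monthly_limit can be illustrated with a small client-side check. This is a minimal sketch, not the server's logic: the function name spend_status and the three status labels are hypothetical, and it assumes a provider is flagged once spend reaches (rather than exceeds) each value. The costs are copied from the sample summary response above.

```python
def spend_status(spent: float, monthly_limit: float, alert_threshold: float) -> str:
    """Classify month-to-date spend against a limit and an alert threshold."""
    if spent >= monthly_limit:
        return "over_limit"
    if spent >= alert_threshold:
        return "alert"
    return "ok"


# Month-to-date costs copied from the sample /api/billing/summary response.
by_provider = {"anthropic": 28.50, "openai": 8.22, "groq": 2.00}

# Limits as posted to /api/billing/limits in the example above.
limits = {"anthropic": {"monthly_limit": 50.00, "alert_threshold": 40.00}}

for provider, cost in by_provider.items():
    limit = limits.get(provider)
    # Providers without a saved limit are reported separately (assumed behavior).
    status = spend_status(cost, **limit) if limit else "no_limit"
    print(f"{provider}: {cost:.2f} -> {status}")
```

A script like this could run on a schedule against the summary endpoint to surface providers that are approaching their cap.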
# Set provider fallback order
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"order": ["anthropic", "openai", "groq", "deepseek"]}' \
http://localhost:18800/api/billing/fallback-order
# Get daily usage
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/billing/daily
# Response
{
"daily": [
{"date": "2026-05-21", "cost": 2.45, "requests": 15},
{"date": "2026-05-20", "cost": 3.12, "requests": 22},
{"date": "2026-05-19", "cost": 1.87, "requests": 10}
]
}
# Get smart routing statistics
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/routing/stats
# Response
{
"total_tasks": 150,
"by_tier": {
"simple": {"count": 60, "actual_cost": 1.20, "hypothetical_cost": 12.00},
"standard": {"count": 50, "actual_cost": 8.50, "hypothetical_cost": 15.00},
"complex": {"count": 30, "actual_cost": 15.00, "hypothetical_cost": 18.00},
"reasoning": {"count": 10, "actual_cost": 8.00, "hypothetical_cost": 8.00}
},
"total_savings": 20.30
}
Voice
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/voice/voices | List available TTS voices for the current provider |
| POST | /api/voice/transcribe | Transcribe an audio file to text (multipart form data) |
| POST | /api/voice/save-temp | Save a temporary audio file from a recording |
| POST | /api/voice/speak | Convert text to speech and return audio data |
# Transcribe audio
curl -X POST -H "Authorization: Bearer $TOKEN" \
-F "<field>=@<audio-file>" \
http://localhost:18800/api/voice/transcribe
# Response
{"text": "Create a REST API for managing user accounts"}
Files
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/files | Download a file by path (query parameter: path) |
| POST | /api/files/upload | Upload a file (multipart form data) |
| GET | /api/files/list | List files in a directory |
Data Management
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/data/clear-history | Clear all conversation and task history |
| POST | /api/data/reset-keys | Reset all stored API keys |
| POST | /api/data/reset-database | Reset the database to factory defaults |
| POST | /api/data/clear-model-cache | Clear cached model metadata |
| POST | /api/data/reset-all | Reset all application data |
Memory
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/memories | List all saved agent memories |
| POST | /api/memories | Save a new memory entry |
| DELETE | /api/memories/{name} | Delete a specific memory entry |
# List all memories
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/memories
# Response
{
"memories": [
{
"name": "user-preferences.md",
"type": "user",
"content": "# User Preferences\n- Prefers Python over JavaScript\n- Uses FastAPI for APIs\n- Follows conventional commits",
"scope": "global",
"updated_at": "2026-05-21T09:00:00Z"
},
{
"name": "project-context.md",
"type": "project",
"content": "# Project Context\n- Uses PostgreSQL on port 5432\n- Test framework: pytest",
"scope": "project",
"project_path": "/home/user/myproject",
"updated_at": "2026-05-20T15:00:00Z"
}
]
}
# Save a new memory
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "team-conventions.md",
"type": "reference",
"content": "# Team Conventions\n- All PRs require two approvals\n- Use squash merge only"
}' \
http://localhost:18800/api/memories
# Delete a memory
curl -X DELETE -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/memories/team-conventions.md
Projects
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/projects | List all projects |
| POST | /api/projects | Create a new project |
| GET | /api/projects/{id} | Get a specific project |
| PUT | /api/projects/{id} | Update a project |
| DELETE | /api/projects/{id} | Delete a project |
| GET | /api/projects/{id}/tasks | List tasks associated with a project |
| GET | /api/projects/{id}/memory | Get project-scoped memory facts |
# Create a project
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "My Web App",
"path": "/home/user/projects/webapp"
}' \
http://localhost:18800/api/projects
# Response
{
"id": "proj_abc123",
"name": "My Web App",
"path": "/home/user/projects/webapp",
"task_count": 0,
"created_at": "2026-05-21T10:00:00Z"
}
# List tasks for a project
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/projects/proj_abc123/tasks
# Get project-scoped memory facts
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/projects/proj_abc123/memory
# Response
{
"facts": [
{
"category": "knowledge",
"content": "Project uses PostgreSQL on port 5432",
"confidence": 0.85,
"created_at": "2026-05-20T14:00:00Z"
},
{
"category": "file_operation",
"content": "File modified: src/models/user.py - added email validation",
"confidence": 0.7,
"created_at": "2026-05-21T09:30:00Z"
}
]
}
Viewer and Remote Deployment
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/viewer/status | Get deployment status including daemon_mode and uptime |
| GET | /api/viewer/items | Get deployed items (schedules, listeners, webhooks) |
| GET | /api/viewer/tasks | Get recent tasks on the remote deployment |
| GET | /api/viewer/tasks/{id}/messages | Get task messages for a remote task |
| GET | /api/viewer/events | SSE event stream for real-time remote monitoring |
| GET | /api/remote-deployments | List all remote agent deployments |
# Get remote deployment status
curl -H "Authorization: Bearer $TOKEN" \
http://remote-host:18800/api/viewer/status
# Response
{
"daemon_mode": "full",
"uptime_seconds": 86400,
"version": "1.30.1",
"platform": "linux",
"active_tasks": 2,
"active_listeners": 5,
"active_schedules": 3
}
# List deployed items on a remote agent
curl -H "Authorization: Bearer $TOKEN" \
http://remote-host:18800/api/viewer/items
# Response
{
"schedules": [
{"id": "sched_001", "name": "Daily Report", "enabled": true, "next_run": "2026-05-22T09:00:00Z"}
],
"listeners": [
{"id": "listen_001", "name": "Slack Bot", "platform": "slack", "status": "online"}
],
"webhooks": [
{"id": "hook_001", "name": "GitHub Hook", "enabled": true}
]
}
# List all remote deployments from the master
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/remote-deployments
# Response
{
"deployments": [
{
"id": "deploy_abc123",
"hostname": "gpu-server-1.local",
"mode": "full",
"status": "online",
"last_heartbeat": "2026-05-21T10:29:30Z",
"uptime_seconds": 172800
},
{
"id": "deploy_def456",
"hostname": "inference-node-2.local",
"mode": "inference-worker",
"status": "online",
"last_heartbeat": "2026-05-21T10:29:45Z",
"uptime_seconds": 86400
}
]
}
Slash Commands
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/slash-commands | List all custom slash commands |
| POST | /api/slash-commands | Create a new slash command |
| GET | /api/slash-commands/{id} | Get a specific slash command |
| PUT | /api/slash-commands/{id} | Update a slash command |
| DELETE | /api/slash-commands/{id} | Delete a slash command |
Custom Providers
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/providers/custom | List custom providers |
| POST | /api/providers/custom | Save (create or update) a custom provider |
| DELETE | /api/providers/custom/{id} | Delete a custom provider |
# Add a custom provider
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "My vLLM Server",
"base_url": "https://my-vllm.example.com/v1",
"api_key": "my-api-key",
"api_format": "openai",
"auth_type": "bearer"
}' \
http://localhost:18800/api/providers/custom
# List custom providers
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/providers/custom
# Create a slash command
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "summarize",
"description": "Summarize the current conversation or document",
"prompt": "Summarize the following content in 3-5 bullet points: {{input}}",
"handler_type": "agent_task"
}' \
http://localhost:18800/api/slash-commands
# List slash commands
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:18800/api/slash-commands
# Response
{
"commands": [
{
"id": "cmd_abc123",
"name": "summarize",
"description": "Summarize the current conversation or document",
"handler_type": "agent_task",
"enabled": true
}
]
}
Other Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/environment | Get system environment information (OS, architecture, memory, etc.) |
| GET | /api/diagnostics | Run system diagnostics and return results |
| GET | /api/web-server/status | Get the web server's current status and configuration |
| GET | /api/health | Health check (always public, returns {"status":"ok"}) |
| POST | /api/heartbeat/incoming | Receive heartbeat from remote deployment nodes (always public) |
| GET | /api/image-proxy | Proxy external images (for PDF export and CORS bypass) |
| GET | /api/dependencies/check | Check status of system dependencies (Docker, Node.js, Python, etc.) |
| GET | /api/design-systems | List available design system tokens and themes |
| GET | /charts/{filename} | Serve generated chart image files |
WebSocket Endpoints
| Path | Description |
|---|---|
| /ws/browser | Chrome extension browser automation commands and responses |
| /ws/sidepanel | Chrome extension side panel chat with real-time streaming |
API Compatibility Layer
The aura-inference local inference server exposes multiple compatibility APIs, allowing it to act as a drop-in replacement for popular inference services.
OpenAI-Compatible API
| Method | Path | Description |
|---|---|---|
| POST | /v1/chat/completions | Chat completion (streaming and non-streaming) |
| POST | /v1/completions | Text completion |
| POST | /v1/embeddings | Text embeddings |
| GET | /v1/models | List available models |
# Chat completion (OpenAI format)
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "local-model",
"messages": [{"role": "user", "content": "Hello!"}],
"stream": true
}'
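With "stream": true, OpenAI-style servers return the reply as server-sent events, one "data:" line per chunk, ending with "data: [DONE]". The sketch below shows how a client might reassemble the text from such lines. It assumes aura-inference follows the OpenAI chunk layout (choices[0].delta.content); the function name collect_stream_text and the sample lines are illustrative only.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Join the content deltas from OpenAI-style streaming chunks."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            # The first chunk often carries only the role, with no content.
            parts.append(delta.get("content") or "")
    return "".join(parts)


# Hand-written sample lines in the OpenAI streaming format.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": " there"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))
```

In practice the lines would come from the HTTP response body read line by line rather than from a list.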
Anthropic-Compatible API
| Method | Path | Description |
|---|---|---|
| POST | /v1/messages | Anthropic Messages API format |
Ollama-Compatible API
| Method | Path | Description |
|---|---|---|
| POST | /api/chat | Ollama chat format |
| POST | /api/generate | Ollama generate format |
| GET | /api/tags | List available models (Ollama format) |
| GET | /api/ps | List running models |
| POST | /api/show | Show model information |
API Gateway
Aura Workshop also exposes an API gateway that allows external tools to use it as a proxy for LLM requests. Point any OpenAI or Anthropic SDK at your Aura Workshop instance and it will route requests to whatever provider and model you have configured, applying spend limits, fallback, and logging automatically. This is particularly useful for:
- Centralizing LLM access through a single endpoint for your team
- Applying spend limits and usage tracking to all requests
- Using local models with tools that only support OpenAI/Anthropic APIs
- Testing different models by changing the server-side configuration without modifying client code
| Method | Path | Description |
|---|---|---|
| POST | /v1/chat/completions | OpenAI-compatible chat completion (proxied to configured provider) |
| POST | /v1/messages | Anthropic-compatible messages (proxied to configured provider) |
| GET | /v1/models | List available models from all configured providers |
License
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/license/validate | Validate and activate a license key |
| GET | /api/license/status | Get the current license status and tier |
aura-cli (Go TUI)
Aura Workshop includes a standalone Go binary CLI (aura-cli v1.0.0) that provides a terminal user interface built with Bubbletea. The CLI connects to an Aura Workshop server instance for inference.
Usage
aura-cli [flags]
Flags
| Flag | Long Form | Description | Default |
|---|---|---|---|
| -s | --server | Server URL to connect to | http://localhost:18800 |
| -m | --model | Model to use for inference | (server default) |
| -t | --token | Authentication token for the server | (none) |
| | --password | Password for authentication | (none) |
| -p | --project | Project path to attach as the working directory | (current directory) |
| | --chat | Chat mode: no tool access, conversation only | (agent mode) |
| -v | --verbose | Verbose output for debugging | off |
| | --version | Print version information and exit | |
Modes
- Agent mode (default): Full tool access. The agent can read/write files, run commands, search the web, and use all configured tools.
- Chat mode (--chat): Conversational only. No tools are available. Ideal for quick Q&A without giving the agent access to your system.
Pipe Mode
The CLI supports piping input via stdin for non-interactive use. This is useful for scripting and CI/CD integration:
# Pipe a prompt directly
echo "Explain this error" | aura-cli -s http://my-server:18800 -t my-token
# Pipe a file as context
cat error.log | aura-cli -m claude-sonnet-4-20250514 -t my-token
# Use in shell scripts
git diff HEAD~1 | aura-cli --chat -m claude-sonnet-4-20250514 -t my-token \
-s http://my-server:18800
In pipe mode, the CLI reads stdin until EOF, sends the content as the user message, and outputs the agent's response to stdout. This makes it easy to integrate Aura Workshop into existing command-line workflows and automation scripts.
Configuration File
Default settings can be stored in ~/.aura/config.json:
{
"server": "http://localhost:18800",
"token": "my-auth-token",
"model": "claude-sonnet-4-20250514"
}
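The interaction between the configuration file and command-line flags can be pictured as a layered lookup. The sketch below is an assumption-heavy illustration: the source does not state the precedence, so it assumes explicit flags override the file, which in turn overrides built-in defaults. The helper names read_config_text and resolve_settings are hypothetical and are not part of aura-cli.

```python
import json
from pathlib import Path
from typing import Optional

# Built-in defaults, taken from the Flags table.
DEFAULTS = {"server": "http://localhost:18800", "token": None, "model": None}


def read_config_text(path: Optional[Path] = None) -> Optional[str]:
    """Return the config file contents, or None if the file is absent."""
    path = path or Path.home() / ".aura" / "config.json"
    return path.read_text(encoding="utf-8") if path.is_file() else None


def resolve_settings(config_text: Optional[str], flags: dict) -> dict:
    """Layer settings: defaults, then config file, then explicit flags (assumed order)."""
    settings = dict(DEFAULTS)
    if config_text:
        settings.update(json.loads(config_text))
    # Only flags the user actually passed (non-None) override earlier layers.
    settings.update({key: value for key, value in flags.items() if value is not None})
    return settings


config_text = (
    '{"server": "http://localhost:18800", "token": "my-auth-token", '
    '"model": "claude-sonnet-4-20250514"}'
)
print(resolve_settings(config_text, {"server": "http://my-server:18800", "model": None}))
```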
Examples
# Connect to a remote server with auth
aura-cli -s http://my-server:18800 -t my-secret-token
# Use a specific model in chat mode
aura-cli -m claude-sonnet-4-20250514 --chat
# Attach a project directory
aura-cli -p /path/to/my/project
aura-inference CLI
The aura-inference binary can be used standalone from the command line for local model serving.
Serve Mode
aura-inference serve --model <path-to-model.gguf> --port 8080 [options]
| Option | Description | Default |
|---|---|---|
| --model | Path to the GGUF model file | (required) |
| --port | HTTP server port | 8080 |
| --gpu-layers | Number of layers to offload to GPU (-1 = all) | -1 |
| --quantization | Quantization type (Q4_K_M, Q5_K_M, Q8_0, F16) | Q4_K_M |
| --ctx-size | Context window size in tokens | 4096 |
| --batch-size | Inference batch size | 512 |
| --flash-attn | Enable flash attention | disabled |
| --cache-type-k | KV cache key type (q8_0, f16) | q8_0 |
| --cache-type-v | KV cache value type (q8_0, f16) | q8_0 |
| --thinking | Enable thinking/reasoning mode | disabled |
| --rpc-workers | Comma-separated list of RPC workers (host:port) | (none) |
aura-inference serve \
--model ~/.cache/aura-inference/models/qwen3-8b-q4_k_m.gguf \
--port 8080 \
--gpu-layers -1 \
--ctx-size 8192 \
--batch-size 1024 \
--flash-attn \
--thinking
RPC Worker Mode
aura-inference rpc --host 0.0.0.0 --port 50052
Starts the binary in RPC worker mode, contributing GPU resources to a master node. The master specifies workers via the --rpc-workers flag with a comma-separated list of host:port pairs:
# Master with two RPC workers
aura-inference serve \
--model ~/.cache/aura-inference/models/llama-3.3-70b-q4_k_m.gguf \
--port 8080 \
--gpu-layers -1 \
--rpc-workers worker1.local:50052,worker2.local:50052
Each worker contributes its GPU VRAM to the inference cluster. The master distributes model layers proportionally across all available GPUs (local plus remote workers). This enables running models that are too large for any single GPU.
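To make "proportionally" concrete, the following sketch splits a layer count across GPUs in proportion to their VRAM, using largest-remainder rounding so the parts always sum to the total. It illustrates the idea only and is not the scheduler aura-inference uses; the function name split_layers, the node names, the VRAM sizes, and the layer count of 80 are all hypothetical.

```python
def split_layers(total_layers: int, vram_gb: dict[str, float]) -> dict[str, int]:
    """Assign whole layers to nodes in proportion to each node's VRAM."""
    total_vram = sum(vram_gb.values())
    if total_layers < 0 or total_vram <= 0:
        raise ValueError("need a non-negative layer count and positive total VRAM")

    # Ideal fractional share per node, then round down.
    exact = {name: total_layers * vram / total_vram for name, vram in vram_gb.items()}
    alloc = {name: int(share) for name, share in exact.items()}

    # Hand the layers lost to rounding to the nodes with the largest remainders.
    leftover = total_layers - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - alloc[name], reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


# Hypothetical cluster: one master GPU and two remote workers.
print(split_layers(80, {"master": 24, "worker1": 24, "worker2": 16}))
```

With these illustrative numbers the two 24 GB nodes each take 30 layers and the 16 GB node takes 20.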
aura-daemon CLI
The daemon binary is used inside Docker containers for headless operation.
aura-daemon --mode <mode>
| Mode | Description |
|---|---|
| full | Complete daemon with all features enabled |
| inference-master | Inference cluster master node |
| inference-worker | Inference cluster worker node |
| worker | Task execution worker |
The daemon reads all its configuration from environment variables. See Docker Daemon for the full environment variable reference.
Database Schema
Aura Workshop uses a single SQLite database in WAL (Write-Ahead Logging) mode with 35+ tables. The schema is created and migrated automatically on launch.
| Table | Purpose |
|---|---|
| settings | Application configuration stored as key-value pairs |
| conversations | Chat conversation metadata (title, timestamps) |
| messages | Individual messages within conversations (role, content, timestamps) |
| tasks | Agent task metadata: status, model, project path, role, timestamps, classification |
| task_messages | Messages within agent tasks including tool calls and results |
| task_memory | Role handoff data, compaction summaries, and checkpoint state |
| memory_facts | LLM-extracted structured facts with category, confidence, scope, and content |
| scheduled_tasks | Schedule definitions: type, time, cron expression, duration, enabled state |
| listeners | Listener configurations: platform, token, trigger type, rules, prompt |
| listener_events | Event logs for listener activity: timestamp, message, user, response |
| webhooks | Webhook endpoint configurations: name, secret, prompt, model |
| webhook_logs | Invocation logs for webhooks: timestamp, request body, response |
| workflows | DAG workflow definitions: nodes, edges, metadata |
| workflow_runs | Workflow execution state: per-step status, timestamps, data context |
| mcp_servers | MCP server configurations: name, transport, command/URL, isolation mode |
| credentials | AES-256-GCM encrypted credential values with name and type metadata |
| credential_pool | Shared credential pools for team-level credential sharing |
| custom_providers | User-defined LLM provider endpoints: name, URL, API format, auth type |
| spend_limits | Per-provider monthly spend caps |
| token_usage | Per-request token usage and cost tracking: model, provider, input/output tokens, cost |
| cloud_connectors | AWS S3, GCS, Azure Blob connector configurations |
| teams | Multi-agent team definitions: name, roles, workflow type |
| skill_settings | Per-skill configuration overrides |
| projects | Project definitions: name, path, metadata |
| plugins | Plugin configurations: name, version, enabled state, settings |
| slash_commands | Custom slash command definitions: name, description, prompt, handler type |
| design_systems | Design system token definitions for design skills |
| remote_deployments | Remote agent deployment records: host, mode, status, pairing code |
| task_files | Files created or modified by agent tasks: path, operation, task ID |
| routing_logs | Smart routing decision logs: task ID, tier, model, score |
| provider_health | Provider health tracking: latency, error count, circuit breaker state |
| ollama_models | Cached Ollama model metadata |
| aura_models | Local GGUF model metadata from scanning |
| model_pricing | Input/output pricing per million tokens for each model |
| fallback_order | Provider fallback priority sequence |
Network Ports Reference
| Port | Protocol | Service | Notes |
|---|---|---|---|
| 18800 | TCP | Web UI + REST API | Main application port. Serves the SolidJS frontend and all REST API endpoints. |
| 18801 | UDP | LAN inference discovery | Broadcast/listen for inference cluster nodes. 30-second interval, 90-second expiry. |
| 18790 | TCP | Webhook receiver | Incoming webhook payloads from external services. |
| 8080 | TCP | Aura AI local inference | Local GGUF model inference server. Configurable port. |
| 11434 | TCP | Ollama | Default Ollama server port. Auto-detected by Aura Workshop. |
| 50052 | TCP | RPC (distributed inference) | Worker-to-master RPC for distributed model layer inference. |
| 1420 | TCP | Vite dev server | Development only. Used when running the frontend in development mode. |
Data Storage Locations
| Data | Location |
|---|---|
| Database (macOS) | ~/Library/Application Support/aura-workshop/aura-workshop.db |
| Database (Linux) | ~/.local/share/aura-workshop/aura-workshop.db |
| Database (Windows) | %APPDATA%\aura-workshop\aura-workshop.db |
| Database (Docker) | /data/aura-workshop.db |
| User memory | ~/.aura/memory/ |
| Custom roles | ~/.aura/roles/ |
| GGUF models | ~/.cache/aura-inference/models/ |
| HuggingFace cache | ~/.cache/huggingface/hub/ |
| Project memory | .aura/memory/ (relative to project root) |
| Project instructions | AURA.md or CLAUDE.md (in project root) |
| CLI config | ~/.aura/config.json |
| Skills | Bundled in app resources; custom skills in the database |
License Tiers
| Tier | Features |
|---|---|
| Community | Core features: single-agent tasks, basic automation (schedules, listeners, webhooks), local inference via Aura AI and Ollama, all built-in tools, all skills, web UI access. |
| Professional | Everything in Community plus: multi-agent teams, advanced automation rules, all 31 listener platforms, smart routing, spend tracking and limits. |
| Business | Everything in Professional plus: visual DAG workflow editor, remote agent deployment, inference cluster management, advanced merge strategies. |
| Enterprise | Everything in Business plus: credential pools, priority support, custom integrations, SLA guarantees. |
The application works fully in Community mode without a license key. Enter a license key in Settings > Security to unlock higher tiers.
License Activation
To activate a license:
- Go to Settings > Security.
- Enter your license key in the License Key field.
- Click Activate. The system validates the key against the license server.
- On successful validation, the license tier and expiration date are displayed.
- Features for the new tier become available immediately without restart.
License validation can also be performed via the REST API: POST /api/license/validate with the key in the request body. Check current license status at any time with GET /api/license/status.
Feature Comparison
| Feature | Community | Professional | Business | Enterprise |
|---|---|---|---|---|
| Single-agent tasks | Yes | Yes | Yes | Yes |
| Built-in tools (50+) | Yes | Yes | Yes | Yes |
| All skills (100+) | Yes | Yes | Yes | Yes |
| Local inference (Aura AI) | Yes | Yes | Yes | Yes |
| Web UI access | Yes | Yes | Yes | Yes |
| Basic automation | Yes | Yes | Yes | Yes |
| Multi-agent teams | -- | Yes | Yes | Yes |
| Smart routing | -- | Yes | Yes | Yes |
| All 31 listener platforms | -- | Yes | Yes | Yes |
| Spend tracking & limits | -- | Yes | Yes | Yes |
| Visual DAG workflows | -- | -- | Yes | Yes |
| Remote deployment | -- | -- | Yes | Yes |
| Inference cluster | -- | -- | Yes | Yes |
| Credential pools | -- | -- | -- | Yes |
| Priority support | -- | -- | -- | Yes |
Troubleshooting & FAQ
macOS quarantine prevents launch
After installing from DMG, macOS may quarantine the application. You will see a message like "Aura Workshop cannot be opened because it is from an unidentified developer." Open Terminal and run:
xattr -cr /Applications/Aura\ Workshop.app
Then launch the app again from Applications or Spotlight.
No models appear in the model selector
- Verify you have configured at least one provider with a valid API key on the Models page.
- If using Ollama, ensure it is running (ollama serve) on localhost:11434.
- If using Aura AI, click Scan to detect models and then click Launch to start the inference server.
- Check that the provider's API key has not expired or been revoked.
Task stuck in "Executing" state
- Click the Stop button to cancel the running task.
- Check the model configuration: the selected model may be unavailable or the API key may be invalid.
- If stop does not respond immediately, the task may be waiting on a long-running tool operation (browser actions time out at 45 seconds, team/workflow tasks at 30 minutes).
- Resume the task from the Dashboard after fixing the underlying issue.
Web UI not accessible from another machine
- Ensure the web server is enabled in Settings > Connectivity.
- Check that port 18800 (or your configured port) is not blocked by a firewall.
- If using authentication, include the Bearer token in your requests.
- Verify the server is running: curl http://localhost:18800/api/health should return {"status":"ok"}.
- Ensure the machine's network allows incoming connections on the configured port.
Context window errors or truncation
- Context compression triggers automatically at 70% of the model's context window. For very long tasks, the system compresses older messages into a summary.
- Try using a model with a larger context window (such as Gemini 2.0 with 1M tokens).
- Break complex tasks into smaller, focused sub-tasks to reduce context accumulation.
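The 70% trigger is easy to work out for a given model. The sketch below shows the arithmetic for a 4096-token window (the default --ctx-size); it assumes compression starts once usage reaches the threshold, and the function name should_compress is hypothetical.

```python
COMPRESSION_THRESHOLD = 0.70  # fraction of the context window, per the text above


def should_compress(used_tokens: int, context_window: int,
                    threshold: float = COMPRESSION_THRESHOLD) -> bool:
    """Return True once token usage reaches the threshold share of the window."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    return used_tokens >= threshold * context_window


# For a 4096-token window the trigger point is 0.70 * 4096 = 2867.2 tokens.
for used in (2000, 2867, 2868, 4000):
    print(used, should_compress(used, 4096))
```

The same calculation explains why a larger window helps: at 8192 tokens the trigger moves to roughly 5734 tokens.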
Docker mode not available
- Docker must be installed and running. Aura Workshop auto-detects Docker on startup.
- If Docker is not detected, the app defaults to native mode automatically.
- You can manually toggle native mode in Settings > General.
Local inference not starting
- Ensure you have downloaded at least one GGUF model via the HuggingFace Downloader.
- Check that the model file exists in ~/.cache/aura-inference/models/.
- Verify the GPU layer count is appropriate for your hardware. If your GPU has insufficient VRAM, reduce the number of GPU layers.
- On macOS, Metal acceleration is used automatically. On Linux, CUDA requires an NVIDIA GPU with current drivers.
- Check that port 8080 (or your configured port) is not already in use by another application.
Inference cluster workers not discovered
- Ensure all machines are on the same LAN subnet.
- Verify UDP port 18801 is not blocked by any firewall on either the master or worker machines.
- Docker deployments must use --net=host for UDP broadcast to function correctly.
- Workers expire after 90 seconds without broadcasting. Check that the worker process is still running.
- Confirm that lan_discovery_enabled is set to true in settings on both master and worker.
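The 90-second expiry combined with the 30-second broadcast interval means a worker drops out after roughly three missed broadcasts. A minimal sketch of that bookkeeping follows; it assumes the master records a last-seen timestamp per worker, and the function name live_workers, the host names, and the timestamps are hypothetical.

```python
EXPIRY_SECONDS = 90  # workers expire after 90 seconds without broadcasting


def live_workers(last_seen: dict[str, float], now: float,
                 expiry: float = EXPIRY_SECONDS) -> list[str]:
    """Return the workers whose most recent broadcast is within the expiry window."""
    return sorted(name for name, seen in last_seen.items() if now - seen <= expiry)


# Timestamps in seconds: worker3 last broadcast 130 seconds before "now".
seen = {"worker1.local": 1000.0, "worker2.local": 940.0, "worker3.local": 900.0}
print(live_workers(seen, now=1030.0))
```

If a worker disappears from the list intermittently, occasional dropped UDP packets are a more likely cause than a stopped process.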
MCP server connection failures
- For stdio transport: verify the command path is correct, the binary exists, and it is executable.
- For HTTP transport: verify the URL is reachable with a curl test.
- Check MCP server logs for startup errors or version incompatibilities.
- Some MCP servers require specific Node.js or Python versions. Ensure the correct runtime is installed.
Agent blocked from running a command
- The safety guardrails block dangerous commands automatically. The error message explains why the command was blocked.
- The elevatedBash setting allows sudo commands but does not bypass the safety guardrails.
- If you believe a command was blocked incorrectly, check whether it matches any of the blocked patterns (rm -rf /, fork bombs, mkfs, dd on block devices).
Provider rate limit errors (HTTP 429)
- Configure the provider fallback order in Settings > Billing so the system can automatically switch to another provider.
- Set spend limits to control costs and prevent unexpected charges.
- Some providers with free tiers (Groq, z.ai) have strict rate limits. Consider using them as secondary fallback providers rather than primary.
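The effect of a fallback order can be sketched as a loop that tries each provider in turn and moves on when one reports a rate limit. This is an illustration of the pattern only, not Aura Workshop's routing code: RateLimited, call_with_fallback, and fake_send are hypothetical names, and fake_send stands in for a real provider request.

```python
class RateLimited(Exception):
    """Stand-in for a provider responding with HTTP 429."""


def call_with_fallback(order, send):
    """Try providers in order; return (provider, result) from the first that succeeds."""
    errors = {}
    for provider in order:
        try:
            return provider, send(provider)
        except RateLimited as exc:
            errors[provider] = str(exc)  # remember why this provider was skipped
    raise RuntimeError(f"all providers rate limited: {errors}")


def fake_send(provider):
    # Simulate the first two providers being rate limited.
    if provider in {"anthropic", "openai"}:
        raise RateLimited("429 Too Many Requests")
    return f"response from {provider}"


print(call_with_fallback(["anthropic", "openai", "groq", "deepseek"], fake_send))
```

Placing strictly rate-limited free-tier providers late in the order, as suggested above, keeps them from being the first stop for every request.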
Database corruption or issues
- Use Settings > Data > Diagnostics to check database health and integrity.
- The database uses WAL mode, which is resilient to most crash scenarios.
- As a last resort, use Settings > Data > Reset Database to start fresh (this deletes all data permanently).
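Before resorting to a reset, the database file can also be inspected directly with SQLite's own checks. The sketch below uses Python's standard sqlite3 module to read the journal mode and run PRAGMA integrity_check. It is an independent check, not what the Diagnostics screen runs, and the function name check_database is hypothetical. It is safest to run it against a copy of the database, or while the application is closed. The demo builds a throwaway database rather than touching real data.

```python
import os
import sqlite3
import tempfile


def check_database(path: str) -> dict:
    """Report the journal mode and integrity-check result for a SQLite file."""
    conn = sqlite3.connect(path)
    try:
        journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
        integrity = [row[0] for row in conn.execute("PRAGMA integrity_check")]
    finally:
        conn.close()
    # A healthy database yields the single row "ok".
    return {"journal_mode": journal_mode, "ok": integrity == ["ok"], "details": integrity}


# Demo on a temporary WAL-mode database.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.db")
    conn = sqlite3.connect(demo)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("CREATE TABLE settings (key TEXT PRIMARY KEY, value TEXT)")
    conn.commit()
    conn.close()
    print(check_database(demo))
```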
How do I use Aura Workshop without any internet connection?
Download one or more GGUF models via the HuggingFace Downloader while you have internet access. Then use the Aura AI local inference engine to run those models entirely on your hardware. No internet connection is required for local inference, and all data stays on your machine.
Listener not receiving messages from a platform
- Verify the authentication token is correct and has not expired.
- For Slack: ensure the bot has been invited to the channel it is monitoring.
- For Discord: verify the bot has the necessary permissions (Read Messages, Send Messages) in the target server and channel.
- For Telegram: confirm the bot token is valid using the Telegram Bot API (https://api.telegram.org/bot<token>/getMe).
- For WhatsApp: ensure you have completed the Meta Business verification and the WhatsApp Business API is properly configured.
- Check the listener's trigger rules: if set to "mentions" mode, messages without the bot mention will be ignored.
- View the listener's event logs for errors: go to Listeners > View Logs on the specific listener.
Webhook not triggering tasks
- Verify the webhook URL is correct and accessible from the sending service. Test with cURL.
- If using HMAC-SHA256 validation, ensure the sending service is computing the signature correctly against the shared secret.
- Check that webhooks are served on port 18790 (separate from the main web UI port 18800).
- View webhook invocation logs for errors: go to Webhooks > View Logs on the specific webhook.
- Ensure the webhook is enabled (the toggle switch should be on).
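When debugging signature mismatches, it helps to compute the HMAC-SHA256 of the exact request body yourself and compare it with what the sender produced. The sketch below uses Python's standard hmac module. The hex encoding is an assumption, as the source does not say how the signature is encoded or which header carries it, and the function names and sample values are hypothetical.

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Compute the HMAC-SHA256 of the raw request body as a hex string."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, body: bytes, received_hex: str) -> bool:
    """Compare signatures in constant time to avoid timing side channels."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), received_hex.encode("utf-8"))


# Hypothetical secret and payload for illustration.
secret = "shared-webhook-secret"
body = b'{"event": "push", "repo": "example"}'
signature = sign_payload(secret, body)
print(signature, verify_signature(secret, body, signature))
```

A common cause of mismatches is signing a re-serialized JSON object instead of the raw bytes that were actually sent.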
Schedule not running at the expected time
- Schedule times are in the local timezone of the machine running Aura Workshop.
- For Docker deployments, the container timezone may differ from your local timezone. Set the TZ environment variable.
- Verify the schedule is enabled (toggle switch is on).
- Check that the duration type has not expired (for "repeat_until" schedules).
- If no model is configured for the schedule, it uses the globally selected model which must be available.
Voice input not working
- Ensure microphone permissions are granted to the application.
- On macOS, check System Settings > Privacy & Security > Microphone.
- Verify the speech-to-text provider is configured in Settings > Integrations.
- If using a cloud provider (Whisper, Groq, OpenAI), ensure the corresponding API key is set.
- The System provider uses the OS built-in speech recognition, which requires no API key.
Smart routing not saving money as expected
- Review the tier boundaries in Settings > Routing. The default boundaries may not match your workload.
- Check the routing analytics to see which tiers are being used most.
- If most tasks are being routed to the Complex or Reasoning tiers, consider adjusting the boundary scores so that more tasks fall into the cheaper Simple and Standard tiers.
- Ensure you have configured cheaper models for the Simple and Standard tiers.
Memory facts not being extracted
- Memory extraction happens after task completion, not during execution.
- Very short conversations (single-turn Q&A) may not produce extractable facts.
- Check Settings > Memory > Learned Facts to see what has been extracted.
- Corrections and reinforcements require specific trigger phrases ("no", "wrong", "instead" for corrections; "yes", "perfect", "great" for reinforcements).
AURA.md not being detected
- The file must be named exactly AURA.md, .aura.md, CLAUDE.md, .claude.md, or .aura/INSTRUCTIONS.md.
- The file must be in the project root or up to 2 parent directories above it.
- File size is capped at 4 KB per file, 12 KB total. Larger files are truncated.
- Ensure the task has a project path set so the system knows where to look.
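The detection rules above can be checked against a project with a short script. This is a sketch of the documented rules, not the application's actual implementation: the search order, how several matching files are combined, and the helper names are assumptions.

```python
from pathlib import Path

CANDIDATE_NAMES = ["AURA.md", ".aura.md", "CLAUDE.md", ".claude.md", ".aura/INSTRUCTIONS.md"]
MAX_PARENT_LEVELS = 2        # project root plus up to 2 parent directories
PER_FILE_CAP = 4 * 1024      # 4 KB per file
TOTAL_CAP = 12 * 1024        # 12 KB across all files


def find_instruction_files(project_root: Path) -> list[Path]:
    """Files with a recognized name in the root or up to 2 parents."""
    root = project_root.resolve()
    search_dirs = [root, *list(root.parents)[:MAX_PARENT_LEVELS]]
    found = []
    for directory in search_dirs:
        for name in CANDIDATE_NAMES:
            candidate = directory / name
            if candidate.is_file():
                found.append(candidate)
    return found


def load_instructions(project_root: Path) -> dict[Path, bytes]:
    """Read the files found, truncating to the per-file and total caps."""
    loaded = {}
    remaining = TOTAL_CAP
    for path in find_instruction_files(project_root):
        if remaining <= 0:
            break
        data = path.read_bytes()[: min(PER_FILE_CAP, remaining)]
        loaded[path] = data
        remaining -= len(data)
    return loaded
```

Running find_instruction_files on the task's project path shows which files fall inside the search range. An empty result means the file is misnamed or sits more than 2 directories above the project root.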
How do I update to the latest version?
Go to Settings > Updates and click "Check for Updates". If a new version is available, click "Download and Install". The application downloads the update, verifies its integrity, and restarts automatically. For Docker deployments, pull the latest image: docker pull coolkoo/aura-workshop:daemon-latest.
Can I use Aura Workshop as an API server for other applications?
Yes. The API Gateway feature exposes OpenAI-compatible and Anthropic-compatible endpoints at /v1/chat/completions, /v1/messages, and /v1/models. Point any SDK or tool at your Aura Workshop instance and it will proxy requests to your configured provider with automatic fallback, spend limits, and logging.
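As a minimal sketch, a client can call the OpenAI-compatible endpoint with nothing but the Python standard library. The base URL, the use of a Bearer token for the gateway, and the model name are assumptions for illustration; substitute the values from your own instance.

```python
import json
import urllib.request


def build_chat_request(base_url: str, token: str, model: str, prompt: str) -> urllib.request.Request:
    """Prepare an OpenAI-style chat completion request for the gateway."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the gateway accepts the instance's Bearer token.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

For example, send(build_chat_request("http://localhost:18800", "my-token", "my-model", "Hello")) would post one user message, where the host, port, token, and model name are placeholders. Existing OpenAI or Anthropic SDKs can be used the same way by pointing their base URL setting at the instance.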
How do I back up my Aura Workshop data?
All application data is stored in a single SQLite database file. To create a backup:
- Locate the database file for your platform (see Data Storage Locations).
- Copy the aura-workshop.db file to a safe location. Because the database uses WAL mode, also copy aura-workshop.db-wal and aura-workshop.db-shm if they exist.
- Optionally back up the ~/.aura/ directory to preserve user memories, custom roles, and CLI configuration.
- For Docker deployments, the /data volume mount contains the database.
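The database copy step can be automated with a small script. This sketch copies the main database file and whichever WAL-mode sidecar files are present; the function name and the directory arguments are illustrative, and the data directory must be the platform-specific location from Data Storage Locations.

```python
import shutil
from pathlib import Path

DB_NAME = "aura-workshop.db"
SIDECAR_SUFFIXES = ["-wal", "-shm"]  # present only while WAL data exists


def backup_database(data_dir: Path, backup_dir: Path) -> list[Path]:
    """Copy the database and any WAL/SHM sidecar files; return the copies."""
    db = data_dir / DB_NAME
    if not db.is_file():
        raise FileNotFoundError(db)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in [DB_NAME, *(DB_NAME + suffix for suffix in SIDECAR_SUFFIXES)]:
        source = data_dir / name
        if source.exists():
            # copy2 preserves timestamps along with the file contents.
            copied.append(Path(shutil.copy2(source, backup_dir / name)))
    return copied
```

Close Aura Workshop before running the copy so that the three files are captured in a consistent state.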
How do I migrate Aura Workshop to a new machine?
- Install Aura Workshop on the new machine.
- Copy the database file (and any aura-workshop.db-wal and aura-workshop.db-shm files next to it) from the old machine to the appropriate location on the new machine.
- Copy the ~/.aura/ directory for memories and roles.
- Copy the ~/.cache/aura-inference/models/ directory if you want to keep downloaded GGUF models.
- Launch Aura Workshop on the new machine. It will read the existing database and restore your configuration.
- Note: credential encryption keys are stored in the OS keychain and cannot be migrated. You will need to re-enter API keys and credentials on the new machine.
Can multiple users share one Aura Workshop instance?
Yes. The web UI can be accessed by multiple users simultaneously. Each user creates their own tasks, and all tasks are visible to all users. For multi-user environments, configure a Bearer token for authentication. Note that there is no per-user access control in the current version; all authenticated users have full access to all features and data.
What happens when the context window fills up?
When the conversation context reaches 70% of the model's maximum context window, automatic compression kicks in. The system summarizes older messages into a condensed form, preserving the most important information while freeing up context space. This process is transparent and the task continues without interruption. The context usage percentage is visible in the Context Panel on the right sidebar.
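As a worked example of the 70% threshold, the sketch below computes the point at which compression would start for a given context window. It assumes usage is measured in tokens; the function names are illustrative and do not reflect the application's internals.

```python
COMPRESSION_THRESHOLD_PERCENT = 70


def compression_trigger(max_context_tokens: int) -> int:
    """Smallest token count at which usage reaches the 70% threshold."""
    # Ceiling division, so the result is never below 70% of the window.
    return -(-max_context_tokens * COMPRESSION_THRESHOLD_PERCENT // 100)


def should_compress(used_tokens: int, max_context_tokens: int) -> bool:
    """True once usage has reached 70% of the model's context window."""
    return used_tokens * 100 >= max_context_tokens * COMPRESSION_THRESHOLD_PERCENT


print(compression_trigger(200_000))  # prints 140000
```

Under these assumptions, a model with a 200,000-token window starts compressing at 140,000 tokens, while a model with an 8,192-token window reaches the threshold at 5,735 tokens.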
How do parallel agents handle file conflicts?
When multiple parallel agents modify the same file, the merge executor resolves conflicts using the configured strategy: CopyOnWrite (creates side-by-side copies, safest option), LastWins (last agent's version prevails), or LLMResolve (an LLM intelligently merges the changes). The merge also handles dependency files (package.json, Cargo.toml, requirements.txt) with smart dependency merging that combines packages from all agents.
Supported Providers Summary
| Provider | API Format | Auth Type | Free Tier |
|---|---|---|---|
| Anthropic | Anthropic native | API key header | No |
| OpenAI | OpenAI | Bearer token | No |
| Google | Gemini native | Query parameter | Yes (Flash, Flash-Lite) |
| MiniMax | OpenAI-compatible | Bearer token | No |
| DeepSeek | OpenAI-compatible | Bearer token | No |
| Mistral AI | OpenAI-compatible | Bearer token | No |
| Zhipu AI (z.ai) | OpenAI-compatible | Bearer token | Yes |
| Xiaomi / MiMo | OpenAI-compatible | Bearer token | No |
| Moonshot / Kimi | OpenAI-compatible | Bearer token | No |
| OpenRouter | OpenAI-compatible | Bearer token | Yes (many free models) |
| Together AI | OpenAI-compatible | Bearer token | No |
| Groq | OpenAI-compatible | Bearer token | Yes (rate-limited) |
| SiliconFlow | OpenAI-compatible | Bearer token | No |
| Ollama | OpenAI-compatible | None | Free (local) |
| LM Studio | OpenAI-compatible | None | Free (local) |
| LocalAI | OpenAI-compatible | None | Free (local) |
| vLLM | OpenAI-compatible | None | Free (local) |
| TGI | OpenAI-compatible | None | Free (local) |
| SGLang | OpenAI-compatible | None | Free (local) |
| Aura AI | OpenAI-compatible | None | Free (bundled) |