Aura Workshop v1.30.1 Documentation

Welcome to the complete reference for Aura Workshop, a model-agnostic AI agent orchestration platform built for individuals and teams who want full control over their AI workflows. This documentation covers every feature from first install through advanced multi-agent teams, 31-platform automation, distributed GPU inference clusters, visual DAG workflows, remote agent deployment, the full REST API with approximately 140 endpoints, 50+ agent tools, 100+ skills, and the command-line interface. Whether you are running a quick question-and-answer session or orchestrating a fleet of agents across multiple machines, this guide has you covered.

Installation

System Requirements

Before installing, confirm your system meets the minimum requirements for your platform.

Platform | Minimum Requirements
macOS | macOS 12 (Monterey) or later. Xcode Command Line Tools are recommended for full functionality. Apple Silicon (M-series) provides Metal GPU acceleration for local inference.
Windows | Windows 10 64-bit or later. An NVIDIA GPU with current drivers is recommended for local inference with CUDA support.
Linux | A modern 64-bit distribution with webkit2gtk 4.1 and libayatana-appindicator (or their equivalents). NVIDIA drivers and CUDA toolkit are recommended for GPU inference.

Install from Package

macOS

Download the .dmg installer from the Downloads page. Open the DMG file and drag the Aura Workshop icon into your Applications folder. Because the app is distributed outside the Mac App Store, macOS may quarantine it on first launch. If you see a security warning, open Terminal and run the following command to clear the quarantine flag:

xattr -cr /Applications/Aura\ Workshop.app

After running this command, launch the app normally from Applications or Spotlight.

Windows

Download the NSIS installer .exe from the Downloads page. Run the installer and follow the standard installation wizard prompts. Choose your installation directory (the default is C:\Program Files\Aura Workshop), click Install, and wait for the process to complete. Launch Aura Workshop from the Start Menu or desktop shortcut.

Linux

Two package formats are available:

Docker

For headless server deployments, pull the pre-built Docker image from Docker Hub:

docker pull coolkoo/aura-workshop:daemon-latest

See the Docker Daemon section for full setup instructions including environment variables, volume mounts, and GPU passthrough.

Build from Source

Building from source requires Node.js 18 or later and a Rust toolchain installed via rustup. You also need the platform-specific Tauri dependencies documented in the Tauri v2 prerequisites guide.

# Clone the repository
git clone https://github.com/coolkoo/aura-workshop.git
cd aura-workshop

# Install JavaScript dependencies
npm install

# Build the desktop application
npm run tauri build

You can target a specific package format by passing a bundles flag:

npm run tauri build -- --bundles dmg        # macOS DMG disk image
npm run tauri build -- --bundles nsis       # Windows NSIS installer
npm run tauri build -- --bundles deb        # Linux .deb package
npm run tauri build -- --bundles appimage   # Linux AppImage

The compiled binary and installer will appear in src-tauri/target/release/bundle/.

First Launch & Setup Wizard

The very first time you open Aura Workshop, a five-step setup wizard guides you through initial configuration. Each step can be skipped and revisited later from the Settings page.

Step 1: Welcome

An introductory screen that explains the core capabilities of the platform. Click Next to proceed.

Step 2: License Key Entry (Optional)

If you have purchased a license, enter the key here to unlock Professional, Business, or Enterprise features. If you do not have a key, click Skip to continue in Community mode. Community mode provides full access to single-agent tasks, basic automation, local inference, and core tools. You can enter or upgrade your license at any time from Settings.

Step 3: Provider Configuration

Select your preferred AI provider from a list that includes Anthropic, OpenAI, Google, and many more. Enter your API key for the chosen provider. The wizard includes a Test Connection button that verifies the key is valid and the provider is reachable before saving. You can configure additional providers later from the Models page.

Step 4: Local Model Download (Optional)

If you want to run models locally without any cloud dependency, this step offers a curated list of GGUF models you can download directly. The models range from compact 5 GB options to large 42 GB models for maximum quality. Downloads show real-time progress, speed, and estimated time remaining. Skip this step if you plan to use only cloud providers.

Step 5: Ready

Setup is complete. You are taken directly to the main Dashboard where you can begin your first task.

Behind the Scenes on First Launch

In addition to the wizard, the application performs several initialization steps automatically:

Database Locations

Platform | Path
macOS | ~/Library/Application Support/aura-workshop/aura-workshop.db
Windows | %APPDATA%\aura-workshop\aura-workshop.db
Linux | ~/.local/share/aura-workshop/aura-workshop.db
Docker | /data/aura-workshop.db (via volume mount)
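
As a rough illustration of the table above, the following sketch maps a platform label to the documented database path. The function name and the platform labels ("macos", "windows", "linux", "docker") are hypothetical and are not part of Aura Workshop; only the paths come from the table.

```python
import sys


def default_db_path(platform: str, home: str = "", appdata: str = "") -> str:
    """Return the documented database path for a (hypothetical) platform label."""
    if platform == "macos":
        return f"{home}/Library/Application Support/aura-workshop/aura-workshop.db"
    if platform == "windows":
        # %APPDATA% is passed in explicitly so the function stays easy to test.
        return f"{appdata}\\aura-workshop\\aura-workshop.db"
    if platform == "linux":
        return f"{home}/.local/share/aura-workshop/aura-workshop.db"
    if platform == "docker":
        # Inside the container the database sits on the mounted /data volume.
        return "/data/aura-workshop.db"
    raise ValueError(f"unknown platform label: {platform}")


def current_platform_label() -> str:
    """Map Python's sys.platform onto the labels used above (an assumption)."""
    if sys.platform == "darwin":
        return "macos"
    if sys.platform.startswith("win"):
        return "windows"
    return "linux"
```

A backup script could call default_db_path(current_platform_label(), home=..., appdata=...) to locate the file before copying it.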

Application Layout

The Aura Workshop interface centers on a left sidebar with an icon rail for primary navigation, a scrollable middle section for project and task organization, and a main content area that fills the rest of the screen.

Left Sidebar Icon Rail

The icon rail runs vertically along the left edge and is always visible. The top section contains fixed navigation icons:

Icon | Label | View
Grid | Dashboard | Home screen with task history, quick actions, and notifications
Robot | Agents | Primary task workspace for all AI interactions
Headset | Listeners | Messaging platform listeners and automation triggers
Link | Webhooks | HTTP webhook endpoint management
Clock | Schedulers | Scheduled task definitions and controls

Scrollable Middle Section

Below the primary navigation icons, the sidebar has two collapsible sections:

Bottom Fixed Section

Three icons are pinned to the bottom of the sidebar and are always accessible regardless of scroll position:

Sidebar Behavior

The sidebar is collapsible. Click the collapse button or drag the resize handle at the sidebar's right edge to adjust its width. On mobile and narrow screens, the sidebar collapses into a hamburger menu accessible from the top-left corner. Tapping the hamburger icon reveals the full sidebar as a slide-over overlay.

Dashboard Overview

The Dashboard is the home screen that greets you each time you open Aura Workshop. It provides a high-level view of your activity and offers quick pathways to start new tasks.

Greeting

At the top of the Dashboard, a randomized greeting message appears (for example, "Good afternoon" or "Welcome back"). This text rotates on each visit to keep the experience fresh.

Dependencies Banner

If the system detects that required dependencies are missing (such as Docker, Node.js, or Python), a prominent banner appears at the top of the Dashboard with a description of what is missing and a direct link to Settings where you can resolve the issue. This banner disappears once all dependencies are satisfied.

Quick Action Cards

Below the greeting, the Dashboard presents a grid of Quick Action Cards organized into eight categories. Each card contains a brief description and a one-click prompt that creates a new task pre-filled with a relevant starting instruction.

Category | Example Prompts
Software Dev | Build a REST API, set up a CI/CD pipeline, create a system design document, analyze a dataset
Marketing | Draft a content calendar, create an email campaign, run a competitive analysis
Finance | Build a financial model, create a P&L analysis, design a budget framework
Legal | Review a contract, draft an NDA, generate a compliance checklist
Research | Conduct a literature review, design an A/B test, analyze survey responses
Design | Create a design system, perform an accessibility audit, write user stories
Operations | Write a runbook, design an onboarding workflow, create an SLA framework
Education | Build a course curriculum, write a tutorial, design a grading rubric

Clicking any card immediately creates a new task and begins execution with the selected prompt.

Recent Tasks

The main body of the Dashboard displays a grid of recent tasks. Each task card shows:

Completed Task Notifications

When a task finishes while you are viewing the Dashboard or another screen, a toast notification slides in from the corner. The notification shows the task title and its final status. These notifications auto-dismiss after 8 seconds, or you can click them to navigate directly to the completed task.

Task Composer

A persistent task composer bar sits at the bottom of the Dashboard and of every other screen. It is the single entry point for all interactions with the AI. The composer includes the following controls:

Starting a Task

There is no separate "Chat Mode" or "Agent Mode" in Aura Workshop. Everything flows through the single unified input described above. To start a task:

  1. Type your request in the input textarea at the bottom of the screen.
  2. Optionally click Files to attach one or more files for context.
  3. Optionally click Folder to mount a project directory as the agent's working path.
  4. Optionally click Tools to open the MCP tool picker and enable or disable specific tools. Tools are grouped by their MCP server name, and each shows the tool name plus a brief description.
  5. Optionally click Voice to record a spoken prompt. A pulsing red dot indicates active recording.
  6. Optionally toggle Plan/Execute (clipboard icon, cyan when Plan is on) to have the agent create an execution plan before acting.
  7. Optionally select a Role from the role picker. The picker includes search, category filtering, and a "Default (no role)" option.
  8. Optionally set Thinking mode (Off / Low / Medium / High) when the selected model supports it.
  9. Optionally choose a different model from the model selector dropdown.
  10. Press Enter or click the Send button (paper plane icon).

The system automatically classifies your input and routes it to the appropriate execution path. You never need to manually select a mode.

Task Header Bar

When you are inside an active or completed task, a header bar appears at the top of the content area. It contains the following elements:

Message Thread

The main area of the task workspace is a scrollable message thread that displays the entire conversation between you and the agent.

User Messages

Your messages appear as right-aligned bubbles. Any file or folder attachments you included are displayed as chips below the message text. File chips show the file name with a small icon, and folder chips show the directory path. Both types have an X button to remove them from context for subsequent messages.

Assistant Messages

Agent responses appear as left-aligned content blocks. During streaming, text arrives progressively with an activity indicator showing the agent's current operation. Responses are rendered as rich Markdown with syntax-highlighted code blocks, LaTeX math (via KaTeX), Mermaid diagrams, tables, and all standard formatting.

Thinking Blocks

When the model uses extended reasoning (thinking mode), thinking blocks appear as collapsible sections prefixed with a thought-bubble indicator. Click to expand and view the agent's internal reasoning chain. By default, thinking blocks are collapsed to reduce visual noise.

Tool Execution Cards

Each tool invocation produces a collapsible card in the message thread. The card header shows:

Expanding the card reveals the full input parameters the agent sent to the tool and the complete output the tool returned. For long outputs, the content is scrollable within the card.

Role Dividers

In multi-agent team tasks, a horizontal divider line appears when execution passes from one role to the next. The divider shows the role name and includes a + Memory button that lets you save the agent's work from that role as a persistent memory entry.

Routing Info Badges

When smart routing is active, small badges appear on messages indicating the routing decision, such as "via claude-sonnet-4-20250514 tier" or "routed: Standard". This helps you understand which model was selected for each turn.

Error Messages

Errors from tool execution, API failures, or provider issues appear as red-text blocks within the thread. Error messages include the error type and a description to help you diagnose and resolve the issue.

Load Earlier Button

For tasks with very long histories that exceed the display window, a "Load Earlier" button appears at the top of the thread. Click it to fetch and display older messages from the database.

Activity Indicator

While a task is executing, a fixed bar appears just above the input area. This activity indicator shows:

The activity indicator updates in real-time as the agent moves between operations.

Follow-up Suggestions

After a task completes, a collapsible suggestion bar appears below the last message. It features:

The suggestions are generated automatically based on the conversation context and the agent's output.

Response Actions

When a task reaches a terminal state (completed, failed, or interrupted), several response actions become available:

Scroll to Bottom Button

When you scroll up in a long message thread, a circular down-arrow button appears in the bottom-right corner. Click it to instantly scroll to the most recent message.

Files Created Button

When the agent creates files during task execution, a "Files Created" button appears in the task header area. It displays a count badge (for example, "(7)") showing how many files were produced. Clicking it opens the FileBrowser overlay, a file explorer view that lists all generated files with options to view, download, or open them.

Right Sidebar / Context Panel

The Context Panel is a collapsible right sidebar that provides detailed information about the current task. Toggle it with the hamburger icon in the task header. The panel can be resized by dragging its left edge.

Workflow Progress

For multi-agent team tasks, the Context Panel shows a "Workflow" header with a progress badge (for example, "3/6"). Below it, a vertical stepper displays each agent or role in the workflow with status dots:

Roles running on remote deployments show a "Remote" badge next to their name.

Parallel Agents

When three or more agents are running in parallel (via fan-out), the Context Panel shows a dedicated Parallel Agents section. It displays a count such as "5/8 completed" and lists a card for each agent with its index number, current status, and the number of tool calls it has made.

Thinking Content

When the model is using extended reasoning, the Context Panel includes a collapsible Thinking Content section that shows the raw reasoning text. A maximize/minimize toggle lets you expand it to fill the panel for easier reading.

Usage Stats

The bottom of the Context Panel displays real-time usage statistics for the current task:

Metric | Description
Context Usage | A color-coded progress bar showing what percentage of the model's context window has been consumed. The bar transitions from green to yellow at 50% and to red at 80% or above.
Input Tokens | The number of input tokens sent to the model, formatted with K/M suffixes for readability (for example, "12.3K").
Output Tokens | The number of output tokens generated by the model.
Cache Read | The number of tokens served from the provider's prompt cache, if available.
Total Cost | The cumulative dollar cost for this task, calculated using the model pricing table.
Latency | The time-to-first-token latency in milliseconds for the most recent API call.
Model | The name of the model being used for this task.
Provider | The name of the provider serving the model.
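
The color thresholds and the K/M token formatting described in the table above can be sketched as follows. This is an illustration only: the function names are hypothetical, treating exactly 50% as yellow is an assumption, and so is the single decimal place (inferred from the "12.3K" example).

```python
def context_bar_color(percent_used: float) -> str:
    """Pick the progress-bar color for a context-usage percentage."""
    if percent_used >= 80:
        return "red"
    if percent_used >= 50:  # assumption: the yellow band starts at exactly 50%
        return "yellow"
    return "green"


def format_tokens(count: int) -> str:
    """Format a token count with a K or M suffix, one decimal place (assumed)."""
    if count >= 1_000_000:
        return f"{count / 1_000_000:.1f}M"
    if count >= 1_000:
        return f"{count / 1_000:.1f}K"
    return str(count)
```

For example, a task that has sent 12,300 input tokens and used 62% of the context window would show "12.3K" next to a yellow bar.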

Visualizations in the Message Thread

The message thread renders rich content beyond plain text:

Mermaid Diagrams

The agent can output Mermaid code blocks that render as interactive SVG diagrams directly in the chat. Supported diagram types include flowcharts, sequence diagrams, class diagrams, pie charts, timelines, Gantt charts, ER diagrams, state diagrams, and journey maps. Each diagram includes a "Save as PNG" button for export.

Charts

The chart_generate tool produces PNG chart images that are embedded in the message thread. Six chart types are supported: bar (vertical), line, pie, scatter, histogram, and area. All charts use a dark theme that matches the application UI.

Code Blocks

Code blocks in responses are syntax-highlighted using language detection. Each code block has a copy button in its top-right corner. Supported languages include Python, JavaScript, TypeScript, Rust, Go, Java, C++, HTML, CSS, SQL, YAML, JSON, Bash, and many more.

Math

Mathematical expressions are rendered using KaTeX. Inline math uses $...$ delimiters and display math uses $$...$$ blocks.

Tables

Markdown tables are rendered as styled HTML tables with alternating row colors. Each table includes a "Copy as CSV" button for easy export to spreadsheets.

Exporting Conversations

Click Export in the task header bar on any completed task. The conversation opens as a styled HTML page in a new window. Use your browser's Print > Save as PDF to create a shareable document. The exported page includes all text, code blocks, tool call summaries, and images.

Running Multiple Tasks Simultaneously

Multiple tasks can run at the same time, each potentially using a different model. Each task gets its own independent cancel token and event stream. The model that was active when a task was launched is captured for that task's duration, so switching models does not affect already-running tasks. Monitor all running tasks from the Dashboard where their status updates in real-time.
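
The behavior described above (one cancel token per task, and the model captured at launch) can be pictured with a small sketch. The class names and model names here are hypothetical; this is not how Aura Workshop is necessarily implemented, only a minimal model of the idea using a threading.Event as the cancel token.

```python
import threading
from dataclasses import dataclass, field
from typing import List


@dataclass
class RunningTask:
    title: str
    model: str  # captured once, at launch time
    cancel_token: threading.Event = field(default_factory=threading.Event)


class TaskLauncher:
    """Tracks the currently selected model and the tasks launched so far."""

    def __init__(self, active_model: str) -> None:
        self.active_model = active_model
        self.tasks: List[RunningTask] = []

    def launch(self, title: str) -> RunningTask:
        # Copy the active model into the task so later switches do not affect it.
        task = RunningTask(title=title, model=self.active_model)
        self.tasks.append(task)
        return task


launcher = TaskLauncher(active_model="model-a")
first = launcher.launch("Summarize the quarterly report")
launcher.active_model = "model-b"      # the user switches models
second = launcher.launch("Draft a follow-up email")
first.cancel_token.set()               # cancelling one task leaves the other alone
```

After this runs, the first task still reports "model-a" and is cancelled, while the second task runs on "model-b" with its own untouched token.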

Project Management

Projects let you organize related tasks into groups, scope memory to specific codebases, and keep your workspace tidy.

Creating a Project

Click the + button next to the "Projects" header in the left sidebar. A dialog appears where you enter a project name and optionally select a directory path on disk. The directory path enables project-scoped memory and AURA.md discovery.

Associating Tasks with Projects

Drag and drop any task from the task list onto a project name in the sidebar to associate it. You can also select a project from the project selector in the task composer before creating a new task.

Task Count Badges

Each project in the sidebar displays a badge with the count of tasks associated with it. This count updates in real-time as tasks are created, completed, or deleted.

Project Actions

Multi-Select and Bulk Actions

Hold Ctrl (or Cmd on macOS) and click to select multiple tasks. Checkboxes appear for each task when in multi-select mode. A bulk action bar appears at the top of the task list with options to move selected tasks to a project, delete them, or resume interrupted tasks in batch.

Task Classification & Smart Routing

Every message you send is automatically analyzed by a fast one-shot LLM call that classifies it into a category. This classification happens transparently and requires no manual mode switching.

Classification Categories

Category | What Happens
Single Agent Task | A single agent runs with access to all configured tools. This is the default for most requests that one agent can handle end-to-end.
Team Task | The system routes to a multi-agent team. If a matching team already exists, it is selected automatically. If not, the system creates a new team with appropriate roles on the fly.
Scheduled Task | If the message describes a recurring task (for example, "every Monday at 9am"), the system creates a schedule automatically.
Clarification Needed | The agent asks follow-up questions to gather more information before proceeding.

Smart Routing

When smart routing is enabled (in Settings), the classifier also evaluates the complexity of your request and routes it to the most cost-effective model tier that can handle it. This saves money on simple requests while preserving quality for complex ones. The routing tiers are configurable (see Smart Routing under Models).

Multi-Agent Teams

Teams let you assign complex tasks to a group of specialized agents that work together, each contributing their domain expertise. Teams are configured in Settings, and tasks are routed to teams automatically by the classifier or manually by selecting a team.

Team Configuration

Go to Settings > Teams to manage teams. The team form includes:

Team Task UI

When a team task runs, the message thread shows role dividers between each agent's section. The Context Panel displays the workflow progress stepper showing which role is active. Role handoffs are visible in the thread: each role calls the role_complete tool to pass structured handoff data (summary, files created, key decisions, constraints) to the next role.
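
To make the handoff concrete, here is a minimal sketch of the four pieces of data that role_complete is described as carrying. The field names, the class, and the rendering function are assumptions for illustration; the actual tool schema may differ.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class RoleHandoff:
    """Structured handoff data passed from one role to the next (hypothetical shape)."""
    summary: str
    files_created: List[str] = field(default_factory=list)
    key_decisions: List[str] = field(default_factory=list)
    constraints: List[str] = field(default_factory=list)


def render_handoff(role_name: str, handoff: RoleHandoff) -> str:
    """Turn a handoff into plain text that the next role could read as context."""
    lines = [f"Handoff from {role_name}", f"Summary: {handoff.summary}"]
    sections = (
        ("Files created", handoff.files_created),
        ("Key decisions", handoff.key_decisions),
        ("Constraints", handoff.constraints),
    )
    for label, items in sections:
        lines.append(f"{label}:")
        lines.extend(f"  - {item}" for item in items)
    return "\n".join(lines)


handoff = RoleHandoff(
    summary="Designed a three-service layout.",
    files_created=["docs/architecture.md"],
    key_decisions=["Use PostgreSQL for persistence"],
    constraints=["Public API must stay backward compatible"],
)
text = render_handoff("Architect", handoff)
```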

Built-in Roles

Aura Workshop ships with 20 built-in roles organized into six categories. Each role has a pre-written system prompt and a curated set of tool permissions.

Role Library (20 Built-in Roles)

Category | Roles
Software | Product Manager, Architect, Developer, QA Engineer, DevOps
Content | Research Lead, Writer, Editor
Business | Business Analyst, Marketing Strategist, Sales Copywriter
Data | Data Engineer, Data Analyst, Data Scientist
Design | UX Designer, UI Designer
Operations | Project Coordinator, Technical Writer, Security Auditor, Code Reviewer

Per-Role Model Selection

Each role in a team can be assigned a specific model. This allows you to use a smaller, cheaper model for routine roles (such as initial research) while reserving a more powerful model for critical roles (such as architecture decisions). If no per-role model is set, the team uses the globally selected model.
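
The fallback rule above amounts to a one-line lookup. In this sketch the function name, the mapping, and the model names are all hypothetical placeholders.

```python
from typing import Dict, Optional


def resolve_role_model(role: str, role_models: Dict[str, Optional[str]], global_model: str) -> str:
    """Use the role's own model when one is set, otherwise the global model."""
    return role_models.get(role) or global_model


role_models = {
    "Research Lead": "small-fast-model",    # cheaper model for routine research
    "Architect": "large-reasoning-model",   # stronger model for critical decisions
    "Developer": None,                      # no per-role model configured
}
```

Here the Developer role, and any role missing from the mapping, would run on whatever model is selected globally.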

Custom Roles

Create, edit, duplicate, or delete roles from Settings > Roles & Prompts. Each role has a name, description, system prompt, and a set of allowed tools (toggled via checkboxes). Roles are saved as markdown files with YAML frontmatter in ~/.aura/roles/.
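
For readers who want to picture such a file, the sketch below renders a markdown document with YAML frontmatter. The frontmatter keys (name, description, tools), the choice to put the system prompt in the body, and the file name are all assumptions; inspect a role saved by the application for the real layout before writing files by hand.

```python
from pathlib import Path
from typing import List


def render_role_file(name: str, description: str, allowed_tools: List[str], system_prompt: str) -> str:
    """Build role file text: YAML frontmatter, then the prompt as the markdown body."""
    lines = ["---", f"name: {name}", f"description: {description}", "tools:"]
    lines += [f"  - {tool}" for tool in allowed_tools]
    lines += ["---", "", system_prompt, ""]
    return "\n".join(lines)


role_text = render_role_file(
    name="Release Notes Writer",
    description="Summarizes merged changes into release notes",
    allowed_tools=["read_file", "grep", "write_file"],
    system_prompt="You write concise, accurate release notes from commit history.",
)

# Hypothetical destination; the directory comes from the text above, the file name does not.
role_path = Path.home() / ".aura" / "roles" / "release-notes-writer.md"
```

The sketch does not quote YAML values, so names or descriptions containing a colon would need quoting in a real file.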

Fan-Out (Parallel Agents)

When the classifier detects that a task contains independent sub-tasks, it uses fan-out to run multiple agents in parallel. For example, "Compare the top 5 vector databases" spawns five parallel research agents, one per database, then merges their results into a unified comparison.

Team Type | Source Role Produces | Fan-Out Role Does
Software Dev | Architect lists implementation tasks | One developer per task
Content Writing | Research Lead lists sections | One writer per section
Research | Lead lists research questions | One researcher per question
Translation | Manager lists target languages | One translator per language

Parallel results are merged using configurable strategies: CopyOnWrite (safe, no data loss), LastWins (last agent's version prevails on conflict), or LLMResolve (an LLM intelligently combines conflicting changes).
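
The sketch below illustrates fan-out followed by a merge, using a thread pool in place of real agents. Everything here is hypothetical: the research function is a stand-in for an agent, results are modeled as path-to-content dictionaries, and the CopyOnWrite behavior (keeping each conflicting version as a separate copy) is one plausible reading of "safe, no data loss". LLMResolve is omitted because it requires a model call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def research(database: str) -> Dict[str, str]:
    """Stand-in for one parallel agent; returns the files it produced."""
    return {
        f"{database}.md": f"Notes on {database}",
        "summary.md": f"Summary by the {database} agent",  # every agent writes this file
    }


def merge_last_wins(results: List[Dict[str, str]]) -> Dict[str, str]:
    """On conflict, the later agent's version replaces the earlier one."""
    merged: Dict[str, str] = {}
    for result in results:
        merged.update(result)
    return merged


def merge_copy_on_write(results: List[Dict[str, str]]) -> Dict[str, str]:
    """On conflict, keep the first version and store later ones as separate copies."""
    merged: Dict[str, str] = {}
    for index, result in enumerate(results):
        for path, content in result.items():
            if path in merged and merged[path] != content:
                merged[f"{path}.agent{index}"] = content
            else:
                merged[path] = content
    return merged


databases = ["alpha-db", "beta-db", "gamma-db"]
with ThreadPoolExecutor(max_workers=len(databases)) as pool:
    # map() returns results in input order, so "last" means the last listed database.
    results = list(pool.map(research, databases))

last_wins = merge_last_wins(results)
copy_on_write = merge_copy_on_write(results)
```

With LastWins only one summary.md survives; with the CopyOnWrite reading used here, all three summaries are kept under distinct names.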

Agent Tools Reference

Every agent has access to a rich set of built-in tools. The available tools depend on role configuration, execution mode, and any connected MCP servers. The full tool registry is defined in tools/mod.rs.

Core Tools (Always Available)

Tool | Description
read_file | Read the contents of a file at a given path. Returns the full text of the file. Supports all text-based formats and can handle binary files by returning base64-encoded content.
write_file | Create a new file or overwrite an existing file with the specified content. Automatically creates parent directories if they do not exist.
edit_file | Make targeted edits to an existing file using find-and-replace operations. Supports multiple replacements in a single call, making it efficient for surgical modifications without rewriting entire files.
bash | Execute shell commands on the host operating system. Uses sh -c on macOS and Linux or cmd /C on Windows. Supports setting the working directory, timeout, and environment variables. Subject to safety guardrails that block destructive commands.
glob | Find files matching a glob pattern such as **/*.ts or src/**/*.py. Returns a list of matching file paths, useful for discovering project structure.
grep | Search file contents using regular expressions. Returns matching lines with file paths and line numbers, similar to the grep command but with structured output for agent consumption.
list_dir | List the contents of a directory, returning file names, sizes, types (file or directory), and modification times.
web_fetch | Fetch content from a URL via HTTP GET. Returns the response body as text. Useful for reading web pages, downloading data, and interacting with web APIs.
web_search | Search the web using the configured search provider (DuckDuckGo by default, or Google, Brave, Serper, Bing). Returns text results and image URLs.
email_send | Send an email using the configured email method (system default or SMTP). Supports HTML content, subject line, recipients (to, cc, bcc), and attachments.
chart_generate | Generate data visualizations as PNG images. Supports bar, line, pie, scatter, histogram, and area chart types with a dark theme. Ideal for creating reports and visual summaries.
generate_image | Generate images using AI image generation APIs. The agent provides a text prompt and receives a generated image.
generate_video | Generate video content using AI video generation APIs. Supports various durations and styles depending on the configured provider.
generate_music | Generate audio and music using AI music generation APIs. Supports various genres, moods, and durations.
run_chain | Execute a chain script, which is a predefined sequence of tool calls defined in a skill file. This meta-tool enables complex multi-step workflows to be packaged as reusable recipes.

Docker Tools

Tool | Description
docker_run | Run a command inside a Docker container. Supports image selection, volume mounts, port mapping, and environment variables. Useful for running tasks in isolated environments.
docker_list | List all running Docker containers on the host, showing container ID, image, status, and port mappings.
docker_images | List all Docker images available on the host, showing repository, tag, image ID, and size.

Platform Tools

Tool | Description
system_notify | Send a native desktop notification with a title and message body. Works on macOS, Windows, and Linux.
screen_capture | Capture a screenshot of the current display. On macOS, uses the built-in screencapture utility. Returns the image as a file path or base64 data.
camera_capture | Capture an image from the device's webcam. Currently supported on macOS. Returns the captured image for analysis or inclusion in responses.
create_listener | Programmatically create a new event listener for any of the 31 supported messaging platforms. The agent can set up real-time monitoring without manual configuration.
create_schedule | Programmatically create a new scheduled task with a specified time, frequency, and prompt.
create_webhook | Programmatically create a new webhook endpoint that triggers agent tasks on incoming HTTP requests.
send_slack | Send a message to a Slack channel or user. Requires a Slack bot token.
send_email | Send an email message (distinct from email_send in that this tool is specifically for platform-level messaging).
ocr | Extract text from images using optical character recognition. Combines a vision model with text extraction for high-accuracy results on documents, screenshots, and photos.
merge_pdfs | Merge multiple PDF files into a single document. Accepts a list of file paths and produces a combined PDF.
security_scan | Run a security vulnerability scan on code or configurations. Reports potential issues with severity levels and remediation suggestions.
role_complete | Signal that the current role has finished its work in a multi-agent team workflow. Includes structured handoff data: summary, files created, key decisions, and constraints for the next role. This tool is only available within team workflow contexts.

MCP Tools

Any tools provided by connected MCP servers are automatically available to agents. MCP tools are named using the pattern mcp_{server_id}_{tool_name}, making them easy to identify. MCP tool results are automatically truncated to 8000 characters to prevent context overflow. MCP tools appear in the Tools picker in the task composer, grouped by their server name.
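
The naming pattern and the 8000-character limit can be sketched in a few lines. The helper names and the example server and tool identifiers are hypothetical, and the sketch simply cuts the text; whether the application appends a truncation marker is not stated here.

```python
MAX_RESULT_CHARS = 8000  # limit stated above


def mcp_tool_name(server_id: str, tool_name: str) -> str:
    """Compose a tool name following the mcp_{server_id}_{tool_name} pattern."""
    return f"mcp_{server_id}_{tool_name}"


def truncate_result(text: str, limit: int = MAX_RESULT_CHARS) -> str:
    """Cut a tool result down to the character limit."""
    return text[:limit]


name = mcp_tool_name("github", "list_issues")   # hypothetical server and tool
short = truncate_result("x" * 20000)
```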

Command Safety

The agent middleware includes non-bypassable safety guardrails that block dangerous shell commands. These guardrails are always active regardless of settings.

Blocked Patterns

Command Timeout

Shell commands have a default timeout of 60 seconds, with a maximum configurable timeout of 300 seconds. Commands exceeding the timeout are terminated automatically.
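
As an illustration of these limits, the sketch below clamps a requested timeout to the documented maximum and runs a command with Python's subprocess module. The function names are hypothetical, and this is not the application's own implementation, which is not shown in this documentation.

```python
import subprocess
import sys
from typing import List, Optional

DEFAULT_TIMEOUT = 60   # seconds, documented default
MAX_TIMEOUT = 300      # seconds, documented maximum


def effective_timeout(requested: Optional[int] = None) -> int:
    """Apply the default when nothing is requested and cap at the maximum."""
    if requested is None:
        return DEFAULT_TIMEOUT
    return min(requested, MAX_TIMEOUT)


def run_with_timeout(argv: List[str], requested: Optional[int] = None) -> Optional[str]:
    """Run a command; return its stdout, or None if it was terminated on timeout."""
    try:
        completed = subprocess.run(
            argv,
            capture_output=True,
            text=True,
            timeout=effective_timeout(requested),
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return None
    return completed.stdout


output = run_with_timeout([sys.executable, "-c", "print('ok')"])
```

A request for 900 seconds would be capped at 300, and a command that overruns its limit yields None instead of output.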

Elevated Bash

The elevatedBash setting in General Settings allows agent commands to use sudo. This does not bypass the safety guardrails; dangerous patterns are still blocked even with elevated privileges.

Concurrency

The max_concurrent_tools setting (default: 4) controls how many tools can execute simultaneously within a single task. This prevents resource exhaustion when the agent attempts many parallel tool calls.
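
One way to picture this limit is a worker pool capped at four threads. The sketch below is an analogy under that assumption, with a fake tool that records how many calls are active at once; the names are hypothetical and the real scheduler may work differently.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

MAX_CONCURRENT_TOOLS = 4  # documented default

_lock = threading.Lock()
_active = 0
peak = 0


def fake_tool() -> str:
    """Stand-in for a tool call that tracks peak concurrency."""
    global _active, peak
    with _lock:
        _active += 1
        peak = max(peak, _active)
    time.sleep(0.05)  # simulate work
    with _lock:
        _active -= 1
    return "done"


def run_tools(tool_calls: List[Callable[[], str]], limit: int = MAX_CONCURRENT_TOOLS) -> List[str]:
    """Run tool calls with at most `limit` executing at the same time."""
    with ThreadPoolExecutor(max_workers=limit) as pool:
        return list(pool.map(lambda call: call(), tool_calls))


results = run_tools([fake_tool] * 10)
```

Ten calls are submitted, but the pool never runs more than four of them at once; the rest wait for a free worker.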

Skills Overview

Skills are specialized instruction sets and automation recipes that extend the agent's capabilities. Aura Workshop ships with a comprehensive skills library organized into multiple categories. Skills are managed in Settings > Skills, where you can view the list of installed skills (each showing name, category, and a prompt preview), edit existing skills, delete skills, or create new ones with the New Skill button.

Document Skills

Core document generation skills are bundled with every installation. These enable the agent to create professional documents programmatically:

Skill | Description
pdf | Create PDF reports with headers, paragraphs, tables, charts, images, and custom styling.
docx | Create Microsoft Word documents using python-docx with full formatting, styles, headers, footers, and tables.
xlsx | Create Excel spreadsheets using openpyxl with formulas, charts, conditional formatting, and multiple sheets.
pptx | Create PowerPoint presentations using python-pptx with slide layouts, charts, images, and transitions.

Anthropic Skills (16)

Advanced skills covering design, development, content creation, and tooling:

Skill | Description
pdf | Advanced PDF generation with complex layouts
docx | Advanced Word document workflows
xlsx | Advanced spreadsheet operations
pptx | Advanced presentation creation
algorithmic-art | Generate algorithmic and generative art using code
brand-guidelines | Create comprehensive brand guideline documents with color palettes, typography, and usage rules
canvas-design | Design interactive canvas-based visualizations
doc-coauthoring | Collaborative document co-authoring workflows
frontend-design | Design and build frontend interfaces with modern frameworks
internal-comms | Draft internal communications, memos, and announcements
mcp-builder | Build custom MCP servers from scratch
skill-creator | Create new skills with proper structure and metadata
slack-gif-creator | Create animated GIFs optimized for Slack
theme-factory | Design and generate UI themes with consistent design tokens
web-artifacts-builder | Build interactive web artifacts (mini-apps, widgets, demos)
webapp-testing | Test web applications end-to-end with automated test suites

Superpowers (14)

Meta-skills that enhance how the agent approaches complex tasks. These do not perform actions themselves but guide the agent's reasoning and workflow strategy:

Skill | Description
brainstorming | Structured brainstorming and ideation frameworks
dispatching-parallel-agents | Coordinate multiple agents working in parallel on independent sub-tasks
executing-plans | Execute multi-step plans methodically, tracking progress and adapting to issues
finishing-a-development-branch | Complete and polish a feature branch: tests, linting, commit messages, PR preparation
receiving-code-review | Process and apply code review feedback systematically
requesting-code-review | Prepare and submit code for review with clear context and description
subagent-driven-development | Break complex tasks into sub-agent work units for parallel execution
systematic-debugging | Methodical debugging with hypothesis generation, testing, and root cause analysis
test-driven-development | Write tests first, then implement code to pass those tests
using-git-worktrees | Manage parallel git worktrees for concurrent feature development
using-superpowers | Meta-skill for combining multiple superpowers in a single workflow
verification-before-completion | Verify all work meets requirements before marking a task as complete
writing-plans | Create detailed, structured execution plans before beginning work
writing-skills | Author new skill definitions with proper structure and documentation

Desktop App Skills (20)

Using the Accessibility API on macOS, the agent can interact with native desktop applications directly. Each application has a dedicated skill covering its specific UI elements, menus, and workflows:

Application | Capabilities
Excel | Create/edit spreadsheets, formulas, formatting, charts
Word | Create/edit documents, styles, tables, images
PowerPoint | Create/edit presentations, slides, animations
Chrome | Navigate pages, interact with web content, capture screenshots
Finder | Navigate folders, manage files, organize documents
Outlook | Compose/read emails, manage calendar, contacts
Numbers | Create/edit Apple Numbers spreadsheets
Pages | Create/edit Apple Pages documents
Keynote | Create/edit Apple Keynote presentations
Slack | Send messages, navigate channels, manage workspace
Teams | Send messages, join meetings, manage teams
Zoom | Start/join meetings, manage settings
Mail | Compose/read emails in Apple Mail
Notion | Create/edit pages, databases, and blocks
Terminal | Execute commands, manage terminal sessions
VS Code | Edit files, navigate projects, run tasks
Figma | Create/edit designs, manage components
Acrobat | View/edit PDFs, annotations, form filling
Salesforce | Navigate records, create/edit objects, run reports
SAP | Navigate transactions, enter data, run reports

Design Skills (5)

Skill | Description
taste-skill | Evaluate and refine visual design quality and taste
impeccable | Produce pixel-perfect, polished design output
ui-ux-pro-max | Advanced UI/UX design guidance and best practices
design-audit | Audit designs for consistency, accessibility, and best practices
typography | Typography selection, pairing, and hierarchy guidance

Platform Skills

Skill | Description
credential-store | Manage encrypted credentials programmatically within agent workflows
document-analyzer | Analyze documents and extract structured information
browser-automation | Automate browser interactions via Playwright for testing and scraping
browser-act | Direct browser action control for real-time web interaction
browser-act-skill-forge | Create new browser automation skills from recorded actions
prd | Generate comprehensive product requirements documents
orchestration | Multi-agent orchestration patterns and coordination strategies
github | GitHub repository operations: PRs, issues, reviews, workflows
weasyprint | Generate high-fidelity PDFs from HTML using the WeasyPrint engine
excalidraw | Create Excalidraw diagrams and sketches programmatically
pandoc | Document format conversion between Markdown, HTML, DOCX, PDF, and more

Media and Research Skills (Printing Press Collection)

Over 100 API tools for media generation, research, and data enrichment are available through the Printing Press skill collection. These cover a wide range of capabilities:

Category | Examples
Image Generation | DALL-E, Stable Diffusion, Midjourney-compatible APIs, flux
Video Creation | Video generation from text prompts, video editing APIs
Music Synthesis | AI music generation, sound effects, audio processing
Web Scraping | Structured web data extraction, site crawling
Data Analysis | Statistical analysis tools, data transformation, visualization
Social Media | Twitter/X, LinkedIn, Instagram APIs for posting and analytics
Financial Data | Stock prices, market data, financial news, crypto markets
Weather | Current weather, forecasts, historical weather data
Translation | Multi-language translation APIs
Communication | SMS, push notifications, messaging APIs

Role Skills

Role skills are specialized instruction sets that can be attached to specific roles. They extend a role's capabilities with domain-specific knowledge and behavioral guidelines. You can manage them via Settings > Skills, the REST API (/api/role-skills), or the agent's own tools (platform_create_skill, platform_list_skills). Each role skill contains:

Creating Custom Skills

You can create new skills in two ways:

  1. Through the UI: Go to Settings > Skills, click "New Skill", fill in the name, category, description, and prompt template.
  2. Through the skill-creator skill: Ask the agent to "create a skill for [your use case]" and it will use the skill-creator meta-skill to generate a properly structured skill definition.

Skills can include chain scripts, which are sequences of tool calls that run in order. The run_chain tool executes these scripts, enabling complex multi-step workflows to be packaged as reusable one-click recipes.

MCP Servers

Model Context Protocol (MCP) servers extend the agent's tool capabilities by connecting to external services and data sources.

Transport Types

Transport | Description
stdio | Launches a local process and communicates via stdin/stdout. Used for most MCP servers that run as command-line programs (Node.js, Python, etc.).
HTTP | Connects to a remote MCP server via HTTP with Server-Sent Events for streaming. Used for hosted or cloud-based MCP services.

Configuration (Settings > MCPs)

  1. Go to Settings > MCPs.
  2. Click Add MCP Server.
  3. Enter a descriptive name for the server.
  4. Select the transport type (stdio or HTTP).
  5. For stdio: enter the command to launch the server (for example, npx or python) and its arguments. You can also set environment variables that will be passed to the child process.
  6. For HTTP: enter the server URL. Optionally configure OAuth credentials (client_id and client_secret) or custom headers (Bearer token, API key, or other authentication headers).
  7. Select the isolation mode:
    • shared: A single connection is used across all tasks. This is efficient but means state persists between tasks.
    • per_task: Each task gets its own MCP server instance, providing complete state isolation between tasks.
  8. Click Connect to establish the connection. A status indicator shows whether the server is connected (green) or disconnected (red).

Tool Definition Caching

When an MCP server connects, Aura Workshop caches its tool definitions so that they appear instantly in the tool picker without re-querying the server on each task.

Import from JSON

You can import MCP server configurations from a JSON file. This is useful for sharing configurations across team members or machines. The import format follows the standard MCP configuration schema.
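
As a rough illustration, the sketch below builds an import file in the `mcpServers` layout that many MCP clients use. Whether Aura Workshop expects exactly this layout is an assumption, and the server name, package name, and environment variable are hypothetical placeholders.

```python
import json

def make_stdio_server(command, args, env=None):
    """Describe a stdio MCP server: a launch command, its arguments, optional env vars."""
    entry = {"command": command, "args": list(args)}
    if env:
        entry["env"] = dict(env)
    return entry

# Assumed layout: a top-level "mcpServers" object keyed by server name.
# "example-files" and "example-mcp-server" are placeholders, not real packages.
config = {
    "mcpServers": {
        "example-files": make_stdio_server(
            "npx", ["-y", "example-mcp-server"], {"EXAMPLE_ROOT": "/tmp"}
        )
    }
}

config_json = json.dumps(config, indent=2)
print(config_json)
```

Saving `config_json` to a file gives you something to share with teammates. Check it against a configuration exported from your own instance before relying on the field names.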

Auto-Seeded MCP Servers

Aura Workshop automatically configures certain MCP servers based on your installed tools and API keys:

Server | Trigger | Description
Playwright Browser | Always available | Browser automation: navigate, click, type, screenshot, extract elements
Z.AI Vision | z.ai API key configured | 8 vision tools for image analysis, OCR, and visual understanding
Z.AI Zread | z.ai API key configured | Read and extract content from GitHub repos and documentation
MiniMax Coding Plan | MiniMax API key configured | Web search and image understanding for structured coding plans

Schedules

Schedules let you define tasks that run automatically at specified times. Access the Schedules tab from the Schedules icon in the left sidebar navigation.

Schedule List

The list view shows all configured schedules. Each row displays:

Create / Edit Schedule Form

Listeners (31 Platforms)

Listeners monitor external messaging platforms and trigger agent tasks based on incoming messages and events. Access the Listeners tab from the Listeners icon in the left sidebar.

Listener List

Each row in the list shows:

Supported Platforms (31)

Category | Platforms
Messaging | WhatsApp, Telegram, Discord, Slack, Signal, Matrix, IRC, XMPP, Microsoft Teams, LINE, Facebook Messenger, WeChat, iMessage (macOS), Google Chat, Feishu/Lark
Email | Email (IMAP/SMTP), Gmail (API)
Social | Twitter/X, Mastodon, Bluesky, Reddit, Twitch, Nostr, Zalo
Collaboration | Zulip, Rocket.Chat, Mattermost, Nextcloud Talk, Synology Chat
Built-in | Webchat, Chatbot widget

Create / Edit Listener Form

Listener Detail View

Clicking "View Logs" on a listener opens the detail view, which shows:

Rules Engine

Each listener has a configurable rules engine that filters incoming messages:

Webhooks

Webhooks let you receive HTTP requests from external services and trigger agent tasks. Access the Webhooks tab from the Webhooks icon in the sidebar.

Webhook List

Each row shows:

Create / Edit Webhook Form

Webhook Detail View

Shows the auto-generated webhook URL with a copy button, a table of recent invocation logs (timestamp, request body, response status), and a cURL example for testing.

Slash Commands

Custom slash commands let you define shortcut commands that trigger specific agent behaviors.

Slash Command List

Each row shows the command name (with / prefix), description, handler type, enabled/disabled toggle, and Edit/Delete buttons.

Create / Edit Slash Command Form

Workflows (DAG-Based Automation)

Workflows provide a visual automation system, based on directed acyclic graphs (DAGs), for building complex multi-step pipelines. They are available in the Business and Enterprise tiers.
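
To make the DAG idea concrete, here is a minimal sketch of how a run order can be derived from node dependencies using Python's standard `graphlib` module. The node names and the dependency map are hypothetical, and this is not a description of how Aura Workshop schedules nodes internally.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key is a node, each value is the set of nodes
# that must finish before that node can run.
workflow = {
    "plan": set(),
    "fan_out": {"plan"},
    "build_a": {"fan_out"},
    "build_b": {"fan_out"},
    "merge": {"build_a", "build_b"},
    "approve": {"merge"},
}

def execution_order(graph):
    """Return one valid run order; graphlib raises CycleError if the graph has a cycle."""
    return list(TopologicalSorter(graph).static_order())

order = execution_order(workflow)
print(order)
```

Because the graph is acyclic, every node appears after all of its dependencies. The two build nodes have no ordering constraint between them, which is what lets a fan-out run them in parallel.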

Visual Workflow Editor

The workflow editor is a canvas-based interface where you create and connect nodes to define execution flow. You can:

11 Node Types

Node Type | Description
agent_task | Run a single agent with a prompt and full tool access. Configure the prompt, model, role, and project path.
conditional | Branch execution based on conditions evaluated against data from previous nodes. Supports equality, comparison, and pattern matching.
delay | Pause execution for a specified duration (seconds, minutes, hours).
fan_out | Split work across multiple parallel agents. Define the split criteria and the number of parallel branches.
merge | Combine results from parallel branches. Configurable merge strategies: CopyOnWrite, LastWins, LLMResolve.
human_loop | Pause execution and wait for human approval. Displays a prompt and approve/reject buttons in the UI.
script | Execute a shell script or command. Useful for build steps, deployments, or data transformations.
team | Run a multi-agent team as a single workflow step. Select the team and provide the task prompt.
transform | Transform data between nodes using expressions. Map, filter, or reshape data flowing through the workflow.
validate | Validate data against a schema or set of rules before allowing execution to continue.
webhook | Send or receive an HTTP request as part of the workflow. Useful for triggering external services.

Retry Policies

Each node can be configured with a retry policy that defines what happens when it fails:

Routing Ports

Nodes have routing ports that determine execution flow:

Workflow Features

Workflow Execution Tracking

Each workflow run is tracked with per-step execution status. The workflow run record includes:

View workflow runs from Settings > Workflows or via the /api/workflow/runs/{run_id} endpoint.

Workflow Best Practices

Cloud Providers (Direct API)

Aura Workshop connects directly to the following provider APIs. Each provider is shown as an expandable card on the Models page.

Provider | API Base URL | Models
Anthropic | api.anthropic.com | Claude family (Opus, Sonnet, Haiku) for advanced reasoning, analysis, and code
OpenAI | api.openai.com | GPT series (GPT-4o, o3, o4-mini) for versatile general-purpose tasks
Google | generativelanguage.googleapis.com | Gemini multimodal models with large context windows
MiniMax | api.minimax.io | MiniMax models for text, voice, and video generation

Provider Authentication Details

Provider | Auth Type | Header | Free Tier
Anthropic | API key header | x-api-key: sk-ant-... | No
OpenAI | Bearer token | Authorization: Bearer sk-... | No
Google | Query parameter | ?key=AIza... | Yes (Flash, Flash-Lite)
MiniMax | Bearer token | Authorization: Bearer ... | No
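
The authentication styles in the table can be sketched as a small helper that returns the headers and query parameters for each provider. The lowercase provider identifiers and the function name are hypothetical, and the example key is a placeholder.

```python
def build_auth(provider, api_key):
    """Return (headers, query_params) for a provider, following the table above."""
    if provider == "anthropic":
        return {"x-api-key": api_key}, {}
    if provider in ("openai", "minimax"):
        return {"Authorization": f"Bearer {api_key}"}, {}
    if provider == "google":
        # Google takes the key as a ?key=... query parameter rather than a header.
        return {}, {"key": api_key}
    raise ValueError(f"unknown provider: {provider}")

headers, params = build_auth("google", "AIza-example")
print(headers, params)
```

This covers only the credential itself. Real requests usually need further provider-specific headers; Anthropic's API, for example, also expects an anthropic-version header.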

Configuring a Cloud Provider

  1. Navigate to the Models page by clicking the Models icon in the sidebar.
  2. Click a provider card to expand it.
  3. Enter your API key in the key field.
  4. Select a model from the dropdown (pre-populated with available models for that provider).
  5. Adjust sampling parameters if needed (temperature, top_p, top_k, max_tokens).
  6. Click Save or Apply. The model immediately becomes available in the quick model switcher.

Aggregator Services

Aggregator services provide access to many models through a single API key. All use OpenAI-compatible APIs with Bearer token authentication.

Aggregator | Highlights
OpenRouter | Unified API for 200+ models with automatic fallback. Many free model options available.
Together AI | Fast inference for open-source models: Llama, Qwen, DeepSeek, Mistral, Gemma.
Groq | Ultra-fast LPU inference with sub-second latency. Free tier with rate limits.
DeepSeek | High-performance reasoning and coding at low cost.
SiliconFlow | Cost-effective GPU cloud inference for Qwen and DeepSeek models.
Zhipu AI (z.ai) | GLM models with free tiers available. Auto-seeds MCP servers for vision and code reading.
Xiaomi / MiMo | Xiaomi's MiMo models for general tasks.
Moonshot / Kimi | Multilingual, long-context models optimized for Asian languages.
Mistral AI | European models strong in coding and multilingual tasks.

Local Providers

Run models on your own hardware with zero API cost. Aura Workshop auto-detects local inference servers on standard ports.

Provider | Default Port | Description
Aura AI | 8080 | Bundled inference engine (Go + llama.cpp) with Metal/CUDA/CPU support.
Ollama | 11434 | Popular local model runner with a simple pull-and-run workflow.
LM Studio | 1234 | Desktop app for running local models with an OpenAI-compatible API.
LocalAI | varies | Self-hosted AI inference with OpenAI-compatible endpoints.
vLLM | varies | High-throughput LLM serving with PagedAttention for production workloads.
TGI | varies | HuggingFace Text Generation Inference server.
SGLang | varies | Fast serving framework for large language models.
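
One simple way to approximate this kind of auto-detection is to try a TCP connection to each default port. The sketch below does that with the standard library. It is an assumption about the general approach, not the detection logic Aura Workshop actually uses, and the function names are hypothetical.

```python
import socket

# Default ports from the table above; providers listed as "varies" are omitted.
DEFAULT_PORTS = {"Aura AI": 8080, "Ollama": 11434, "LM Studio": 1234}

def is_port_open(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def detect_local_providers(host="127.0.0.1", ports=DEFAULT_PORTS):
    """Return the names of providers whose default port accepts connections."""
    return [name for name, port in ports.items() if is_port_open(host, port)]

print(detect_local_providers())
```

An open port only shows that something is listening there. A fuller check would follow up with a request to the server's API to confirm which service answered.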

Custom Providers

Add any OpenAI-compatible endpoint as a custom provider. This allows you to connect to any inference service that speaks the OpenAI API format.

Aura AI (Built-in Local Inference)

Aura Workshop bundles the aura-inference engine (Go + llama.cpp) for running GGUF models locally. It supports Metal acceleration on macOS, CUDA on Linux and Windows, and CPU fallback. Zero API cost: everything runs entirely on your hardware.

Start / Stop Controls

On the Models page, the Aura AI section provides a Launch button to start the local inference server and a Stop button to shut it down. When running, the status indicator turns green.

Port Configuration

The default port is 8080. You can change it in the Aura AI settings panel. Make sure the chosen port is not in use by another application.

Model Selector

Click Scan to detect GGUF models in ~/.cache/huggingface/hub/ and ~/.cache/aura-inference/models/. Select a model from the dropdown. Both HuggingFace-cached and directly downloaded GGUF files are discovered.

Advanced Parameters

Parameter | Description | Default
GPU Layers | Number of model layers offloaded to GPU. Set to -1 to offload all layers. | -1
Context Size | Token context window size. | 4096
Batch Size | Inference batch size for throughput optimization. | 512
Flash Attention | Enable flash attention for faster inference on supported hardware. | Off
KV Cache Key Type | Data type for the key cache: q8_0 (quantized) or f16 (full precision). | q8_0
KV Cache Value Type | Data type for the value cache: q8_0 or f16. | q8_0
Thinking Mode | Enable extended reasoning/chain-of-thought for local models that support it. | Off

HuggingFace Model Downloader

Download GGUF models directly from HuggingFace without leaving the app. Nine curated models are available with one-click download:

Model | Size | Best For
Qwen3.5 9B | ~5.8 GB | Best balance of quality and speed for general tasks
Qwen3 Coder 8B | ~5 GB | Code generation and programming tasks
Qwen3 VL 8B | ~5 GB | Vision + language multimodal tasks (image analysis)
Llama 3.3 70B | ~42 GB | Top-tier quality for complex reasoning (requires significant RAM/VRAM)
Llama 3.1 8B | ~4.9 GB | Reliable tool calling and function execution
Mistral Small 24B | ~14 GB | Multilingual tasks with vision capabilities
Phi-4 14B | ~8.4 GB | Strong reasoning and math
Gemma 4 E4B | ~5 GB | Compact vision model for image understanding
Gemma 4 27B-A4B | ~16.8 GB | Mixture of Experts model with excellent efficiency

Curated Model Table

Each model row in the table has:

Custom Model Download

Enter any HuggingFace repo ID (for example, Owner/ModelName-GGUF) in the custom download input. The app auto-resolves GGUF repos and downloads Q4_K_M quantization by default. Provide a HuggingFace token for gated models. Downloads are verified against SHA2 checksums.
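
For readers who want to double-check a downloaded file by hand, the sketch below hashes a file in chunks and compares the result with an expected digest. It assumes SHA-256, a member of the SHA-2 family; the source does not say which variant the app uses. The demo file is a small stand-in for a real model.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so multi-gigabyte models never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_download(path, expected_hex):
    """Return True if the file's SHA-256 matches the expected hex digest."""
    return sha256_of_file(path) == expected_hex.lower()

# Demo on a small temporary file standing in for a downloaded GGUF model.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name
expected = hashlib.sha256(b"hello").hexdigest()
demo_ok = verify_download(demo_path, expected)
demo_bad = verify_download(demo_path, "0" * 64)
os.remove(demo_path)
print(demo_ok, demo_bad)
```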

Ollama Integration

If Ollama is running locally on http://localhost:11434, Aura Workshop auto-detects it and lists available models. No API key is required.

Pull Input

On the Models page under the Ollama section, enter a model name (for example, llama3.1) in the pull input field and click the Pull button. Ollama downloads the model and makes it available immediately.

List and Delete

The Ollama section shows a list of all locally available Ollama models. Each model has a Delete button to remove it.

Inference Cluster (LAN GPU Sharing)

The inference cluster feature enables distributed GPU inference across multiple machines on your local network. Combine the GPU resources of several computers to run larger models than any single machine could handle.

Discovery

Aura Workshop nodes broadcast their presence via UDP on port 18801 every 30 seconds. Nodes that have not broadcast for 90 seconds are considered expired and removed from the cluster view. LAN discovery can be toggled with the lan_discovery_enabled setting.
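
The timing rules above can be sketched as a small registry that records when each node was last heard from and prunes the silent ones. The class and node names are hypothetical, and treating exactly 90 seconds of silence as expired is an assumption about the boundary.

```python
BROADCAST_INTERVAL_S = 30  # nodes announce themselves this often
EXPIRY_S = 90              # silent for this long: removed from the cluster view

class NodeRegistry:
    """Track the last time each node was heard from."""

    def __init__(self):
        self.last_seen = {}

    def heard_from(self, node_id, now):
        """Record a broadcast received from node_id at time `now` (seconds)."""
        self.last_seen[node_id] = now

    def prune(self, now):
        """Drop nodes silent for at least EXPIRY_S seconds; return their ids."""
        expired = [n for n, t in self.last_seen.items() if now - t >= EXPIRY_S]
        for n in expired:
            del self.last_seen[n]
        return expired

registry = NodeRegistry()
registry.heard_from("gpu-box-1", now=0)
registry.heard_from("gpu-box-2", now=60)
print(registry.prune(now=95))  # gpu-box-1 has been silent for 95 s, gpu-box-2 for 35 s
```

With a 30-second interval and a 90-second expiry, a node has to miss about three consecutive broadcasts before it disappears, which tolerates the occasional dropped UDP packet.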

Roles

Worker Claiming

The master discovers workers via UDP broadcast. To claim a worker, the master sends an HTTP POST to the worker's /api/cluster/join endpoint. The worker stores the master's information and starts its RPC server. The claiming process works as follows:

  1. The master machine's Models page shows a "LAN Nodes" section listing all discovered workers on the network.
  2. Each discovered worker shows its hostname, IP address, available GPU resources, and current state.
  3. Click the Add button next to a discovered worker to claim it.
  4. The master sends a claim request to the worker. The worker accepts and starts its RPC server on port 50052.
  5. Once claimed, the worker appears in the "Claimed Workers" list.
  6. When you start inference, the master distributes model layers across all claimed workers based on their GPU capabilities.
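
The source does not document how layers are divided among workers. As one plausible illustration, the sketch below splits layers in proportion to each machine's VRAM and hands out leftover layers by largest remainder. The worker names, VRAM figures, and the proportional rule itself are all assumptions.

```python
def split_layers(total_layers, vram_gb_by_worker):
    """Assign layers proportionally to VRAM, giving leftovers to the largest remainders."""
    total_vram = sum(vram_gb_by_worker.values())
    if total_vram <= 0:
        raise ValueError("at least one worker must report VRAM")
    # Exact (fractional) share for each worker.
    exact = {w: total_layers * v / total_vram for w, v in vram_gb_by_worker.items()}
    # Start from the whole-number part of each share.
    shares = {w: int(x) for w, x in exact.items()}
    leftover = total_layers - sum(shares.values())
    # Hand the remaining layers to the workers with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda w: exact[w] - shares[w], reverse=True)
    for w in by_remainder[:leftover]:
        shares[w] += 1
    return shares

print(split_layers(32, {"master": 24, "worker-a": 16, "worker-b": 8}))
```

The largest-remainder step guarantees that the per-worker counts always add up to the model's layer count, which plain rounding does not.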

Workshop Modes

Each Aura Workshop instance can operate in one of three cluster modes, configured in the Models page:

Parameter Control

Cluster inference supports the same parameters as standalone Aura AI: quantization, context size, batch size, GPU layers, and flash attention. Parameters are set on the master and applied across the cluster.

Docker Cluster Deployment

# Master node
docker run -d --net=host --gpus all \
  -e AURA_DAEMON_MODE=inference-master \
  coolkoo/aura-workshop:daemon-latest \
  --mode inference-master

# Worker node
docker run -d --net=host --gpus all \
  -e AURA_DAEMON_MODE=inference-worker \
  coolkoo/aura-workshop:daemon-latest \
  --mode inference-worker

Use --net=host for UDP broadcast discovery and --gpus all for NVIDIA GPU access.

Quick Model Switching

Click the model name displayed in the top bar at any time to open a dropdown of all configured and available models. The dropdown groups models by provider for easy navigation. Selecting a model instantly switches the active model for all subsequent new tasks. Models from cloud providers, aggregators, local inference, and custom providers are all listed together. A search field at the top of the dropdown lets you filter models by name.

Model Auto-Detection

When you configure a provider's API key, Aura Workshop automatically queries the provider to discover available models. This means the model selector always shows current, accurate model options without manual configuration. For local providers (Ollama, Aura AI), clicking Scan refreshes the local model list.

Model Parameters

Each model can be configured with sampling parameters that affect output quality and style:

Parameter | Default | Description
Temperature | 0.7 | Controls randomness. Lower values produce more focused output; higher values are more creative.
Top P | 0.8 | Nucleus sampling threshold. Sampling is restricted to the smallest set of most likely tokens whose cumulative probability reaches this threshold.
Top K | 20 | Only the top K most likely tokens are considered at each step. 0 disables Top K filtering.
Min P | 0.0 | Minimum probability threshold, typically applied relative to the probability of the most likely token. Tokens below it are discarded.
Repeat Penalty | 1.0 | Penalty applied to repeated tokens. Values above 1.0 discourage repetition.
Max Tokens | 4096 | Maximum number of tokens the model can generate in a single response.
Thinking Level | Off | Extended reasoning depth: Off, Low, Medium, High. Only available on models that support thinking.
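
To show how these sampling parameters interact, here is a minimal pure-Python sketch that filters a toy token distribution. The order of the filters and the treatment of Min P as relative to the most likely token (as llama.cpp does) are assumptions; individual providers may apply them differently. The token names and logits are made up.

```python
import math

def filter_distribution(logits, temperature=0.7, top_k=20, top_p=0.8, min_p=0.0):
    """Return {token: probability} after temperature, Top K, Top P, and Min P filtering."""
    # Temperature (must be > 0) rescales logits; lower values sharpen the distribution.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(weights.values())
    probs = sorted(((tok, w / total) for tok, w in weights.items()),
                   key=lambda item: item[1], reverse=True)

    # Top K: keep only the K most likely tokens (0 disables the filter).
    if top_k > 0:
        probs = probs[:top_k]

    # Top P: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in probs:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Min P: drop tokens below min_p times the probability of the most likely token.
    floor = min_p * kept[0][1]
    kept = [(tok, p) for tok, p in kept if p >= floor]

    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(p for _, p in kept)
    return {tok: p / norm for tok, p in kept}

toy_logits = {"the": 2.0, "a": 1.0, "cat": 0.0, "xylophone": -3.0}
print(filter_distribution(toy_logits))
```

The model then samples the next token from the returned distribution. Tightening any one filter shrinks the candidate set, which is why aggressive settings on several parameters at once can make output repetitive.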

Smart Routing

Smart routing automatically selects the most cost-effective model for each task based on its complexity.

Enable / Disable

Toggle smart routing in Settings > Routing. When disabled, all tasks use the globally selected model.

Tier Configuration

Four routing tiers are available, each with its own model selector and boundary score:

Tier | Intended Use
Simple | Quick questions, translations, simple formatting. Routed to the cheapest model.
Standard | Moderate tasks: writing, analysis, single-step coding.
Complex | Multi-step tasks: architecture, debugging, research. Routed to a capable model.
Reasoning | Tasks requiring deep reasoning, math, or extended chain-of-thought. Routed to the most powerful model.

Free Models Only Toggle

A toggle restricts routing to free models only (such as those available on OpenRouter or Groq free tier).

Analytics View

The routing analytics section shows detailed statistics about smart routing performance:

The analytics update in real time as new tasks are routed. To clear the analytics data, use the billing reset button.

Settings: General

Context Compression

Aura Workshop automatically manages context window limits to prevent long-running tasks from failing due to token limits:

  1. Threshold: Compaction triggers when context usage reaches 70% of the active model's context window.
  2. LLM summarization: Older conversation history is summarized by the LLM into a condensed form.
  3. Pair preservation: tool_use and tool_result message pairs are always kept intact during compaction.
  4. Truncation fallback: If summarization fails, the system falls back to mechanical truncation of the oldest messages.
  5. Circuit breaker: After 3 consecutive compaction failures, the system switches to permanent truncation mode to ensure the task can continue.
  6. Persistent state: The compressed state is saved to the task_memory database table, so progress survives crashes and can be resumed.
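
The control flow in the steps above can be sketched as follows. This is a simplified illustration, not the actual implementation: the class name, the `keep_recent` parameter, and the representation of messages as list items are hypothetical, and pair preservation and database persistence are left out.

```python
COMPACTION_THRESHOLD = 0.70  # fraction of the context window that triggers compaction
MAX_FAILURES = 3             # consecutive failures before switching to truncation only

class Compactor:
    def __init__(self, summarize):
        self.summarize = summarize  # callable: list of old messages -> one summary message
        self.failures = 0
        self.truncate_only = False

    def compact(self, messages, used_tokens, window_tokens, keep_recent=4):
        """Return a shorter message list once usage reaches the threshold."""
        if used_tokens < COMPACTION_THRESHOLD * window_tokens:
            return messages
        old, recent = messages[:-keep_recent], messages[-keep_recent:]
        if not old:
            return messages
        if not self.truncate_only:
            try:
                summary = self.summarize(old)
                self.failures = 0
                return [summary] + recent
            except Exception:
                self.failures += 1
                if self.failures >= MAX_FAILURES:
                    # Circuit breaker: stop attempting summarization from now on.
                    self.truncate_only = True
        # Fallback: mechanical truncation of the oldest messages.
        return recent
```

A real implementation would also need to choose the cut point so that a tool_use message is never separated from its tool_result.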

Crash Safety and Task Resume

The full conversation transcript is persisted to the SQLite database before every API call. If the application crashes or is force-quit, no conversation data is lost. Key features:

Provider Health and Circuit Breaker

Provider health is tracked passively from the result of each API request, so no extra health-check requests are sent. If a provider fails 5 consecutive times, the circuit breaker opens and the agent automatically switches to the next provider in the fallback order. After 60 seconds, a single probe request is allowed through to test whether the provider has recovered. You can view provider health status in Settings > Billing.
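
A circuit breaker with these thresholds might look roughly like the sketch below. The class and method names are hypothetical, and re-opening the circuit after a failed probe is an assumption that the source does not spell out.

```python
FAILURE_THRESHOLD = 5   # consecutive failures before the circuit opens
RECOVERY_SECONDS = 60   # wait before allowing a single probe request

class CircuitBreaker:
    def __init__(self):
        self.consecutive_failures = 0
        self.opened_at = None        # None means the circuit is closed
        self.probe_in_flight = False

    def allow_request(self, now):
        """Closed: always allow. Open: allow one probe after the recovery window."""
        if self.opened_at is None:
            return True
        if now - self.opened_at >= RECOVERY_SECONDS and not self.probe_in_flight:
            self.probe_in_flight = True
            return True
        return False

    def record_success(self):
        """Any success closes the circuit and resets the failure count."""
        self.consecutive_failures = 0
        self.opened_at = None
        self.probe_in_flight = False

    def record_failure(self, now):
        """Count the failure; open (or re-open) the circuit at the threshold."""
        self.consecutive_failures += 1
        self.probe_in_flight = False
        if self.consecutive_failures >= FAILURE_THRESHOLD:
            self.opened_at = now
```

The caller checks `allow_request` before each API call and reports the outcome afterwards, so health tracking rides along with normal traffic.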

Execution Modes

Mode | Description
Native | Commands run directly on your host OS via sh -c (macOS/Linux) or cmd /C (Windows). Default when Docker is not available. No container overhead.
Docker | Commands run in isolated Docker containers. Provides sandboxing and dependency isolation at the cost of startup latency.

Execution Backend

The execution_backend setting determines where agent commands physically run:

Settings: Security

Settings: Connectivity

Setting | Description | Default
Web Server Enabled | Toggle the embedded HTTP server on or off | On
Port | HTTP port for the web server | 18800
Auth Token | Optional Bearer token for remote access authentication. When set, all API requests must include this token. | Empty (no auth)

Environment variable overrides: AURA_WEB_ENABLED, AURA_WEB_PORT, AURA_WEB_TOKEN.

Settings: Integrations

Settings: Skills

Settings: Plugins

Plugins extend Aura Workshop with additional capabilities beyond the built-in features.

Plugins are managed via the REST API at /api/plugins for programmatic control.

Settings: Design

The Design settings control visual aspects used by the design skills and generated artifacts.

Design tokens are available to agents via the /api/design-systems endpoint and are used by the design skills (taste-skill, impeccable, ui-ux-pro-max, etc.) when generating UI artifacts.

Settings: MCPs

Manage MCP server connections (see the MCP Servers section for full details). This tab provides the same interface described there: add, edit, delete, connect, disconnect, configure isolation mode, set OAuth credentials, and import from JSON.

Settings: Data Management

Action | Description
Clear conversation history | Delete all task messages and conversation data from the database. Tasks themselves are preserved.
Reset API keys | Remove all stored provider API keys from the encrypted credential store.
Clear model cache | Remove cached model metadata and force re-fetching from providers.
Reset database | Full database reset to factory defaults. All settings, tasks, conversations, and configurations are deleted.
Reset app data | Complete application reset including all settings, database, downloaded models, and cached files.
Diagnostics | View system information (OS, architecture, memory, disk), database statistics (table counts, sizes), and dependency status.

Each destructive action requires confirmation before execution.

Settings: Teams

Settings: Roles & Prompts

Settings: Workflows

The visual workflow editor provides the canvas-based interface described in the Workflows section. From this settings tab you can:

Settings: Credentials

Settings: Cloud Storage

Connected cloud storage is used by agents for storing generated files, backups, and shared artifacts.

Settings: Memory

Settings: Routing

Settings: Billing & Usage

Settings: Commands

Settings: Models

Settings: Updates

Spend Tracking & Billing

Spend Summary Cards

At the top of the Billing tab, four summary cards provide at-a-glance metrics:

Daily Usage Chart

An interactive bar chart visualizes daily spend over the last 14 days. Hover over any bar to see the exact cost for that day. Below the bar chart, area charts show per-model daily usage so you can identify which models are driving costs.

Model Usage Breakdown Table

A detailed table lists every model used during the billing period, with columns for model name, provider, input tokens, output tokens, number of requests, and total cost.

Spend Limits

Set a maximum monthly spend for each provider. The system enforces the limit by automatically switching to the next provider in the fallback order when a limit is reached. You can also set alert thresholds that trigger a notification before the hard limit is hit.

Provider Fallback Order

Define the priority sequence for provider switching. Drag and drop providers to reorder them. When the primary provider hits its spend limit, encounters rate limits (HTTP 429), or returns server errors (HTTP 503), the system transparently switches to the next provider in the list.
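
The switching rule can be sketched as a function that walks the fallback order and skips providers that are over their spend limit or have just failed. The function names, provider identifiers, and data shapes are hypothetical.

```python
RETRYABLE_STATUS = {429, 503}  # rate limited, service unavailable

def should_fall_back(status_code):
    """True for the HTTP statuses that trigger a switch to the next provider."""
    return status_code in RETRYABLE_STATUS

def pick_provider(fallback_order, spend, limits, failed=()):
    """Return the first provider that is under its spend limit and has not just failed."""
    for name in fallback_order:
        limit = limits.get(name)  # None means no limit is configured
        over_limit = limit is not None and spend.get(name, 0.0) >= limit
        if name not in failed and not over_limit:
            return name
    return None  # every provider is exhausted

order = ["anthropic", "openai", "deepseek"]
print(pick_provider(order, spend={"anthropic": 50.0}, limits={"anthropic": 50.0}))
```

Returning `None` when the list is exhausted leaves it to the caller to decide whether to queue the task or surface an error.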

Provider Pricing Override

The model pricing table shows input and output cost per million tokens for every model. These values are pre-seeded with current market rates but can be edited manually. A "Reset to Defaults" button restores the original pricing data.
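
Because prices are quoted per million tokens, the cost of a single request follows from simple arithmetic. The sketch below uses made-up prices purely for illustration.

```python
def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Cost of one request, given prices quoted per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Hypothetical prices: 3.00 per million input tokens, 15.00 per million output tokens.
print(request_cost(12_000, 800, 3.00, 15.00))
```

With these example numbers, 12,000 input tokens contribute 0.036 and 800 output tokens contribute 0.012, for a total of 0.048 in whatever currency the pricing table uses.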

Usage Stats in Context Panel

During task execution, the Context Panel (right sidebar) shows real-time usage: Context Usage percentage (with color-coded bar), Input Tokens, Output Tokens, Cache Read tokens, Total Cost, Latency, Model name, and Provider name.

Web UI (Browser Access)

Aura Workshop includes an embedded axum HTTP server that serves the full SolidJS frontend to any web browser. The default port is 18800.

Enabling the Web Server

  1. Go to Settings > Connectivity.
  2. Toggle Web UI Server on.
  3. Set the port (default: 18800).
  4. Optionally set a Bearer token for authentication.
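  5. Open http://localhost:18800 in a browser on the same machine. To connect from another device on the network, replace localhost with the host machine's IP address or hostname.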

Authentication

When a Bearer token is configured, all API requests must include Authorization: Bearer your-secret-token in the headers. The /api/health and /api/heartbeat/incoming endpoints are always public and do not require authentication.
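
The rule above can be expressed as a small server-side check. This is only a sketch of the described behavior: the function name and the /api/tasks path used in the usage note are hypothetical, and the real server may implement the check differently.

```python
import hmac

# Endpoints the documentation describes as always public.
PUBLIC_PATHS = {"/api/health", "/api/heartbeat/incoming"}

def is_authorized(path, headers, configured_token):
    """Decide whether a request may proceed under the rules described above."""
    # No token configured means authentication is off; public paths never need one.
    if not configured_token or path in PUBLIC_PATHS:
        return True
    supplied = headers.get("Authorization", "")
    expected = f"Bearer {configured_token}"
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(supplied.encode(), expected.encode())

print(is_authorized("/api/health", {}, "your-secret-token"))
```

A client calling a protected path such as the hypothetical /api/tasks would pass `{"Authorization": "Bearer your-secret-token"}` as its headers.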

Environment Variable Configuration

export AURA_WEB_ENABLED=true
export AURA_WEB_PORT=18800
export AURA_WEB_TOKEN=your-secret-token

Feature Parity

The browser-based UI has full feature parity with the desktop app: all navigation views, real-time streaming via SSE, the complete REST API, file upload, folder selection, voice input, and all settings tabs.

Remote Agent Deployment

Deploy AI agents to remote machines via SSH for always-on, headless operation.

How Deployment Works

  1. Go to Settings > Agents or use the deploy_remote tool.
  2. Provide SSH connection details: hostname, username, and authentication method (key file, password, or saved credential).
  3. Select the deployment mode: Full Agent, Inference Master, Inference Worker, or Worker.
  4. Aura Workshop connects via SSH, detects the remote OS, auto-installs Docker if needed, pulls the Docker image, and starts the daemon container with --net=host.

SSH Authentication Methods

Method | Description
Key file | Path to an SSH private key file (RSA, Ed25519, etc.). The most secure and recommended method.
Password | Username/password authentication. Uses sshpass for non-interactive authentication.
Saved credential | References a credential from the encrypted credential store by ID. Keys are decrypted at deployment time.

Deployment Modes

Mode | Description
Full Agent | Complete daemon with tasks, inference, listeners, webhooks, and workflows.
Inference Master | Manages an inference cluster: models, workers, serves the inference API.
Inference Worker | Broadcasts on LAN, joins a master's cluster, contributes GPU resources via RPC.
Worker | Task execution worker that uses a master's inference endpoint for model access.

Monitoring

Deploying Automation to Remote Agents

Schedules, listeners, and webhooks can be deployed to remote machines. When you create an automation item and select a target agent, the configuration is synced to the remote deployment. Deleting an automation item also cleans up its remote deployment automatically.

Docker Daemon

The aura-daemon binary provides headless operation for Docker deployments.

Repository and Tags

4 Daemon Modes

Mode | Flag | Purpose
full | --mode full | Complete daemon: tasks, inference, listeners, webhooks, workflows, and full REST API
inference-master | --mode inference-master | Manages the inference cluster: model distribution, worker coordination, serves the inference API
inference-worker | --mode inference-worker | Broadcasts on LAN, joins a master's cluster, contributes GPU resources via RPC
worker | --mode worker | Task execution worker that uses a master's inference endpoint for model access

Running the Daemon

docker run -d \
  --name aura-daemon \
  --net=host \
  -v ~/.aura:/root/.aura \
  -v "$(pwd)/data:/data" \
  -e AURA_WEB_ENABLED=true \
  -e AURA_WEB_PORT=18800 \
  -e AURA_API_KEY=your-api-key \
  -e AURA_MODEL=deepseek-chat \
  -e AURA_BASE_URL=https://api.deepseek.com \
  coolkoo/aura-workshop:daemon-latest \
  --mode full

Environment Variables

Variable | Description | Default
AURA_WEB_ENABLED | Enable the embedded web server | true
AURA_WEB_PORT | Web server port | 18800
AURA_WEB_TOKEN | Bearer token for API authentication | (none)
AURA_DB_PATH | SQLite database file path | /data/aura-workshop.db
AURA_API_KEY | API key for the LLM provider | (none)
AURA_MODEL | Model identifier to use | (none)
AURA_BASE_URL | Base URL for the LLM provider API | (none)
AURA_NATIVE_MODE | Use native mode instead of Docker-in-Docker | false
AURA_REMOTE_DEPLOYMENT | Mark as a remote deployment | false
AURA_VIEWER_MODE | Serve the viewer SPA instead of the full UI | false
AURA_PAIRING_CODE | One-time pairing code for desktop app connection | (none)
AURA_HEARTBEAT_URL | URL to send heartbeats to the master instance | (none)
AURA_DEPLOYMENT_ID | Unique identifier for this deployment | (none)
AURA_DAEMON_MODE | Daemon operating mode | full
AURA_RPC_PORT | RPC port for distributed inference | 50052
AURA_MASTER_INFERENCE_URL | URL of the master's inference API (worker mode) | (none)
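
To make the defaults in the table concrete, here is a minimal Python sketch of how a launcher script might resolve these variables before starting the daemon. The helper name resolve_config, the returned keys, and the way booleans are parsed are assumptions for illustration; the daemon's own parsing may differ.

```python
import os
from typing import Mapping, Optional

# Defaults copied from the environment variable table; variables whose
# documented default is "(none)" are read separately below.
DEFAULTS = {
    "AURA_WEB_ENABLED": "true",
    "AURA_WEB_PORT": "18800",
    "AURA_DB_PATH": "/data/aura-workshop.db",
    "AURA_DAEMON_MODE": "full",
    "AURA_RPC_PORT": "50052",
}

VALID_MODES = {"full", "inference-master", "inference-worker", "worker"}


def resolve_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Merge documented defaults with an environment mapping (hypothetical helper)."""
    env = os.environ if env is None else env
    merged = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    mode = merged["AURA_DAEMON_MODE"]
    if mode not in VALID_MODES:
        raise ValueError(f"unknown daemon mode: {mode!r}")
    return {
        # Assumption: the literal string "true" (any case) enables the flag.
        "web_enabled": merged["AURA_WEB_ENABLED"].lower() == "true",
        "web_port": int(merged["AURA_WEB_PORT"]),
        "db_path": merged["AURA_DB_PATH"],
        "mode": mode,
        "rpc_port": int(merged["AURA_RPC_PORT"]),
        "web_token": env.get("AURA_WEB_TOKEN"),  # None when unset
    }
```

Calling resolve_config({}) returns the documented defaults, which is a quick way to check a deployment script against the table.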

Health Check

The Docker image includes a built-in health check:

HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
    CMD curl -f http://localhost:18800/api/health || exit 1

What Is Included in the Image

Volume Mounts

The recommended volume mounts for a full daemon deployment:

docker run -d \
  --name aura-daemon \
  --net=host \
  -v ~/.aura:/root/.aura \
  -v $(pwd)/data:/data \
  -v $(pwd)/models:/root/.cache/aura-inference/models \
  coolkoo/aura-workshop:daemon-latest \
  --mode full

Host Path | Container Path | Purpose
~/.aura | /root/.aura | User memory files, custom roles, CLI configuration
./data | /data | SQLite database, charts, generated files
./models | /root/.cache/aura-inference/models | Downloaded GGUF model files

GPU Passthrough

For NVIDIA GPUs on Linux, use the --gpus all flag to pass through GPU resources:

docker run -d --net=host --gpus all \
  -v $(pwd)/models:/root/.cache/aura-inference/models \
  -v $(pwd)/data:/data \
  coolkoo/aura-workshop:daemon-latest \
  --mode inference-master

Ensure the NVIDIA Container Toolkit is installed on the host. On macOS with Colima, GPU passthrough requires krunkit.

Auto Docker Installation for Remote Deployment

When deploying to remote machines via SSH, Aura Workshop can auto-install Docker:

Headless Mode

For environments without a display server (headless Linux servers), you can run Aura Workshop without a GUI:

xvfb-run aura-workshop

The application starts without rendering a window but serves the full web UI for browser-based access. For production headless deployments, the Docker daemon is the recommended approach.

Headless mode is configurable via settings: when the headless flag is enabled, the application skips all GUI initialization and runs purely as a server process. All features remain available through the REST API and web UI.

Security Architecture

Encryption at Rest

Authentication

Webhook Security

Incoming webhook requests can be validated using HMAC-SHA256 signatures against a shared secret. Invalid signatures are rejected before processing.
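
As an illustration of this kind of check, the sketch below signs and verifies a request body with HMAC-SHA256 using Python's standard library. The sha256=<hex> header format and the function names are assumptions for the example, not a statement of how Aura Workshop encodes its signatures.

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Return a 'sha256=<hex digest>' signature for the raw request body."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_signature(secret: str, body: bytes, header_value: str) -> bool:
    """Compare the received signature with the expected one in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))
```

The signature must be computed over the exact bytes that were sent; re-serializing the JSON before hashing usually changes whitespace and breaks the comparison.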

Download Verification

Model downloads from HuggingFace are verified against SHA-2 checksums to detect tampering or corruption.
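
A checksum comparison of this kind can be reproduced by hand. The sketch below assumes SHA-256 (one member of the SHA-2 family) and a published hex digest; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_hex: str) -> bool:
    """Return True when the file's digest matches the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```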

Command Safety Guardrails

The agent middleware blocks dangerous shell commands automatically and unconditionally. Fork bombs, recursive root deletion, disk formatting, and other destructive patterns are always blocked regardless of configuration. These guardrails cannot be disabled.
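
The idea of pattern-based blocking can be sketched in a few lines of Python. The three regular expressions below are hypothetical examples of the categories named above; they are not Aura Workshop's actual rule set, which is not documented here.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration only.
BLOCKED_PATTERNS = [
    ("fork bomb", re.compile(r":\(\)\s*\{\s*:\s*\|\s*:\s*&\s*\}\s*;\s*:")),
    # rm with both -r and -f flags, aimed at "/" itself (or "/*").
    ("recursive root deletion",
     re.compile(r"\brm\s+-(?=[a-zA-Z]*r)(?=[a-zA-Z]*f)[a-zA-Z]+\s+/(\s|\*|$)")),
    ("disk formatting", re.compile(r"\bmkfs(\.\w+)?\s")),
]


def blocked_reason(command: str) -> Optional[str]:
    """Return the label of the first matching pattern, or None if none match."""
    for label, pattern in BLOCKED_PATTERNS:
        if pattern.search(command):
            return label
    return None
```

A real guardrail would need to handle quoting, variable expansion, and command chaining, which simple regular expressions do not cover.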

Zero Telemetry

Aura Workshop sends zero telemetry and includes no analytics tracking. All data remains on your machine. The application is fully functional in air-gapped environments when paired with local models.

Local-First Architecture

All data is stored in a local SQLite database. There is no cloud sync, no remote storage of your conversations, and no data leaving your machine except the LLM API calls you explicitly configure. When using local inference (Aura AI or Ollama), no data leaves your network at all.

Execution Isolation

Three isolation levels are available for agent commands:

Chrome Extension

Aura Workshop includes a Chrome extension that provides a side panel chat interface and browser automation capabilities.

Side Panel

Context Menus

Right-click on any text, link, or page element to access context menu options for quick agent invocation. For example, you can right-click selected text and choose "Ask Aura about this" to send it directly to the agent with the page context.

Screenshot Capture

The extension can capture screenshots of the current browser tab and send them to the agent for visual analysis.

Browser Automation via Agents

Through the /ws/browser WebSocket connection, agents can control the browser programmatically using the browser_action tool. Supported actions include:

Setup

  1. Load the extension from the extension/ directory in Chrome's developer mode (navigate to chrome://extensions, enable Developer mode, and click "Load unpacked").
  2. Click the extension icon and configure it to point to your Aura Workshop web server URL (for example, http://localhost:18800).
  3. If authentication is enabled on your server, enter the Bearer token in the extension settings.
  4. The extension icon turns green when successfully connected to the server.

Embeddable Chat Widget

Embed an Aura Workshop-powered chat interface on any website with a simple JavaScript snippet.

<script src="http://your-server:18800/widget.js"></script>
<script>
  AuraWidget.init({
    serverUrl: "http://your-server:18800",
    token: "your-auth-token",      // optional
    position: "bottom-right",       // bottom-right or bottom-left
    title: "AI Assistant",          // widget title
    greeting: "How can I help?",    // initial greeting message
    theme: "dark"                   // dark or light
  });
</script>

Features

Voice & TTS

Speech-to-Text Providers

Provider | Description | Requirements
Whisper | OpenAI's Whisper model for transcription | OpenAI API key
System | Operating system's built-in speech recognition | None
Groq | Ultra-fast transcription via Groq LPU | Groq API key
OpenAI | OpenAI's transcription API | OpenAI API key
xAI | xAI's transcription service | xAI API key

Text-to-Speech Providers

Provider | Description | Requirements
System | OS built-in TTS (macOS say command, Windows SAPI) | None
OpenAI TTS | High-quality neural text-to-speech via OpenAI API | OpenAI API key
ElevenLabs | Premium voice synthesis with a wide selection of voices | ElevenLabs API key

Voice Input Button

The Voice button in the task composer activates the microphone. A pulsing red dot appears while recording is in progress. Speak your task description; the audio is transcribed by the configured speech-to-text provider and the resulting text is inserted into the text area.

Configurable Voices and Speech Rate

In Settings > Integrations, select the desired voice from the provider's available options and adjust the speech rate to your preference.

Voice Configuration Details

Setting | Description | Options
TTS Enabled | Master toggle for text-to-speech on agent responses | On/Off
TTS Provider | Which service generates the speech audio | System, OpenAI, ElevenLabs
Voice | Specific voice to use (depends on provider) | Provider-specific list
Speech Rate | Speed of the generated speech | 0.5x to 2.0x
STT Provider | Which service transcribes voice input | Whisper, System, Groq, OpenAI, xAI

When TTS is enabled, agent responses are automatically converted to audio and played through your speakers. You can also trigger speech manually for any specific message.

Cross-Session Memory

Aura Workshop maintains a persistent memory system that learns from every task and adapts to your preferences over time. Memory persists across sessions and even across application restarts.

Memory Types

Type | Location | Scope | How Created
User memory | ~/.aura/memory/*.md | Global (all projects) | Agent saves via tool or manual creation
Feedback memory | memory_facts table | Global or project-scoped | LLM-extracted corrections and reinforcements
Project memory | .aura/memory/*.md | Per-project | Auto-extracted from file operations after task completion
Reference memory | task_memory table | Per-task | Handoff context between team roles and compaction summaries

Memory Fact Categories

Facts extracted from conversations are categorized with confidence scores:

Category | Scope | Confidence | Example
preference | Global | 0.9 | "User prefers Python over JavaScript for backend"
correction | Global | 0.95 | "User correction: use FastAPI not Flask"
reinforcement | Global | 0.90 | "User confirmed: single bundled PR is correct"
knowledge | Project | 0.6-0.9 | "Project uses PostgreSQL on port 5432"
file_operation | Project | 0.7 | "File modified: src/models.py"
context | Project | 0.6-0.8 | "API routes defined in src/routes/"
behavior | Project | 0.7 | "Always run tests after code changes"
goal | Project | 0.7 | "Goal is to migrate from Express to FastAPI"

Auto-Extraction

After every task, the system runs an LLM call to extract structured facts. It also runs pattern detection on the conversation, looking for corrections (cue words such as "no", "wrong", "instead") and reinforcements ("yes", "perfect", "great"). Corrections are saved at 0.95 confidence and reinforcements at 0.90.
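
The cue-word part of this step can be approximated as follows. This is a minimal sketch: the whole-word matching, the rule that corrections take precedence over reinforcements, and the function name are assumptions, and the LLM extraction step is not modeled.

```python
import re
from typing import Optional, Tuple

CORRECTION_WORDS = ("no", "wrong", "instead")
REINFORCEMENT_WORDS = ("yes", "perfect", "great")


def _contains_any(text: str, words: Tuple[str, ...]) -> bool:
    """Whole-word, case-insensitive match so that 'know' does not count as 'no'."""
    return any(re.search(rf"\b{re.escape(word)}\b", text, re.IGNORECASE) for word in words)


def classify_feedback(message: str) -> Optional[Tuple[str, float]]:
    """Return (category, confidence), or None when no cue word is present."""
    if _contains_any(message, CORRECTION_WORDS):
        return ("correction", 0.95)
    if _contains_any(message, REINFORCEMENT_WORDS):
        return ("reinforcement", 0.90)
    return None
```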

AURA.md -- Project Instructions

Drop an AURA.md (or CLAUDE.md) file in your project root with rules the agent must follow. The system looks for the following files: AURA.md, .aura.md, CLAUDE.md, .claude.md, and .aura/INSTRUCTIONS.md. The scan starts at the project root and continues up through at most two parent directories. Instruction content is capped at 4 KB per file and 12 KB in total.
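
The scan described above can be sketched in Python. The search order, the decision to truncate rather than skip oversized files, and the function name are assumptions for illustration.

```python
from pathlib import Path
from typing import List, Tuple

CANDIDATE_NAMES = ["AURA.md", ".aura.md", "CLAUDE.md", ".claude.md", ".aura/INSTRUCTIONS.md"]
PER_FILE_CAP = 4 * 1024   # 4 KB per file
TOTAL_CAP = 12 * 1024     # 12 KB across all files
MAX_PARENT_LEVELS = 2     # project root plus two parent directories


def collect_instructions(project_root: Path) -> List[Tuple[Path, str]]:
    """Return (path, text) pairs, truncated to the per-file and total caps."""
    results = []
    remaining = TOTAL_CAP
    directory = project_root.resolve()
    for _ in range(MAX_PARENT_LEVELS + 1):
        for name in CANDIDATE_NAMES:
            candidate = directory / name
            if remaining <= 0 or not candidate.is_file():
                continue
            data = candidate.read_bytes()[: min(PER_FILE_CAP, remaining)]
            remaining -= len(data)
            results.append((candidate, data.decode("utf-8", errors="ignore")))
        if directory.parent == directory:  # reached the filesystem root
            break
        directory = directory.parent
    return results
```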

AURA.md Editor

Create and edit AURA.md from Settings > Memory > Project Instructions. The editor provides:

Git Context Injection

When a task's project is a git repository, the agent automatically receives current git state in its system prompt:

This helps the agent understand what you have been working on and avoid conflicting changes. Toggle this feature in Settings > General > Agent Context.

Memory Viewer

View, search, and delete learned facts in Settings > Memory > Learned Facts. Facts are color-coded by category:

Each fact displays its content, confidence percentage, category, scope (global or project), and creation date. Delete individual facts by clicking the trash icon.

Prompt Cache Stats

The static portion of the system prompt (including memory, AURA.md, and role definitions) is cached by providers that support prompt caching (Anthropic, OpenAI, Google). View cache hit rates and token savings in Settings > Billing > Prompt Cache Stats. Higher cache hit rates mean faster response times and lower costs.

Memory Scope and Isolation

Memory Injection

Before every task, the system prompt is enriched with relevant memory through a multi-layer injection process:

  1. File-based memories: The top 15 entries (within a budget of approximately 2,000 characters) are loaded from ~/.aura/memory/ (user scope) and .aura/memory/ (project scope).
  2. Keyword-searched facts: The user's message is tokenized into keywords, which are searched against the memory_facts table and filtered by project isolation rules (within a budget of approximately 1,500 characters).
  3. Recent high-confidence facts: The 5 most recent facts with confidence scores at or above 0.8 are always included regardless of keyword matching.
  4. Pre-compaction flush: Before context compaction occurs, file operation facts are extracted from messages that are about to be dropped, ensuring no learnings are lost during compression.

This multi-layer approach ensures that the agent always has access to the most relevant context, from explicit project rules to learned preferences and domain knowledge.
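
The first three layers can be sketched as a budgeted selection. The Fact type, the helper names, and the rule that an entry is dropped once it would exceed the character budget are assumptions; keyword search and the pre-compaction flush are not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fact:
    content: str
    confidence: float
    created_at: int  # any sortable timestamp; larger means newer


def take_within_budget(entries: List[str], max_items: int, char_budget: int) -> List[str]:
    """Keep entries in order until the item limit or character budget is reached."""
    kept, used = [], 0
    for entry in entries[:max_items]:
        if used + len(entry) > char_budget:
            break
        kept.append(entry)
        used += len(entry)
    return kept


def recent_high_confidence(facts: List[Fact], limit: int = 5, threshold: float = 0.8) -> List[Fact]:
    """Newest facts whose confidence is at or above the threshold."""
    eligible = [fact for fact in facts if fact.confidence >= threshold]
    return sorted(eligible, key=lambda fact: fact.created_at, reverse=True)[:limit]


def build_memory_context(file_memories: List[str], keyword_facts: List[Fact],
                         all_facts: List[Fact]) -> List[str]:
    sections = take_within_budget(file_memories, max_items=15, char_budget=2000)
    sections += take_within_budget([f.content for f in keyword_facts],
                                   max_items=len(keyword_facts), char_budget=1500)
    sections += [f.content for f in recent_high_confidence(all_facts)]
    return list(dict.fromkeys(sections))  # drop duplicates, keep first occurrence
```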

AURA.md vs Memory

Understanding when to use project instructions versus memory:

Situation | Use | Why
"Every task in this project must use TypeScript strict" | AURA.md | Deterministic, version-controlled rule
"I prefer Python over JavaScript" | Memory | Auto-detected as preference across all projects
"Don't use Flask, use FastAPI" | Memory | Auto-detected as correction (0.95 confidence)
"All commits must follow conventional format" | AURA.md | Explicit project rule for consistency
"The database is PostgreSQL on port 5432" | Memory | Auto-extracted project knowledge fact

AURA.md holds rules you write explicitly: deterministic, version-controlled, and immediately effective. Memory holds facts the agent learns from interactions: probabilistic, compounding over time, and adapting to corrections.

REST API Overview

Aura Workshop exposes approximately 140 REST endpoints through its embedded HTTP server (default port 18800). Every feature available in the desktop app is also accessible via HTTP. All endpoints are prefixed with /api unless otherwise noted.

Base URL: http://localhost:18800/api
Content-Type: application/json for all POST/PUT requests
Authentication: Optional Bearer token (configured in Settings > Connectivity)

Authentication

When a token is configured, include it as a Bearer token in all requests:

curl -H "Authorization: Bearer YOUR_TOKEN" http://localhost:18800/api/tasks

If no token is configured, all API requests are allowed without authentication. The /api/health and /api/heartbeat/incoming endpoints are always public.
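
For scripts that call the API from Python rather than curl, the same header can be attached with the standard library. This is a minimal sketch; build_request is a hypothetical helper, and the base URL is the default documented above.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:18800/api"


def build_request(path: str, token: Optional[str] = None,
                  payload: Optional[dict] = None,
                  method: Optional[str] = None) -> urllib.request.Request:
    """Build a request, adding the Bearer header only when a token is given."""
    headers = {}
    data = None
    if token:
        headers["Authorization"] = f"Bearer {token}"
    if payload is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(f"{BASE_URL}{path}", data=data,
                                  headers=headers, method=method)
```

Passing the result to urllib.request.urlopen sends it; a 401 response surfaces as urllib.error.HTTPError.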

Error Handling

All API endpoints return standard HTTP status codes. Error responses include a JSON body with an error field and a code field:

# Error response format
{
  "error": "Task not found",
  "code": "NOT_FOUND"
}

# Common HTTP status codes:
# 200 OK         - Request succeeded
# 201 Created    - Resource created
# 400 Bad Request - Invalid request body or parameters
# 401 Unauthorized - Missing or invalid auth token
# 404 Not Found  - Resource does not exist
# 429 Too Many Requests - Rate limited (forwarded from provider)
# 500 Internal Server Error - Server-side error

Pagination

List endpoints that may return large result sets support optional limit and offset query parameters:

# Get the first 10 tasks
curl -H "Authorization: Bearer $TOKEN" \
  "http://localhost:18800/api/tasks?limit=10&offset=0"

# Get the next 10 tasks
curl -H "Authorization: Bearer $TOKEN" \
  "http://localhost:18800/api/tasks?limit=10&offset=10"

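The limit/offset loop can be wrapped in a small generator. This sketch assumes that a page shorter than limit marks the end of the result set and that the caller supplies a fetch_page function returning a list of items; both are assumptions, since the response envelope for list endpoints is not specified here.

```python
from typing import Callable, Iterator, List


def iter_pages(fetch_page: Callable[[int, int], List[dict]], limit: int = 10) -> Iterator[dict]:
    """Yield items page by page until a page comes back shorter than `limit`."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            break
        offset += limit
```

In practice fetch_page would issue GET /api/tasks?limit=...&offset=... and return the decoded list.
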
Tasks

Method | Endpoint | Description
GET | /api/tasks | List all tasks with status, title, timestamps, and model information
POST | /api/tasks | Create a new task with a message, optional file attachments, project path, model, and role
GET | /api/tasks/{id} | Get a specific task by ID
DELETE | /api/tasks/{id} | Delete a task and all its messages
GET | /api/tasks/{id}/messages | Get all messages for a task
POST | /api/tasks/{id}/messages | Send a follow-up message to an existing task
GET | /api/tasks/interrupted | List all interrupted tasks that can be resumed
GET | /api/tasks/{id}/files | List files created or modified by a task

# Create a task
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"message": "Create a Python script that generates prime numbers", "project_path": "/home/user/myproject"}' \
  http://localhost:18800/api/tasks

# Response
{
  "id": "task_abc123",
  "title": "Create a Python script that generates prime numbers",
  "status": "executing",
  "model": "claude-sonnet-4-20250514",
  "created_at": "2026-05-21T10:30:00Z"
}

Get Task Messages

# Get all messages for a task
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/tasks/task_abc123/messages

# Response
{
  "messages": [
    {
      "id": "msg_001",
      "role": "user",
      "content": "Create a Python script that generates prime numbers",
      "timestamp": "2026-05-21T10:30:00Z"
    },
    {
      "id": "msg_002",
      "role": "assistant",
      "content": "I'll create a Python script...",
      "tool_calls": [
        {
          "tool": "write_file",
          "input": {"path": "primes.py", "content": "..."},
          "output": "File written successfully",
          "status": "ok"
        }
      ],
      "timestamp": "2026-05-21T10:30:15Z"
    }
  ]
}

Send a Follow-Up Message

# Send a follow-up message to an existing task
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"message": "Also add a function to check if a number is prime"}' \
  http://localhost:18800/api/tasks/task_abc123/messages

List Interrupted Tasks

# Get all tasks that can be resumed
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/tasks/interrupted

# Response
{
  "tasks": [
    {
      "id": "task_xyz789",
      "title": "Build a REST API",
      "status": "interrupted",
      "interrupted_at": "2026-05-21T09:15:00Z"
    }
  ]
}

Conversations and Messages

Method | Endpoint | Description
GET | /api/conversations | List all conversations
POST | /api/conversations | Create a new conversation
DELETE | /api/conversations/{id} | Delete a conversation and its messages
PUT | /api/conversations/{id}/title | Update a conversation title
GET | /api/conversations/{id}/messages | Get all messages in a conversation
POST | /api/conversations/{id}/messages | Add a message to a conversation

Chat and Agent (SSE Streaming)

These endpoints return Server-Sent Events (SSE) streams for real-time output.

Method | Endpoint | Description
POST | /api/chat/send | Send a chat message and stream the response (no tool access)
POST | /api/chat/enhanced | Send a chat message with full tool access (streaming)
POST | /api/agent/run | Run a standalone agent task with full tool access (streaming)
POST | /api/tasks/{id}/run | Run an existing task (streaming)
POST | /api/tasks/{id}/resume | Resume an interrupted task (streaming)
GET | /api/events | Global SSE event stream for all task, workflow, and system events
POST | /api/inference/stop | Stop inference for a specific task (task ID in request body)

# Run an agent with SSE streaming
curl -N -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the capital of France?"}' \
  http://localhost:18800/api/agent/run

# SSE events received:
# data: {"type":"text","content":"The capital of France is Paris."}
# data: {"type":"done","status":"completed"}

The SSE stream emits events with types: text, tool_use, tool_result, thinking, plan, status, error, and done.
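
A client can consume such a stream line by line. The sketch below assumes each event arrives as a single data: line carrying a JSON object with a type field, as in the sample output; the function names and the handling of error events are assumptions.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Decode 'data: {json}' lines into dicts, skipping blank and non-data lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue
        yield json.loads(line[len("data:"):].strip())


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate text events until a done event arrives."""
    parts = []
    for event in iter_sse_events(lines):
        kind = event.get("type")
        if kind == "text":
            parts.append(event.get("content", ""))
        elif kind == "error":
            raise RuntimeError(json.dumps(event))
        elif kind == "done":
            break
    return "".join(parts)
```

Any iterable of decoded lines works as input, for example the lines of an open HTTP response.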

Platform and Settings

Method | Endpoint | Description
GET | /api/platform | Returns the platform identifier (darwin, windows, linux)
GET | /api/settings | Get all application settings as a JSON object
PUT | /api/settings | Save application settings
POST | /api/settings/test | Test LLM connection with current settings
POST | /api/email/test | Send a test email to verify email configuration
GET | /api/auth/check | Verify that the provided auth token is valid

Listeners

Method | Endpoint | Description
GET | /api/listeners | List all listeners
POST | /api/listeners | Create a new listener
PUT | /api/listeners/{id} | Update a listener
DELETE | /api/listeners/{id} | Delete a listener
POST | /api/listeners/{id}/start | Start a listener
POST | /api/listeners/{id}/stop | Stop a listener
POST | /api/listeners/{id}/toggle | Toggle a listener enabled/disabled
GET | /api/listeners/{id}/logs | Get event logs for a listener
GET | /api/listeners/statuses | Get running/stopped status of all listeners
GET | /api/listeners/platforms | Get the list of supported listener platforms

# Create a Slack listener
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Slack Support Bot",
    "platform": "slack",
    "token": "xoxb-...",
    "trigger_type": "mentions",
    "prompt": "You are a helpful support agent. Answer the user question based on our documentation.",
    "model": "claude-sonnet-4-20250514",
    "enabled": true
  }' \
  http://localhost:18800/api/listeners

# Response
{
  "id": "listener_abc123",
  "name": "Slack Support Bot",
  "platform": "slack",
  "status": "stopped",
  "created_at": "2026-05-21T10:00:00Z"
}

# Start the listener
curl -X POST -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/listeners/listener_abc123/start

# Get event logs
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/listeners/listener_abc123/logs

# Response
{
  "events": [
    {
      "timestamp": "2026-05-21T10:05:00Z",
      "sender": "alice",
      "message": "@bot How do I reset my password?",
      "response": "To reset your password, go to Settings > Account...",
      "channel": "#support"
    }
  ]
}

# Get all supported platforms
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/listeners/platforms

Schedules

Method | Endpoint | Description
GET | /api/schedules | List all scheduled tasks
POST | /api/schedules | Create a new scheduled task
PUT | /api/schedules/{id} | Update a scheduled task
DELETE | /api/schedules/{id} | Delete a scheduled task
POST | /api/schedules/{id}/toggle | Toggle a scheduled task enabled/disabled

# Create a daily schedule
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Daily Report",
    "schedule_type": "daily",
    "time": "09:00",
    "message": "Generate a summary of git commits from yesterday",
    "duration_type": "forever"
  }' \
  http://localhost:18800/api/schedules

Webhooks

Method | Endpoint | Description
GET | /api/webhooks | List all webhooks
POST | /api/webhooks | Create a new webhook
PUT | /api/webhooks/{id} | Update a webhook
DELETE | /api/webhooks/{id} | Delete a webhook
POST | /api/webhooks/{id}/toggle | Toggle a webhook enabled/disabled
GET | /api/webhooks/{id}/url | Get the auto-generated URL for a webhook
GET | /api/webhooks/{id}/logs | Get invocation logs for a webhook

# Create a webhook
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "GitHub PR Webhook",
    "prompt": "A GitHub pull request event was received: {{payload}}. Review the PR changes and provide feedback.",
    "secret": "my-hmac-secret",
    "model": "claude-sonnet-4-20250514"
  }' \
  http://localhost:18800/api/webhooks

# Response
{
  "id": "webhook_abc123",
  "name": "GitHub PR Webhook",
  "url": "http://localhost:18790/webhooks/webhook_abc123",
  "enabled": true,
  "created_at": "2026-05-21T10:00:00Z"
}

# Get the webhook URL
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/webhooks/webhook_abc123/url

# Test the webhook with cURL
curl -X POST http://localhost:18790/webhooks/webhook_abc123 \
  -H "Content-Type: application/json" \
  -H "X-Hub-Signature-256: sha256=..." \
  -d '{"action":"opened","pull_request":{"title":"Add user auth","number":42}}'

Workflow API Examples

# Create a workflow
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Code Review Pipeline",
    "nodes": [
      {
        "id": "n1",
        "type": "agent_task",
        "prompt": "Analyze the code for bugs and security issues",
        "model": "claude-sonnet-4-20250514"
      },
      {
        "id": "n2",
        "type": "human_loop",
        "prompt": "Review the analysis and approve or reject"
      },
      {
        "id": "n3",
        "type": "agent_task",
        "prompt": "Generate a fix for the identified issues"
      }
    ],
    "edges": [
      {"from": "n1", "to": "n2", "port": "pass"},
      {"from": "n2", "to": "n3", "port": "approve"}
    ]
  }' \
  http://localhost:18800/api/workflows

# Run a workflow
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"input": {"repo": "my-org/my-repo", "branch": "feature/auth"}}' \
  http://localhost:18800/api/workflows/workflow_abc123/run

# Response
{
  "run_id": "run_xyz789",
  "workflow_id": "workflow_abc123",
  "status": "running",
  "started_at": "2026-05-21T10:00:00Z"
}

# Check workflow run status
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/workflow/runs/run_xyz789

# Approve a human loop step
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"action": "approve", "comment": "Looks good, proceed with the fix"}' \
  http://localhost:18800/api/workflow/approvals/approval_abc/resolve
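
Before posting a workflow definition, it can help to check client-side that the graph is well formed. The sketch below validates the nodes and edges shape used in the create request and derives one valid execution order; it is an illustrative pre-check with a hypothetical function name, not a description of how the server schedules nodes.

```python
from graphlib import CycleError, TopologicalSorter
from typing import List


def execution_order(workflow: dict) -> List[str]:
    """Return node ids in dependency order, or raise ValueError for a bad graph."""
    node_ids = [node["id"] for node in workflow["nodes"]]
    known = set(node_ids)
    if len(known) != len(node_ids):
        raise ValueError("duplicate node id")
    sorter = TopologicalSorter({node_id: set() for node_id in node_ids})
    for edge in workflow["edges"]:
        if edge["from"] not in known or edge["to"] not in known:
            raise ValueError(f"edge references unknown node: {edge}")
        sorter.add(edge["to"], edge["from"])  # "to" runs after "from"
    try:
        return list(sorter.static_order())
    except CycleError as exc:
        raise ValueError("workflow contains a cycle") from exc
```

graphlib is part of the standard library from Python 3.9 onward.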

Skills and Role Skills

Method | Endpoint | Description
GET | /api/skills | List all installed skills
DELETE | /api/skills/{name} | Delete a skill by name
GET | /api/skills/{name}/content | Get the content/prompt of a skill
PUT | /api/skills/{name}/content | Update a skill's content
GET | /api/skill-settings | Get skill settings
PUT | /api/skill-settings | Update skill settings
GET | /api/role-skills | List all role skills
POST | /api/role-skills | Save (create or update) a role skill
GET | /api/role-skills/{name} | Get a specific role skill by name
DELETE | /api/role-skills/{name} | Delete a role skill

MCP Servers

Method | Endpoint | Description
GET | /api/mcp/servers | List all configured MCP servers
POST | /api/mcp/servers | Save (create or update) an MCP server configuration
DELETE | /api/mcp/servers/{id} | Delete an MCP server configuration
POST | /api/mcp/servers/{id}/connect | Connect to an MCP server
POST | /api/mcp/servers/{id}/disconnect | Disconnect from an MCP server
GET | /api/mcp/statuses | Get connection status for all MCP servers
GET | /api/mcp/tools | List all tools from connected MCP servers

# Add an MCP server (stdio transport)
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "File System MCP",
    "transport": "stdio",
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/projects"],
    "env": {},
    "isolation": "shared"
  }' \
  http://localhost:18800/api/mcp/servers

# Add an MCP server (HTTP transport with OAuth)
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Cloud MCP",
    "transport": "http",
    "url": "https://mcp.example.com/sse",
    "oauth": {
      "client_id": "my-client-id",
      "client_secret": "my-client-secret"
    },
    "isolation": "per_task"
  }' \
  http://localhost:18800/api/mcp/servers

# Connect to an MCP server
curl -X POST -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/mcp/servers/mcp_abc123/connect

# List all available MCP tools
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/mcp/tools

# Response
{
  "tools": [
    {
      "name": "mcp_filesystem_read_file",
      "server": "File System MCP",
      "description": "Read a file from the filesystem",
      "input_schema": {
        "type": "object",
        "properties": {
          "path": {"type": "string", "description": "File path to read"}
        },
        "required": ["path"]
      }
    }
  ]
}

# Get connection statuses
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/mcp/statuses

# Response
{
  "statuses": {
    "mcp_abc123": {"connected": true, "tool_count": 5},
    "mcp_def456": {"connected": false, "error": "Connection refused"}
  }
}

Plugins

Method | Endpoint | Description
GET | /api/plugins | List all installed plugins
POST | /api/plugins | Install or update a plugin
DELETE | /api/plugins/{id} | Uninstall a plugin
POST | /api/plugins/{id}/toggle | Enable or disable a plugin

Workflows

Method | Endpoint | Description
GET | /api/workflows | List all automation workflows
POST | /api/workflows | Create a new workflow
GET | /api/workflows/{id} | Get a specific workflow
PUT | /api/workflows/{id} | Update a workflow
DELETE | /api/workflows/{id} | Delete a workflow
POST | /api/workflows/{id}/run | Run a workflow
GET | /api/workflow/runs/{run_id} | Get the status of a workflow run
POST | /api/workflow/approvals/{id}/resolve | Resolve a human approval request (approve or reject)

Teams

Method | Endpoint | Description
GET | /api/teams | List all teams
POST | /api/teams | Create a new team
PUT | /api/teams/{id} | Update a team
DELETE | /api/teams/{id} | Delete a team
POST | /api/teams/run | Run a team task

# Run a team task
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "team_id": "software-dev-team",
    "message": "Build a REST API for user management with authentication"
  }' \
  http://localhost:18800/api/teams/run

Credentials

Method | Endpoint | Description
GET | /api/credentials | List all stored credentials (metadata only, values not returned)
POST | /api/credentials | Save a new credential (value is encrypted before storage)
GET | /api/credentials/{id} | Get a credential with decrypted value (biometric auth gated)
DELETE | /api/credentials/{id} | Delete a credential

# Save a credential
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Slack Bot Token",
    "type": "token",
    "value": "xoxb-1234567890-abcdefghij"
  }' \
  http://localhost:18800/api/credentials

# Response
{
  "id": "cred_abc123",
  "name": "Slack Bot Token",
  "type": "token",
  "created_at": "2026-05-21T10:00:00Z"
}

# List credentials (values are never included in list responses)
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/credentials

# Response
{
  "credentials": [
    {"id": "cred_abc123", "name": "Slack Bot Token", "type": "token", "created_at": "2026-05-21T10:00:00Z"},
    {"id": "cred_def456", "name": "SSH Deploy Key", "type": "ssh_key", "created_at": "2026-05-20T14:00:00Z"}
  ]
}

# Get decrypted value (requires biometric auth on desktop)
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/credentials/cred_abc123

Cloud Storage

Method | Endpoint | Description
GET | /api/cloud/connectors | List cloud storage connectors
POST | /api/cloud/connectors | Save (create or update) a cloud connector
DELETE | /api/cloud/connectors/{id} | Delete a cloud connector

# Save a cloud storage connector
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Production S3",
    "type": "s3",
    "config": {
      "bucket": "my-aura-files",
      "region": "us-east-1",
      "access_key": "AKIAIOSFODNN7EXAMPLE",
      "secret_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
    }
  }' \
  http://localhost:18800/api/cloud/connectors

# List connectors
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/cloud/connectors

# Response
{
  "connectors": [
    {
      "id": "cloud_abc123",
      "name": "Production S3",
      "type": "s3",
      "status": "connected"
    }
  ]
}

Aura AI and Models

Method | Endpoint | Description
GET | /api/aura/status | Get the current status of the local inference server
GET | /api/aura/models/hf | Scan for HuggingFace-cached models
GET | /api/aura/models/gguf | Scan for local GGUF model files
GET | /api/aura/models/curated | Get the curated model list (recommended downloads)
POST | /api/aura/models/download | Start downloading a model from HuggingFace
POST | /api/aura/models/download/cancel | Cancel an in-progress model download
POST | /api/aura/models/delete | Delete a downloaded local model
GET | /api/aura/lan-nodes | List discovered LAN inference nodes

# Check inference server status
curl -H "Authorization: Bearer $TOKEN" http://localhost:18800/api/aura/status

# Response
{
  "running": true,
  "model": "qwen3-8b-q4_k_m.gguf",
  "port": 8080,
  "gpu_layers": -1
}

Inference Cluster

Method | Endpoint | Description
GET | /api/cluster/status | Get cluster status (loaded model, workers, running state)
GET | /api/cluster/workers | List discovered and claimed workers
POST | /api/cluster/workers/{id}/add | Claim a discovered worker (master-side)
POST | /api/cluster/workers/{id}/remove | Disconnect a claimed worker (master-side)
POST | /api/cluster/inference/start | Start distributed inference with full parameters
POST | /api/cluster/inference/stop | Stop distributed inference
POST | /api/cluster/join | Accept a master's claim request (worker-side)
POST | /api/cluster/leave | Leave the cluster (worker-side)
GET | /api/cluster/worker/status | Report worker status (worker-side)

Billing and Spend

Method | Endpoint | Description
GET | /api/billing/summary | Get usage summary grouped by provider
GET | /api/billing/limits | Get spend limits for all providers
POST | /api/billing/limits | Save a spend limit for a provider
GET | /api/billing/fallback-order | Get the provider fallback order
POST | /api/billing/fallback-order | Save the provider fallback order
GET | /api/billing/pricing | Get model pricing table
POST | /api/billing/pricing | Save model pricing entries
POST | /api/billing/reset | Reset all usage tracking data
GET | /api/billing/daily | Get daily usage statistics
GET | /api/billing/daily-by-model | Get daily usage broken down by model
GET | /api/routing/stats | Get smart routing statistics (tasks per tier, cost savings)

# Get billing summary
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/billing/summary

# Response
{
  "today": {
    "total_cost": 2.45,
    "total_input_tokens": 125000,
    "total_output_tokens": 45000
  },
  "month": {
    "total_cost": 38.72,
    "by_provider": {
      "anthropic": {"cost": 28.50, "input_tokens": 1200000, "output_tokens": 450000},
      "openai": {"cost": 8.22, "input_tokens": 500000, "output_tokens": 200000},
      "groq": {"cost": 2.00, "input_tokens": 800000, "output_tokens": 300000}
    }
  }
}

# Set a spend limit
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"provider": "anthropic", "monthly_limit": 50.00, "alert_threshold": 40.00}' \
  http://localhost:18800/api/billing/limits

# Set provider fallback order
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"order": ["anthropic", "openai", "groq", "deepseek"]}' \
  http://localhost:18800/api/billing/fallback-order

# Get daily usage
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/billing/daily

# Response
{
  "daily": [
    {"date": "2026-05-21", "cost": 2.45, "requests": 15},
    {"date": "2026-05-20", "cost": 3.12, "requests": 22},
    {"date": "2026-05-19", "cost": 1.87, "requests": 10}
  ]
}

# Get smart routing statistics
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/routing/stats

# Response
{
  "total_tasks": 150,
  "by_tier": {
    "simple": {"count": 60, "actual_cost": 1.20, "hypothetical_cost": 12.00},
    "standard": {"count": 50, "actual_cost": 8.50, "hypothetical_cost": 15.00},
    "complex": {"count": 30, "actual_cost": 15.00, "hypothetical_cost": 18.00},
    "reasoning": {"count": 10, "actual_cost": 8.00, "hypothetical_cost": 8.00}
  },
  "total_savings": 20.30
}
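
The per-tier figures in a routing statistics response can be turned into savings numbers on the client side. The sketch below assumes that the savings for a tier are simply hypothetical_cost minus actual_cost; this reference does not state how the server derives total_savings, so the formula and the helper name tier_savings are illustrative only, and the sum computed here may differ from the value the server reports.

```python
import json

# Sample payload shaped like a GET /api/routing/stats response.
SAMPLE = """
{
  "total_tasks": 150,
  "by_tier": {
    "simple": {"count": 60, "actual_cost": 1.20, "hypothetical_cost": 12.00},
    "standard": {"count": 50, "actual_cost": 8.50, "hypothetical_cost": 15.00},
    "complex": {"count": 30, "actual_cost": 15.00, "hypothetical_cost": 18.00},
    "reasoning": {"count": 10, "actual_cost": 8.00, "hypothetical_cost": 8.00}
  }
}
"""


def tier_savings(stats: dict) -> dict:
    """Savings per tier, ASSUMING savings = hypothetical_cost - actual_cost."""
    return {
        tier: round(row["hypothetical_cost"] - row["actual_cost"], 2)
        for tier, row in stats["by_tier"].items()
    }


stats = json.loads(SAMPLE)
savings = tier_savings(stats)
for tier, amount in savings.items():
    print(f"{tier:<10} saved ${amount:.2f}")
print(f"{'all tiers':<10} saved ${sum(savings.values()):.2f}")
```

A breakdown like this shows which tier contributes most of the savings; in the sample data it is the simple tier, where cheap models replace an expensive default.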

Voice

Method | Endpoint | Description
GET | /api/voice/voices | List available TTS voices for the current provider
POST | /api/voice/transcribe | Transcribe an audio file to text (multipart form data)
POST | /api/voice/save-temp | Save a temporary audio file from a recording
POST | /api/voice/speak | Convert text to speech and return audio data

# Transcribe audio
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -F "<field>=@<audio-file>" \
  http://localhost:18800/api/voice/transcribe

# Response
{"text": "Create a REST API for managing user accounts"}

Files

Method | Endpoint | Description
GET | /api/files | Download a file by path (query parameter: path)
POST | /api/files/upload | Upload a file (multipart form data)
GET | /api/files/list | List files in a directory

Data Management

Method | Endpoint | Description
POST | /api/data/clear-history | Clear all conversation and task history
POST | /api/data/reset-keys | Reset all stored API keys
POST | /api/data/reset-database | Reset the database to factory defaults
POST | /api/data/clear-model-cache | Clear cached model metadata
POST | /api/data/reset-all | Reset all application data

Memory

Method | Endpoint | Description
GET | /api/memories | List all saved agent memories
POST | /api/memories | Save a new memory entry
DELETE | /api/memories/{name} | Delete a specific memory entry

# List all memories
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/memories

# Response
{
  "memories": [
    {
      "name": "user-preferences.md",
      "type": "user",
      "content": "# User Preferences\n- Prefers Python over JavaScript\n- Uses FastAPI for APIs\n- Follows conventional commits",
      "scope": "global",
      "updated_at": "2026-05-21T09:00:00Z"
    },
    {
      "name": "project-context.md",
      "type": "project",
      "content": "# Project Context\n- Uses PostgreSQL on port 5432\n- Test framework: pytest",
      "scope": "project",
      "project_path": "/home/user/myproject",
      "updated_at": "2026-05-20T15:00:00Z"
    }
  ]
}

# Save a new memory
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "team-conventions.md",
    "type": "reference",
    "content": "# Team Conventions\n- All PRs require two approvals\n- Use squash merge only"
  }' \
  http://localhost:18800/api/memories

# Delete a memory
curl -X DELETE -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/memories/team-conventions.md
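
The same memory calls can be issued from a script. The following is a minimal sketch using only the Python standard library; the helper names save_memory_request and delete_memory_request are hypothetical, and the requests are built but not sent so the example runs without a server. Because the memory name is part of the URL path for DELETE, it is percent-encoded here, which matters for names that contain spaces or slashes.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:18800"  # default address used in this reference


def _auth(token: str) -> dict:
    return {"Authorization": f"Bearer {token}"}


def save_memory_request(name: str, mem_type: str, content: str, token: str) -> Request:
    """Build a POST /api/memories request with a JSON body."""
    body = json.dumps({"name": name, "type": mem_type, "content": content})
    headers = {**_auth(token), "Content-Type": "application/json"}
    return Request(f"{BASE_URL}/api/memories", data=body.encode("utf-8"),
                   headers=headers, method="POST")


def delete_memory_request(name: str, token: str) -> Request:
    """Build a DELETE /api/memories/{name} request with an encoded name."""
    return Request(f"{BASE_URL}/api/memories/{quote(name, safe='')}",
                   headers=_auth(token), method="DELETE")


save_req = save_memory_request(
    "team conventions.md", "reference",
    "# Team Conventions\n- Use squash merge only", "my-token")
delete_req = delete_memory_request("team conventions.md", "my-token")

print(save_req.get_method(), save_req.full_url)
print(delete_req.get_method(), delete_req.full_url)
```

To actually send a request, pass it to urllib.request.urlopen and read the response body.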

Projects

Method | Endpoint | Description
GET | /api/projects | List all projects
POST | /api/projects | Create a new project
GET | /api/projects/{id} | Get a specific project
PUT | /api/projects/{id} | Update a project
DELETE | /api/projects/{id} | Delete a project
GET | /api/projects/{id}/tasks | List tasks associated with a project
GET | /api/projects/{id}/memory | Get project-scoped memory facts

# Create a project
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My Web App",
    "path": "/home/user/projects/webapp"
  }' \
  http://localhost:18800/api/projects

# Response
{
  "id": "proj_abc123",
  "name": "My Web App",
  "path": "/home/user/projects/webapp",
  "task_count": 0,
  "created_at": "2026-05-21T10:00:00Z"
}

# List tasks for a project
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/projects/proj_abc123/tasks

# Get project-scoped memory facts
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/projects/proj_abc123/memory

# Response
{
  "facts": [
    {
      "category": "knowledge",
      "content": "Project uses PostgreSQL on port 5432",
      "confidence": 0.85,
      "created_at": "2026-05-20T14:00:00Z"
    },
    {
      "category": "file_operation",
      "content": "File modified: src/models/user.py - added email validation",
      "confidence": 0.7,
      "created_at": "2026-05-21T09:30:00Z"
    }
  ]
}
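
Each fact in the project memory response carries a confidence score, so a client may want to keep only the facts it trusts. A minimal sketch, assuming a cut-off of 0.8 (an arbitrary value chosen for illustration) and a hypothetical helper named confident_facts:

```python
from collections import defaultdict

# Sample data shaped like a GET /api/projects/{id}/memory response.
facts_response = {
    "facts": [
        {"category": "knowledge",
         "content": "Project uses PostgreSQL on port 5432",
         "confidence": 0.85},
        {"category": "file_operation",
         "content": "File modified: src/models/user.py - added email validation",
         "confidence": 0.7},
    ]
}


def confident_facts(response: dict, min_confidence: float = 0.8) -> dict:
    """Group fact contents by category, dropping low-confidence entries."""
    grouped = defaultdict(list)
    for fact in response["facts"]:
        if fact["confidence"] >= min_confidence:
            grouped[fact["category"]].append(fact["content"])
    return dict(grouped)


print(confident_facts(facts_response))
```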

Viewer and Remote Deployment

Method | Endpoint | Description
GET | /api/viewer/status | Get deployment status including daemon_mode and uptime
GET | /api/viewer/items | Get deployed items (schedules, listeners, webhooks)
GET | /api/viewer/tasks | Get recent tasks on the remote deployment
GET | /api/viewer/tasks/{id}/messages | Get task messages for a remote task
GET | /api/viewer/events | SSE event stream for real-time remote monitoring
GET | /api/remote-deployments | List all remote agent deployments

# Get remote deployment status
curl -H "Authorization: Bearer $TOKEN" \
  http://remote-host:18800/api/viewer/status

# Response
{
  "daemon_mode": "full",
  "uptime_seconds": 86400,
  "version": "1.30.1",
  "platform": "linux",
  "active_tasks": 2,
  "active_listeners": 5,
  "active_schedules": 3
}

# List deployed items on a remote agent
curl -H "Authorization: Bearer $TOKEN" \
  http://remote-host:18800/api/viewer/items

# Response
{
  "schedules": [
    {"id": "sched_001", "name": "Daily Report", "enabled": true, "next_run": "2026-05-22T09:00:00Z"}
  ],
  "listeners": [
    {"id": "listen_001", "name": "Slack Bot", "platform": "slack", "status": "online"}
  ],
  "webhooks": [
    {"id": "hook_001", "name": "GitHub Hook", "enabled": true}
  ]
}

# List all remote deployments from the master
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/remote-deployments

# Response
{
  "deployments": [
    {
      "id": "deploy_abc123",
      "hostname": "gpu-server-1.local",
      "mode": "full",
      "status": "online",
      "last_heartbeat": "2026-05-21T10:29:30Z",
      "uptime_seconds": 172800
    },
    {
      "id": "deploy_def456",
      "hostname": "inference-node-2.local",
      "mode": "inference-worker",
      "status": "online",
      "last_heartbeat": "2026-05-21T10:29:45Z",
      "uptime_seconds": 86400
    }
  ]
}
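
The last_heartbeat field makes it possible to spot deployments that have gone quiet even when their status still reads online. The sketch below flags any deployment whose heartbeat is older than a cut-off; the 90-second cut-off and the function names are assumptions made for this example, not documented behavior.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp that ends in 'Z'."""
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11 on,
    # so the suffix is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def stale_deployments(deployments, now, max_age=timedelta(seconds=90)):
    """Return hostnames whose last heartbeat is older than max_age."""
    return [
        d["hostname"]
        for d in deployments
        if now - parse_utc(d["last_heartbeat"]) > max_age
    ]


# Trimmed copies of the entries in the sample response.
deployments = [
    {"hostname": "gpu-server-1.local", "last_heartbeat": "2026-05-21T10:29:30Z"},
    {"hostname": "inference-node-2.local", "last_heartbeat": "2026-05-21T10:29:45Z"},
]

# A fixed "now" keeps the example deterministic; real code would use
# datetime.now(timezone.utc).
now = datetime(2026, 5, 21, 10, 31, 5, tzinfo=timezone.utc)
print(stale_deployments(deployments, now))
```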

Slash Commands

Method | Endpoint | Description
GET | /api/slash-commands | List all custom slash commands
POST | /api/slash-commands | Create a new slash command
GET | /api/slash-commands/{id} | Get a specific slash command
PUT | /api/slash-commands/{id} | Update a slash command
DELETE | /api/slash-commands/{id} | Delete a slash command

# Create a slash command
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "summarize",
    "description": "Summarize the current conversation or document",
    "prompt": "Summarize the following content in 3-5 bullet points: {{input}}",
    "handler_type": "agent_task"
  }' \
  http://localhost:18800/api/slash-commands

# List slash commands
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/slash-commands

# Response
{
  "commands": [
    {
      "id": "cmd_abc123",
      "name": "summarize",
      "description": "Summarize the current conversation or document",
      "handler_type": "agent_task",
      "enabled": true
    }
  ]
}

Custom Providers

Method | Endpoint | Description
GET | /api/providers/custom | List custom providers
POST | /api/providers/custom | Save (create or update) a custom provider
DELETE | /api/providers/custom/{id} | Delete a custom provider

# Add a custom provider
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My vLLM Server",
    "base_url": "https://my-vllm.example.com/v1",
    "api_key": "my-api-key",
    "api_format": "openai",
    "auth_type": "bearer"
  }' \
  http://localhost:18800/api/providers/custom

# List custom providers
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:18800/api/providers/custom

Other Endpoints

Method | Endpoint | Description
GET | /api/environment | Get system environment information (OS, architecture, memory, etc.)
GET | /api/diagnostics | Run system diagnostics and return results
GET | /api/web-server/status | Get the web server's current status and configuration
GET | /api/health | Health check (always public, returns {"status":"ok"})
POST | /api/heartbeat/incoming | Receive heartbeat from remote deployment nodes (always public)
GET | /api/image-proxy | Proxy external images (for PDF export and CORS bypass)
GET | /api/dependencies/check | Check status of system dependencies (Docker, Node.js, Python, etc.)
GET | /api/design-systems | List available design system tokens and themes
GET | /charts/{filename} | Serve generated chart image files

WebSocket Endpoints

Path | Description
/ws/browser | Chrome extension browser automation commands and responses
/ws/sidepanel | Chrome extension side panel chat with real-time streaming

API Compatibility Layer

The aura-inference local inference server exposes multiple compatibility APIs, allowing it to act as a drop-in replacement for popular inference services.

OpenAI-Compatible API

Method | Path | Description
POST | /v1/chat/completions | Chat completion (streaming and non-streaming)
POST | /v1/completions | Text completion
POST | /v1/embeddings | Text embeddings
GET | /v1/models | List available models

# Chat completion (OpenAI format)
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "local-model",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'
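
With "stream": true, an OpenAI-style endpoint answers with server-sent events rather than a single JSON document. The sketch below shows how a client might reassemble the text, assuming aura-inference emits chunks in the same shape as OpenAI's streaming API (lines beginning with "data:", a choices[].delta.content field, and a final "data: [DONE]" marker). The sample lines are invented for illustration, and iter_stream_text is a hypothetical helper.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style streaming chunks."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separator lines and SSE comments carry no payload
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Invented sample of what the response body lines could look like.
sample_lines = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": " there!"}}]}',
    "",
    "data: [DONE]",
]

print("".join(iter_stream_text(sample_lines)))
```

In a real client the lines would come from the HTTP response object, decoded from bytes one line at a time.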

Anthropic-Compatible API

Method | Path | Description
POST | /v1/messages | Anthropic Messages API format

Ollama-Compatible API

Method | Path | Description
POST | /api/chat | Ollama chat format
POST | /api/generate | Ollama generate format
GET | /api/tags | List available models (Ollama format)
GET | /api/ps | List running models
POST | /api/show | Show model information

API Gateway

Aura Workshop also exposes an API gateway that lets external tools use it as a proxy for LLM requests. Point any OpenAI or Anthropic SDK at your Aura Workshop instance and it routes each request to the provider and model you have configured, applying spend limits, fallback, and logging automatically.

Method | Path | Description
POST | /v1/chat/completions | OpenAI-compatible chat completion (proxied to configured provider)
POST | /v1/messages | Anthropic-compatible messages (proxied to configured provider)
GET | /v1/models | List available models from all configured providers

License

Method | Endpoint | Description
POST | /api/license/validate | Validate and activate a license key
GET | /api/license/status | Get the current license status and tier

aura-cli (Go TUI)

Aura Workshop includes a standalone Go binary CLI (aura-cli v1.0.0) that provides a terminal user interface built with Bubbletea. The CLI connects to an Aura Workshop server instance for inference.

Usage

aura-cli [flags]

Flags

Flag | Long Form | Description | Default
-s | --server | Server URL to connect to | http://localhost:18800
-m | --model | Model to use for inference | (server default)
-t | --token | Authentication token for the server | (none)
(none) | --password | Password for authentication | (none)
-p | --project | Project path to attach as the working directory | (current directory)
(none) | --chat | Chat mode: no tool access, conversation only | (agent mode)
-v | --verbose | Verbose output for debugging | off
(none) | --version | Print version information and exit | n/a

Modes

Pipe Mode

The CLI supports piping input via stdin for non-interactive use. This is useful for scripting and CI/CD integration:

# Pipe a prompt directly
echo "Explain this error" | aura-cli -s http://my-server:18800 -t my-token

# Pipe a file as context
cat error.log | aura-cli -m claude-sonnet-4-20250514 -t my-token

# Use in shell scripts
git diff HEAD~1 | aura-cli --chat -m claude-sonnet-4-20250514 -t my-token \
  -s http://my-server:18800

In pipe mode, the CLI reads stdin until EOF, sends the content as the user message, and outputs the agent's response to stdout. This makes it easy to integrate Aura Workshop into existing command-line workflows and automation scripts.
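
Pipe mode also lends itself to being driven from another program. The following sketch wraps the CLI with Python's subprocess module; it uses only the flags listed in the table above, while the function names build_command and ask are hypothetical. Only the command list is printed here, so the example runs even where aura-cli is not installed.

```python
import subprocess
from typing import List, Optional


def build_command(server: str, token: str,
                  model: Optional[str] = None, chat: bool = False) -> List[str]:
    """Assemble the aura-cli argument list for a non-interactive call."""
    cmd = ["aura-cli", "-s", server, "-t", token]
    if model:
        cmd += ["-m", model]
    if chat:
        cmd.append("--chat")  # conversation only, no tool access
    return cmd


def ask(prompt: str, **options) -> str:
    """Send `prompt` on stdin and return whatever the CLI writes to stdout."""
    result = subprocess.run(
        build_command(**options),
        input=prompt,          # stdin is closed afterwards, which signals EOF
        capture_output=True,
        text=True,
        check=True,            # raise CalledProcessError on a non-zero exit
    )
    return result.stdout


command = build_command("http://my-server:18800", "my-token",
                        model="claude-sonnet-4-20250514", chat=True)
print(" ".join(command))
```

Calling ask("Explain this error", server="http://my-server:18800", token="my-token") would then behave like the first shell example above.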

Configuration File

Default settings can be stored in ~/.aura/config.json:

{
  "server": "http://localhost:18800",
  "token": "my-auth-token",
  "model": "claude-sonnet-4-20250514"
}
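
How values from the configuration file combine with command-line flags is not spelled out here. A common convention, assumed in the sketch below, is that explicit flags win over the file and the file wins over built-in defaults; load_settings is a hypothetical helper that illustrates this convention and is not taken from the aura-cli source.

```python
import json
import tempfile
from pathlib import Path

# Built-in defaults; the server URL matches the documented default.
DEFAULTS = {"server": "http://localhost:18800", "model": None, "token": None}


def load_settings(config_path: Path, flags: dict) -> dict:
    """Merge defaults, config file, and flags (ASSUMED precedence: flags last)."""
    settings = dict(DEFAULTS)
    if config_path.is_file():
        settings.update(json.loads(config_path.read_text(encoding="utf-8")))
    # A flag that was not given on the command line is represented as None.
    settings.update({key: value for key, value in flags.items() if value is not None})
    return settings


with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "config.json"  # stands in for ~/.aura/config.json
    config_path.write_text(
        json.dumps({"token": "my-auth-token", "model": "claude-sonnet-4-20250514"}),
        encoding="utf-8",
    )
    settings = load_settings(config_path, {"server": "http://my-server:18800", "model": None})

print(settings)
```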

Examples

# Connect to a remote server with auth
aura-cli -s http://my-server:18800 -t my-secret-token

# Use a specific model in chat mode
aura-cli -m claude-sonnet-4-20250514 --chat

# Attach a project directory
aura-cli -p /path/to/my/project

aura-inference CLI

The aura-inference binary can be used standalone from the command line for local model serving.

Serve Mode

aura-inference serve --model <path-to-model.gguf> --port 8080 [options]

Option | Description | Default
--model | Path to the GGUF model file | (required)
--port | HTTP server port | 8080
--gpu-layers | Number of layers to offload to GPU (-1 = all) | -1
--quantization | Quantization type (Q4_K_M, Q5_K_M, Q8_0, F16) | Q4_K_M
--ctx-size | Context window size in tokens | 4096
--batch-size | Inference batch size | 512
--flash-attn | Enable flash attention | disabled
--cache-type-k | KV cache key type (q8_0, f16) | q8_0
--cache-type-v | KV cache value type (q8_0, f16) | q8_0
--thinking | Enable thinking/reasoning mode | disabled
--rpc-workers | Comma-separated list of RPC workers (host:port) | (none)

aura-inference serve \
  --model ~/.cache/aura-inference/models/qwen3-8b-q4_k_m.gguf \
  --port 8080 \
  --gpu-layers -1 \
  --ctx-size 8192 \
  --batch-size 1024 \
  --flash-attn \
  --thinking

RPC Worker Mode

aura-inference rpc --host 0.0.0.0 --port 50052

Starts the binary in RPC worker mode, contributing GPU resources to a master node. The master specifies workers via the --rpc-workers flag with a comma-separated list of host:port pairs:

# Master with two RPC workers
aura-inference serve \
  --model ~/.cache/aura-inference/models/llama-3.3-70b-q4_k_m.gguf \
  --port 8080 \
  --gpu-layers -1 \
  --rpc-workers worker1.local:50052,worker2.local:50052

Each worker contributes its GPU VRAM to the inference cluster. The master distributes model layers proportionally across all available GPUs (local plus remote workers). This enables running models that are too large for any single GPU.
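
To make the idea of a proportional split concrete, the sketch below divides a layer count across devices according to their VRAM using largest-remainder rounding. This is an illustration of proportional allocation in general, not the algorithm aura-inference actually uses; the VRAM figures and the layer count are hypothetical.

```python
def split_layers(total_layers: int, vram_gb: dict) -> dict:
    """Assign layers to devices in proportion to VRAM (largest remainder)."""
    total_vram = sum(vram_gb.values())
    exact = {name: total_layers * vram / total_vram for name, vram in vram_gb.items()}
    shares = {name: int(value) for name, value in exact.items()}  # round down first

    # Hand the layers lost to rounding to the devices with the largest fractions.
    leftover = total_layers - sum(shares.values())
    by_fraction = sorted(exact, key=lambda name: exact[name] - shares[name], reverse=True)
    for name in by_fraction[:leftover]:
        shares[name] += 1
    return shares


# Hypothetical cluster: a 24 GB master plus 16 GB and 8 GB workers.
plan = split_layers(80, {"master": 24, "worker1.local": 16, "worker2.local": 8})
print(plan)
```

Every layer is assigned exactly once, and a device with twice the VRAM receives roughly twice the layers.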

aura-daemon CLI

The daemon binary is used inside Docker containers for headless operation.

aura-daemon --mode <mode>

Mode | Description
full | Complete daemon with all features enabled
inference-master | Inference cluster master node
inference-worker | Inference cluster worker node
worker | Task execution worker

The daemon reads all its configuration from environment variables. See Docker Daemon for the full environment variable reference.

Database Schema

Aura Workshop uses a single SQLite database in WAL (Write-Ahead Logging) mode with 35+ tables. The schema is created and migrated automatically on launch.

Table | Purpose
settings | Application configuration stored as key-value pairs
conversations | Chat conversation metadata (title, timestamps)
messages | Individual messages within conversations (role, content, timestamps)
tasks | Agent task metadata: status, model, project path, role, timestamps, classification
task_messages | Messages within agent tasks including tool calls and results
task_memory | Role handoff data, compaction summaries, and checkpoint state
memory_facts | LLM-extracted structured facts with category, confidence, scope, and content
scheduled_tasks | Schedule definitions: type, time, cron expression, duration, enabled state
listeners | Listener configurations: platform, token, trigger type, rules, prompt
listener_events | Event logs for listener activity: timestamp, message, user, response
webhooks | Webhook endpoint configurations: name, secret, prompt, model
webhook_logs | Invocation logs for webhooks: timestamp, request body, response
workflows | DAG workflow definitions: nodes, edges, metadata
workflow_runs | Workflow execution state: per-step status, timestamps, data context
mcp_servers | MCP server configurations: name, transport, command/URL, isolation mode
credentials | AES-256-GCM encrypted credential values with name and type metadata
credential_pool | Shared credential pools for team-level credential sharing
custom_providers | User-defined LLM provider endpoints: name, URL, API format, auth type
spend_limits | Per-provider monthly spend caps
token_usage | Per-request token usage and cost tracking: model, provider, input/output tokens, cost
cloud_connectors | AWS S3, GCS, Azure Blob connector configurations
teams | Multi-agent team definitions: name, roles, workflow type
skill_settings | Per-skill configuration overrides
projects | Project definitions: name, path, metadata
plugins | Plugin configurations: name, version, enabled state, settings
slash_commands | Custom slash command definitions: name, description, prompt, handler type
design_systems | Design system token definitions for design skills
remote_deployments | Remote agent deployment records: host, mode, status, pairing code
task_files | Files created or modified by agent tasks: path, operation, task ID
routing_logs | Smart routing decision logs: task ID, tier, model, score
provider_health | Provider health tracking: latency, error count, circuit breaker state
ollama_models | Cached Ollama model metadata
aura_models | Local GGUF model metadata from scanning
model_pricing | Input/output pricing per million tokens for each model
fallback_order | Provider fallback priority sequence

Network Ports Reference

Port | Protocol | Service | Notes
18800 | TCP | Web UI + REST API | Main application port. Serves the SolidJS frontend and all REST API endpoints.
18801 | UDP | LAN inference discovery | Broadcast/listen for inference cluster nodes. 30-second interval, 90-second expiry.
18790 | TCP | Webhook receiver | Incoming webhook payloads from external services.
8080 | TCP | Aura AI local inference | Local GGUF model inference server. Configurable port.
11434 | TCP | Ollama | Default Ollama server port. Auto-detected by Aura Workshop.
50052 | TCP | RPC (distributed inference) | Worker-to-master RPC for distributed model layer inference.
1420 | TCP | Vite dev server | Development only. Used when running the frontend in development mode.
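
The discovery timings on port 18801 imply that a node is dropped after missing roughly three consecutive broadcasts (90 seconds of silence at a 30-second interval). The sketch below models that bookkeeping with a small registry; the class and method names are hypothetical and the code does not open any sockets.

```python
from dataclasses import dataclass, field

BROADCAST_INTERVAL_S = 30  # how often a node announces itself
EXPIRY_S = 90              # silence after which a node is forgotten


@dataclass
class NodeRegistry:
    """Tracks when each discovered node was last heard from."""
    last_seen: dict = field(default_factory=dict)

    def heard_from(self, node: str, now: float) -> None:
        self.last_seen[node] = now

    def live_nodes(self, now: float) -> list:
        # Drop nodes that have been silent for longer than the expiry window.
        self.last_seen = {
            node: seen for node, seen in self.last_seen.items()
            if now - seen <= EXPIRY_S
        }
        return sorted(self.last_seen)


registry = NodeRegistry()
registry.heard_from("gpu-b", now=0)                  # announces once, then goes silent
for tick in range(0, 61, BROADCAST_INTERVAL_S):      # gpu-a announces at 0, 30, 60
    registry.heard_from("gpu-a", now=tick)

print(registry.live_nodes(now=100))
```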

Data Storage Locations

Data | Location
Database (macOS) | ~/Library/Application Support/aura-workshop/aura-workshop.db
Database (Linux) | ~/.local/share/aura-workshop/aura-workshop.db
Database (Windows) | %APPDATA%\aura-workshop\aura-workshop.db
Database (Docker) | /data/aura-workshop.db
User memory | ~/.aura/memory/
Custom roles | ~/.aura/roles/
GGUF models | ~/.cache/aura-inference/models/
HuggingFace cache | ~/.cache/huggingface/hub/
Project memory | .aura/memory/ (relative to project root)
Project instructions | AURA.md or CLAUDE.md (in project root)
CLI config | ~/.aura/config.json
Skills | Bundled in app resources; custom skills in the database
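
Scripts that need to find the database (for backups, for example) can derive its path from the table above. The helper below is a sketch: the function name database_path is hypothetical, the platform strings follow Python's sys.platform values, and the in_docker switch is an assumption because a container cannot be recognized from the platform string alone.

```python
from pathlib import PurePosixPath, PureWindowsPath


def database_path(platform: str, home: str, appdata: str = "",
                  in_docker: bool = False) -> str:
    """Return the documented database location for a platform."""
    if in_docker:
        return "/data/aura-workshop.db"
    if platform == "darwin":
        return str(PurePosixPath(home) / "Library" / "Application Support"
                   / "aura-workshop" / "aura-workshop.db")
    if platform.startswith("linux"):
        return str(PurePosixPath(home) / ".local" / "share"
                   / "aura-workshop" / "aura-workshop.db")
    if platform == "win32":
        # appdata is the value of the %APPDATA% environment variable.
        return str(PureWindowsPath(appdata) / "aura-workshop" / "aura-workshop.db")
    raise ValueError(f"unsupported platform: {platform}")


print(database_path("darwin", "/Users/me"))
print(database_path("linux", "/home/me"))
print(database_path("win32", "", appdata=r"C:\Users\me\AppData\Roaming"))
```

In a real script the arguments would come from sys.platform, Path.home(), and os.environ.get("APPDATA", "").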

License Tiers

Tier | Features
Community | Core features: single-agent tasks, basic automation (schedules, listeners, webhooks), local inference via Aura AI and Ollama, all built-in tools, all skills, web UI access.
Professional | Everything in Community plus: multi-agent teams, advanced automation rules, all 31 listener platforms, smart routing, spend tracking and limits.
Business | Everything in Professional plus: visual DAG workflow editor, remote agent deployment, inference cluster management, advanced merge strategies.
Enterprise | Everything in Business plus: credential pools, priority support, custom integrations, SLA guarantees.

The application works fully in Community mode without a license key. Enter a license key in Settings > Security to unlock higher tiers.

License Activation

To activate a license:

  1. Go to Settings > Security.
  2. Enter your license key in the License Key field.
  3. Click Activate. The system validates the key against the license server.
  4. On successful validation, the license tier and expiration date are displayed.
  5. Features for the new tier become available immediately without restart.

License validation can also be performed via the REST API: POST /api/license/validate with the key in the request body. Check current license status at any time with GET /api/license/status.

Feature Comparison

Feature | Community | Professional | Business | Enterprise
Single-agent tasks | Yes | Yes | Yes | Yes
Built-in tools (50+) | Yes | Yes | Yes | Yes
All skills (100+) | Yes | Yes | Yes | Yes
Local inference (Aura AI) | Yes | Yes | Yes | Yes
Web UI access | Yes | Yes | Yes | Yes
Basic automation | Yes | Yes | Yes | Yes
Multi-agent teams | -- | Yes | Yes | Yes
Smart routing | -- | Yes | Yes | Yes
All 31 listener platforms | -- | Yes | Yes | Yes
Spend tracking & limits | -- | Yes | Yes | Yes
Visual DAG workflows | -- | -- | Yes | Yes
Remote deployment | -- | -- | Yes | Yes
Inference cluster | -- | -- | Yes | Yes
Credential pools | -- | -- | -- | Yes
Priority support | -- | -- | -- | Yes
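
Because every tier includes everything in the tier below it, the comparison reduces to a minimum tier per feature. The sketch below encodes a few rows of the table that way; the data structure and the function name is_available are illustrative and say nothing about how the application enforces licensing.

```python
# Tiers in ascending order; each one includes all features of those before it.
TIERS = ["Community", "Professional", "Business", "Enterprise"]

# Lowest tier at which each feature appears in the comparison table.
MIN_TIER = {
    "Single-agent tasks": "Community",
    "Multi-agent teams": "Professional",
    "Smart routing": "Professional",
    "Visual DAG workflows": "Business",
    "Remote deployment": "Business",
    "Inference cluster": "Business",
    "Credential pools": "Enterprise",
}


def is_available(feature: str, tier: str) -> bool:
    """True when `tier` is at or above the feature's minimum tier."""
    return TIERS.index(tier) >= TIERS.index(MIN_TIER[feature])


for feature in ("Smart routing", "Inference cluster", "Credential pools"):
    unlocked = [tier for tier in TIERS if is_available(feature, tier)]
    print(f"{feature}: {', '.join(unlocked)}")
```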

Troubleshooting & FAQ

macOS quarantine prevents launch

After installing from DMG, macOS may quarantine the application. You will see a message like "Aura Workshop cannot be opened because it is from an unidentified developer." Open Terminal and run:

xattr -cr /Applications/Aura\ Workshop.app

Then launch the app again from Applications or Spotlight.

No models appear in the model selector

Task stuck in "Executing" state

Web UI not accessible from another machine

Context window errors or truncation

Docker mode not available

Local inference not starting

Inference cluster workers not discovered

MCP server connection failures

Agent blocked from running a command

Provider rate limit errors (HTTP 429)

Database corruption or issues

How do I use Aura Workshop without any internet connection?

Download one or more GGUF models via the HuggingFace Downloader while you have internet access. Then use the Aura AI local inference engine to run those models entirely on your hardware. No internet connection is required for local inference, and all data stays on your machine.

Listener not receiving messages from a platform

Webhook not triggering tasks

Schedule not running at the expected time

Voice input not working

Smart routing not saving money as expected

Memory facts not being extracted

AURA.md not being detected

How do I update to the latest version?

Go to Settings > Updates and click "Check for Updates". If a new version is available, click "Download and Install". The application downloads the update, verifies its integrity, and restarts automatically. For Docker deployments, pull the latest image: docker pull coolkoo/aura-workshop:daemon-latest.

Can I use Aura Workshop as an API server for other applications?

Yes. The API Gateway feature exposes OpenAI-compatible and Anthropic-compatible endpoints at /v1/chat/completions, /v1/messages, and /v1/models. Point any SDK or tool at your Aura Workshop instance and it will proxy requests to your configured provider with automatic fallback, spend limits, and logging.

How do I back up my Aura Workshop data?

All application data is stored in a single SQLite database file. To create a backup:

  1. Locate the database file for your platform (see Data Storage Locations).
  2. Copy the aura-workshop.db file to a safe location. Because the database uses WAL mode, also copy aura-workshop.db-wal and aura-workshop.db-shm if they exist.
  3. Optionally back up the ~/.aura/ directory to preserve user memories, custom roles, and CLI configuration.
  4. For Docker deployments, the /data volume mount contains the database.
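
The copy steps above can be scripted. The following is a minimal sketch that copies the database together with its -wal and -shm side files when they exist; backup_database is a hypothetical helper, and the demo runs against throwaway files rather than a real installation. Copying while the application is writing can capture an inconsistent state, so closing Aura Workshop first is the safer choice.

```python
import shutil
import tempfile
from pathlib import Path


def backup_database(db_file: Path, dest_dir: Path) -> list:
    """Copy the database and any WAL/SHM side files into dest_dir."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for suffix in ("", "-wal", "-shm"):
        source = db_file.with_name(db_file.name + suffix)
        if source.exists():
            shutil.copy2(source, dest_dir / source.name)  # copy2 keeps timestamps
            copied.append(source.name)
    return copied


# Demo with placeholder files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "aura-workshop"
    data_dir.mkdir()
    (data_dir / "aura-workshop.db").write_bytes(b"main database")
    (data_dir / "aura-workshop.db-wal").write_bytes(b"write-ahead log")

    copied = backup_database(data_dir / "aura-workshop.db", Path(tmp) / "backup")
    backup_names = sorted(p.name for p in (Path(tmp) / "backup").iterdir())

print(copied)
```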

How do I migrate Aura Workshop to a new machine?

  1. Install Aura Workshop on the new machine.
  2. Copy the database file (plus aura-workshop.db-wal and aura-workshop.db-shm, if they exist) from the old machine to the appropriate location on the new machine.
  3. Copy the ~/.aura/ directory for memories and roles.
  4. Copy the ~/.cache/aura-inference/models/ directory if you want to keep downloaded GGUF models.
  5. Launch Aura Workshop on the new machine. It will read the existing database and restore your configuration.
  6. Note: credential encryption keys are stored in the OS keychain and cannot be migrated. You will need to re-enter API keys and credentials on the new machine.

Can multiple users share one Aura Workshop instance?

Yes. The web UI can be accessed by multiple users simultaneously. Each user creates their own tasks, and all tasks are visible to all users. For multi-user environments, configure a Bearer token for authentication. Note that there is no per-user access control in the current version; all authenticated users have full access to all features and data.

What happens when the context window fills up?

When the conversation context reaches 70% of the model's maximum context window, automatic compression kicks in. The system summarizes older messages into a condensed form, preserving the most important information while freeing up context space. This process is transparent and the task continues without interruption. The context usage percentage is visible in the Context Panel on the right sidebar.
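
As a worked example of the 70% rule, the sketch below computes context usage and reports whether compression would be due. It assumes that "reaches 70%" means greater than or equal to 70% of the window; the function names and token counts are illustrative.

```python
COMPRESSION_THRESHOLD = 0.70  # fraction of the context window, from the text above


def context_usage(used_tokens: int, max_tokens: int) -> float:
    """Fraction of the model's context window currently in use."""
    return used_tokens / max_tokens


def needs_compression(used_tokens: int, max_tokens: int) -> bool:
    # ASSUMPTION: the threshold is inclusive.
    return context_usage(used_tokens, max_tokens) >= COMPRESSION_THRESHOLD


# With an 8192-token window the trigger point is 0.70 * 8192 = 5734.4 tokens,
# so 5734 tokens stay below it and 5735 tokens cross it.
for used in (4096, 5734, 5735):
    print(used, f"{context_usage(used, 8192):.1%}", needs_compression(used, 8192))
```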

How do parallel agents handle file conflicts?

When multiple parallel agents modify the same file, the merge executor resolves conflicts using the configured strategy: CopyOnWrite (creates side-by-side copies, safest option), LastWins (last agent's version prevails), or LLMResolve (an LLM intelligently merges the changes). The merge also handles dependency files (package.json, Cargo.toml, requirements.txt) with smart dependency merging that combines packages from all agents.
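
To illustrate what combining packages from all agents can look like, the sketch below merges several versions of a requirements.txt file into one. It is not the merge executor's implementation: the function name is hypothetical, and the rule that the first agent's version pin wins when two agents disagree is an assumption made to keep the example short.

```python
import re


def merge_requirements(*versions: str) -> str:
    """Union the packages from several requirements.txt contents."""
    merged = {}
    for text in versions:
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blanks and comments
            # The package name ends at the first version operator, extras
            # bracket, environment marker, or space.
            name = re.split(r"[<>=!~\[; ]", line, maxsplit=1)[0].lower()
            merged.setdefault(name, line)  # ASSUMPTION: first pin wins
    return "\n".join(merged[name] for name in sorted(merged)) + "\n"


agent_a = "fastapi==0.110.0\npytest\n"
agent_b = "# added by agent B\npytest\nhttpx>=0.27\n"

merged_text = merge_requirements(agent_a, agent_b)
print(merged_text, end="")
```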

Supported Providers Summary

Provider | API Format | Auth Type | Free Tier
Anthropic | Anthropic native | API key header | No
OpenAI | OpenAI | Bearer token | No
Google | Google Gemini native | Query parameter | Yes (Flash, Flash-Lite)
MiniMax | OpenAI-compatible | Bearer token | No
DeepSeek | OpenAI-compatible | Bearer token | No
Mistral AI | OpenAI-compatible | Bearer token | No
Zhipu AI (z.ai) | OpenAI-compatible | Bearer token | Yes
Xiaomi / MiMo | OpenAI-compatible | Bearer token | No
Moonshot / Kimi | OpenAI-compatible | Bearer token | No
OpenRouter | OpenAI-compatible | Bearer token | Yes (many free models)
Together AI | OpenAI-compatible | Bearer token | No
Groq | OpenAI-compatible | Bearer token | Yes (rate-limited)
SiliconFlow | OpenAI-compatible | Bearer token | No
Ollama | OpenAI-compatible | None | Free (local)
LM Studio | OpenAI-compatible | None | Free (local)
LocalAI | OpenAI-compatible | None | Free (local)
vLLM | OpenAI-compatible | None | Free (local)
TGI | OpenAI-compatible | None | Free (local)
SGLang | OpenAI-compatible | None | Free (local)
Aura AI | OpenAI-compatible | None | Free (bundled)