Aura Workshop Usage Guide
Complete reference for setting up, configuring, and using Aura Workshop — from first launch to advanced multi-agent orchestration.
First Launch
When you open Aura Workshop for the first time, the app creates a local SQLite database at:
- macOS: `~/Library/Application Support/aura-workshop/aura-workshop.db`
- Windows: `%APPDATA%\aura-workshop\aura-workshop.db`
All settings, conversations, listener configurations, and scheduled tasks are stored in this database.
On first launch the app also:
- Detects whether Docker is available. If Docker is not installed or not running, the app enables native mode automatically, which runs commands directly on the host system.
- Installs bundled skills (PDF, DOCX, XLSX, PPTX, and various development workflow skills) to the skills directory.
- Initializes the credential encryption system, generating an AES-256-GCM key and storing it in the system keychain (macOS Keychain, Windows Credential Manager).
- Starts the Web UI server on port 18800 (configurable), serving the full interface via HTTP for browser access.
- Auto-starts any listeners that were previously running.
API Key Setup
- Open Settings by clicking the gear icon in the sidebar or top toolbar.
- Select a provider tab (Anthropic, OpenAI, Google, Ollama, etc.).
- Enter your API key in the key field. The key is encrypted before being stored in the database.
- Select a model from the dropdown or type a custom model ID.
- The base URL is filled automatically based on the provider preset but can be overridden.
Each provider stores its API key independently. When you switch between providers, the app loads the previously saved key for that provider.
Local providers (Ollama, Aura AI, LocalAI, vLLM, TGI, SGLang) do not require an API key. Select one of these providers and ensure the corresponding inference server is running locally.
Web UI (Browser Access)
Aura Workshop includes an embedded axum HTTP server that serves the full SolidJS UI via any web browser. This enables headless Linux server deployment and remote access from any device on the network.
Accessing the Web UI
The Web UI server auto-starts on port 18800 by default. Open your browser and navigate to:
http://<machine-ip>:18800
The actual machine IP address is displayed in Settings > System Management > Web UI Server.
Configuration
Configure the Web UI server in Settings > System Management > Web UI Server:
| Setting | Description | Default |
|---|---|---|
| Enabled | Toggle the Web UI server on/off | On |
| Port | HTTP port for the web server | 18800 |
| Auth Token | Optional Bearer token for authentication | Empty (no auth) |
Settings are persisted in the database and survive app restarts.
Authentication
When an auth token is configured, all requests to the Web UI must include it:
- As an `Authorization: Bearer <token>` header
- Or via a login prompt in the browser
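For example, a script can attach the token to requests against the Web UI server (the IP and token below are placeholders for your own values):

```python
import urllib.request

BASE_URL = "http://192.168.1.50:18800"  # replace with your machine's IP and port
TOKEN = "my-secret-token"               # the auth token configured in Settings

# Attach the Bearer token to a request against the Web UI server root.
req = urllib.request.Request(
    BASE_URL + "/", headers={"Authorization": f"Bearer {TOKEN}"}
)
# urllib.request.urlopen(req) would then send the authenticated request.
```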
How It Works
The browser-based UI uses the same SolidJS frontend with a web transport layer instead of Tauri IPC:
- REST endpoints replace Tauri `invoke()` calls for all commands
- Server-Sent Events (SSE) replace Tauri event listeners for real-time streaming
- Lazy Tauri imports -- the frontend detects whether it is running inside Tauri or a browser and loads the appropriate transport
All features work identically in browser mode: agent tasks, chat, settings, listeners, webhooks, schedules, skills, MCP servers, and teams.
Headless Server Usage
For headless Linux servers without a desktop environment:
- Install the `.deb` or `.AppImage` package
- Launch with a virtual display: `xvfb-run aura-workshop`
- Access the full UI at `http://<server-ip>:18800`
- Configure API keys, models, and all settings through the browser
Creating and Running Tasks
Starting a New Task
- Type your request in the input field at the bottom of the main panel.
- Press Enter or click the send button.
- The agent creates a plan, then executes it step by step using available tools.
- Progress is shown in real time: you see the agent's text output, tool invocations, and results as they happen.
How Task Classification Works
When you send a prompt, Aura's classification system analyzes it and routes to the right execution path:
| Classification | What Happens | Example |
|---|---|---|
| SINGLE | One agent handles it directly | Simple questions, code scripts, file edits |
| CLARIFY | Asks clarifying questions, pauses for your answer | Vague requests like "help me with my project" |
| TEAM:N | Uses an existing team's workflow | Routes to Software Dev Team, Content Writing Team, etc. |
| WORKFLOW:N | Reuses a previously saved workflow | Same prompt pattern as a prior task |
| NEW | Creates a new team with specialized roles | Complex tasks needing multiple specialists |
You don't need to pick a team or workflow manually -- just describe what you want and the system figures out the best approach.
Sample Prompts (Validated Test Scenarios)
These prompts have been tested end-to-end and demonstrate Aura's core capabilities.
1. Simple Question (SINGLE)
What are the three laws of thermodynamics? One sentence each.
What happens: Single agent answers directly in 1 turn. No tools, no team. Status: completed.
2. Code Generation (SINGLE)
Write a Python function called is_palindrome that checks if a string reads the same forwards and backwards. Save it to palindrome.py
What happens: Single agent writes the code, saves the file, optionally runs tests. You'll see write_file and bash tool calls in the sidebar.
3. Clarification (CLARIFY)
Help me with my project
What happens: The system detects the request is too vague. Instead of guessing, it asks 3-5 specific clarifying questions. Task status shows "Needs Response" in amber. Reply with details and the system re-classifies your answer -- if it needs a team, it creates one automatically.
4. Team Workflow -- Software Development (TEAM)
Build a full REST API for a todo list app with CRUD endpoints, database schema, and unit tests
What happens: Routes to the Software Dev Team (5 roles: Product Manager, Architect, Developer, QA Engineer, DevOps). The PM may ask clarifying questions -- the workflow pauses until you answer. The Workflow Progress panel shows each role's status. Click any node to see that agent's specific output and tool calls.
5. Team Workflow -- Content Writing (TEAM)
Research and write a 2000 word blog post about the future of AI agents in enterprise software
What happens: Routes to the Content Writing Team (3 roles: Research Lead, Writer, Editor). Each role produces its deliverable sequentially. The final blog post is saved as a markdown file.
6. Scheduled Task (SCHEDULE)
Every Monday at 9am, compile a summary of all git commits from the past week and email it to [email protected]
What happens: Classification detects the scheduling intent. A schedule is created and appears in the SCHEDULED section of the sidebar. The agent runs the task immediately as a first execution. The schedule fires automatically at the configured time.
7. New Team Creation (NEW)
Design and build a data ingestion pipeline that reads CSV files, validates the data, transforms it, and loads it into PostgreSQL
What happens: No existing team matches, so the classification creates a new pipeline team (e.g., Data Architect, Developer, DevOps Engineer) in one shot. The team is saved for future reuse. A workflow runs with all agents producing files in your mounted folder.
8. Workflow Reuse (TEAM/WORKFLOW)
Build a REST API for a bookmark manager with full CRUD and tests
What happens: The system recognizes this is similar to Test 4 and reuses the Software Dev Team. A new workflow is saved for this specific task. No duplicate teams created.
9. Translation with Parallel Roles (NEW)
Translate this product documentation into Spanish, French, and German simultaneously, then have each reviewed by a native speaker
What happens: Creates a Localization Team with parallel workflow -- translators run simultaneously, then reviewers check each language. The Workflow Progress panel shows parallel nodes running at the same time.
10. Newsletter + Schedule (NEW + SCHEDULE)
Every Friday at 5pm, research the top AI news, write a newsletter with 5 sections, and email it to [email protected]
What happens: Creates a new newsletter team AND a recurring schedule. Both appear in the sidebar immediately after the task completes. The first newsletter is generated right away.
11. Listener + Automation (NEW + LISTENER)
When a customer asks about pricing on WhatsApp, look up their account, generate a personalized quote, and reply plus send a follow-up email to sales
What happens: Creates a Customer Support team, a listener (WhatsApp, disabled until auth is configured), and a workflow. The listener appears in the LISTENERS section. WhatsApp listeners start disabled because they require authentication setup.
Chat Mode
In addition to the full Agent mode, Aura Workshop provides a lightweight Chat mode for quick questions and conversations that do not require agent tooling.
Switching Modes
Toggle between Agent and Chat tabs in the sidebar. Agent mode gives you the full tool-calling agent loop; Chat mode gives you a fast, conversational interface.
Features
- Markdown rendering -- Responses are rendered with full markdown support including syntax-highlighted code blocks, tables, and one-click copy buttons on code snippets.
- Model picker -- The model selector dropdown works the same way as in Agent mode. Any configured provider and model can be used.
- Streaming responses -- Text streams in token-by-token with a blinking cursor while the model generates.
- Tools toggle -- An optional tools toggle lets you enable chat-with-tools, giving the Chat mode access to the same tools the agent uses without the full agentic loop.
- Conversation history -- Create new chat conversations, select previous ones from the sidebar, or delete conversations you no longer need.
How the Agent Works
The agent operates in a loop of up to 50 turns (configurable). Each turn:
- Sends the conversation history and available tools to the LLM.
- Receives a response that may contain text and/or tool calls.
- Executes any tool calls (file reads, writes, bash commands, etc.).
- Feeds tool results back to the LLM for the next turn.
The loop ends when the LLM responds with only text (no tool calls), or the maximum turn count is reached.
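The turn loop described above can be sketched in Python; `llm` and `run_tool` are hypothetical stand-ins, not Aura APIs:

```python
MAX_TURNS = 50  # configurable in Aura Workshop

def agent_loop(messages, tools, llm, run_tool):
    """Minimal sketch of the turn loop; `llm` and `run_tool` are stand-ins."""
    for _ in range(MAX_TURNS):
        reply = llm(messages, tools)              # 1. send history + tools
        messages.append(reply)
        if not reply.get("tool_calls"):           # text-only reply ends the loop
            return reply["text"]
        for call in reply["tool_calls"]:          # 2-3. execute each tool call
            result = run_tool(call)
            messages.append({"role": "tool", "content": result})  # 4. feed back
    return "max turns reached"
```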
Available Tools
In native mode (default when Docker is unavailable):
| Tool | Description |
|---|---|
| `read_file` | Read file contents |
| `write_file` | Create or overwrite a file |
| `edit_file` | Make targeted edits to a file |
| `bash` | Execute shell commands on the host |
| `glob` | Find files by pattern |
| `grep` | Search file contents with regex |
| `list_dir` | List directory contents |
| `web_fetch` | Fetch a web page and return clean text |
In Docker mode, three additional tools are available:
| Tool | Description |
|---|---|
| `docker_run` | Run commands in Docker containers |
| `docker_list` | List running containers |
| `docker_images` | List available images |
Device tools (opt-in via Settings):
| Tool | Description |
|---|---|
| `system_notify` | Send a system notification |
| `screen_capture` | Capture a screenshot (macOS) |
| `camera_capture` | Take a photo via webcam (macOS, requires imagesnap) |
Chat Commands
Type these slash commands in the message input to control the agent session without sending a message to the LLM.
| Command | Description |
|---|---|
| `/status` | Show current model, provider, execution mode, and thinking level |
| `/new` or `/reset` | Clear the conversation and start fresh |
| `/compact` | Summarize the conversation history to reduce token usage. The agent compresses all previous messages into a summary and keeps only the last user message. |
| `/think <level>` | Set the thinking/reasoning level. Valid levels: `off`, `low`, `medium`, `high`. Without an argument, shows the current level. |
| `/usage` | Show token usage for the current session: input tokens, output tokens, total, and estimated cost |
| `/tools` | List all available tools, both native and MCP |
| `/help` | Show the help table of all available commands |
Mounted Folders
The mount folder button (folder icon next to Send) tells the agent where to work on your filesystem. Always mount a folder when you expect the agent to create or modify files.
How to Use
- Click the folder icon in the input toolbar.
- A native file dialog opens. Select one or more directories.
- Selected folders appear as a "MOUNTED FOLDERS" badge above the input field.
- When you send a message, the selected paths are passed as the `project_path` to the agent.
- In Docker mode, these paths are bind-mounted into the container at `/workspace`.
- In native mode, the first path is used as the working directory for bash commands.
When to Mount
| Scenario | Mount folder? |
|---|---|
| "Build me a new project" | Yes -- mount where you want the project created (e.g., ~/Desktop/test) |
| "Read this codebase and refactor it" | Yes -- mount the project root |
| "What's the capital of France?" | No -- no file access needed |
| Running a team task from the dropdown | Yes -- mount the output directory so all roles write there |
| Agent creating + running a team | Yes -- the agent passes the mounted path to `platform_run_team_task` |
Important Notes
- Always mount a folder when you expect file creation or modification -- otherwise the agent may write to a random location or your home directory.
- You can mount multiple folders -- useful when a task needs to reference one project and write to another.
- Paths are per-session -- they reset when you start a new task. You need to re-mount when starting fresh.
- For team tasks, every role in the workflow inherits the same mounted path. All roles (PM, Developer, Reviewer, etc.) read from and write to the same directory.
- You can remove individual paths by clicking the remove button next to each one.
- Duplicates are automatically filtered out.
- The agent can only read and write files within the mounted folders and the user's home directory. Paths outside these areas may be inaccessible depending on OS permissions.
How Aura Workshop Orchestrates Work
Aura Workshop has a built-in AI orchestration engine. Instead of doing everything as a single agent, the system can automatically route complex tasks to multi-agent teams, create automation workflows, set up scheduled tasks, and wire triggers -- all from natural language prompts.
How It Works
When you send a prompt, the system decides the best approach:
| What you ask | What happens | Example |
|---|---|---|
| Simple one-off task | Single agent handles it directly | "What time is it in Tokyo?" |
| Complex multi-step project | Auto-routes to a multi-agent team | "Build me a REST API for a todo app" |
| Recurring task | Agent creates a scheduled task | "Every morning at 8am, check if my website is up" |
| Event-driven automation | Agent creates a workflow with triggers | "When a GitHub webhook fires, run tests and deploy" |
| Messaging automation | Agent creates a listener | "Set up a WhatsApp bot that answers pricing questions" |
Auto-Routing
For complex tasks, the system automatically detects the best team:
- Prompts containing "build", "create", "develop", "app", "api", etc. auto-route to the Software Dev Team
- Prompts containing "write", "blog", "article", "report", etc. auto-route to the Content Writing Team
- Simple prompts (< 5 words) or questions stay as single-agent tasks
This happens at the application level -- no model cooperation required.
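As a rough sketch, the keyword routing described above might look like the following. The keyword sets and team names mirror this section's examples; the function itself is illustrative, not Aura's actual classifier:

```python
# Illustrative keyword router; the real application-level logic may differ.
DEV_KEYWORDS = {"build", "create", "develop", "app", "api"}
WRITING_KEYWORDS = {"write", "blog", "article", "report"}

def auto_route(prompt: str) -> str:
    tokens = prompt.lower().split()
    if len(tokens) < 5 or prompt.strip().endswith("?"):
        return "single-agent"          # short prompts and questions stay single
    words = set(tokens)
    if words & DEV_KEYWORDS:
        return "Software Dev Team"
    if words & WRITING_KEYWORDS:
        return "Content Writing Team"
    return "single-agent"
```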
Model Recommendations
Orchestration features (creating teams, workflows, schedules via natural language) work best with capable cloud models:
| Model | Single Agent | Team Execution | Workflow Creation | Orchestration |
|---|---|---|---|---|
| Claude (Anthropic) | Excellent | Excellent | Excellent | Excellent |
| GPT-4 (OpenAI) | Excellent | Excellent | Good | Good |
| DeepSeek Chat | Good | Good | Limited | Limited |
| Ollama (local) | Good | Good | Poor | Poor |
Local models work well for single-agent tasks and executing pre-configured teams. For creating new workflows and orchestrating complex automations via natural language, use a cloud model.
Multi-Agent Teams
Teams define multiple AI roles that work together. Each role runs as a separate agent with its own prompt, and the workflow engine manages execution order, parallel processing, and data passing between roles.
Default Teams
Software Dev Team (5 roles, fan-out enabled):
- Product Manager → Architect → Developer (fan-out) → QA Engineer → DevOps
- The Developer role uses fan-out: the Architect produces a task list, and one developer agent spawns per task in parallel
Content Writing Team (3 roles, sequential):
- Research Lead → Writer → Editor
Using Teams
Automatic -- just describe what you need. Complex tasks auto-route to the matching team.
Manual via Settings -- create or edit teams in Settings > Teams with the visual workflow editor.
Via natural language -- ask the agent to create a team.
Creating and Editing Teams
- Open Settings > Teams and click Create Team.
- Add roles -- each role needs a name and a system prompt.
- Choose a workflow type: Sequential or Pipeline (with validation gates).
- Use the Workflow Editor to customize: add Script, Webhook, Validate, or Approval Gate steps between roles.
- Enable Fan-Out on any role by double-clicking the node and checking "Enable Fan-Out".
- Import/Export -- click Import to load a team JSON, or Export on any team to download it.
Workflow Pause & Resume
When the first agent in a team workflow (typically the PM) asks clarifying questions, the workflow pauses automatically:
- The PM asks questions (e.g., "What database do you prefer?")
- Task status changes to "Needs Response"
- You type your answer in the input box
- The workflow resumes from where it paused, feeding your answer to the next agents
This ensures you get exactly the product you want instead of the agent guessing.
Fan-Out (Parallel Agents)
Fan-Out lets a role automatically spawn multiple agents in parallel -- one per item from an upstream role's output list.
How to enable: Double-click a role node in the Workflow Editor → check "Enable Fan-Out" → set Source Node and Max Parallel Agents.
How it works: The source role produces a numbered list. The fan-out executor detects the list, splits it, and spawns one agent per item. Results are merged for the next role.
| Team Type | Source Role Produces | Fan-Out Role Does |
|---|---|---|
| Software Dev | Architect lists tasks | One developer per task |
| Content Writing | Research Lead lists sections | One writer per section |
| Research | Lead lists questions | One researcher per question |
| Translation | Manager lists languages | One translator per language |
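The list-splitting step behind fan-out can be sketched as follows; the regex and function are illustrative, not Aura's actual parser:

```python
import re

def split_numbered_list(text: str, max_parallel: int = 4) -> list:
    """Extract '1. ...' items from an upstream role's output (illustrative)."""
    items = re.findall(r"^\s*\d+\.\s+(.+)$", text, flags=re.MULTILINE)
    return items[:max_parallel]  # cap at the configured Max Parallel Agents

architect_output = """Tasks:
1. Implement the database schema
2. Build the CRUD endpoints
3. Write unit tests
"""
print(split_numbered_list(architect_output))
```

Each extracted item would then seed one parallel agent, and the merged results feed the next role.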
Workflow Progress
When a team runs, the right panel shows:
- All roles with status icons (pending, running, completed, failed)
- Fan-out sub-agents shown as indented items under the parent role
- Click any role to view only that role's output in the main panel
- Click "Show All" to return to the combined view
- Progress bar with completion count
Role Guardrails
Every agent in a team workflow automatically receives system-enforced rules:
- First role (entry node): CAN ask clarifying questions, but must ask ONLY questions and stop -- no deliverable alongside questions
- Middle/downstream roles: Cannot ask questions -- must work with what the previous role provided
- All roles: Cannot do other roles' jobs, must stop after delivering their files, cannot loop or re-check their own work
These guardrails are injected at the executor level and apply to every team, including user-created ones.
Automation Workflows
Automation workflows are pipelines that orchestrate triggers, conditions, scripts, webhooks, and teams. Unlike teams (which are multi-agent role-based), workflows handle the plumbing: when to run, what data to route, which conditions to check.
Creating Workflows
Via natural language -- describe the automation you need.
The agent creates the workflow using `platform_create_workflow`, connecting script nodes, conditional nodes, team nodes, and webhook nodes.
Via Settings -- go to Settings > Workflows > Create Workflow. Use the visual editor to add nodes, connect them, and configure each one.
Via Import -- click Import on the Workflows tab to load a workflow JSON file.
Available Node Types
| Node | Type | Description |
|---|---|---|
| Agent Task | agent-task | LLM agent with tools |
| Team | team | Runs a saved multi-agent team as a step |
| Script | script | Runs bash, Python, Node.js, or Go code |
| Webhook | webhook | HTTP request (GET/POST/PUT/PATCH/DELETE) |
| Conditional | conditional | IF/ELSE branching based on expressions |
| Transform | transform | Data manipulation via JS/Python expression |
| Fan-Out | fan-out | Splits a list into parallel executions |
| Merge | merge | Combines results from parallel branches |
| Delay | delay | Waits a specified duration before continuing |
| Validate | validate | LLM quality check on a previous node's output |
| Approval Gate | human-in-the-loop | Pauses for human approval |
Conditional Expressions
The conditional node evaluates expressions against workflow data:
- `node_id.field == "value"` -- equality
- `node_id.field != "value"` -- inequality
- `node_id.field > 100` -- numeric comparison
- `node_id.field contains "text"` -- substring check
- `node_id.field exists` -- field is not null
- `node_id.field is_empty` -- field is null or empty
Routes to true or false output ports based on the result.
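A toy evaluator for a few of these operators (illustrative only; the real expression engine is internal to Aura):

```python
def evaluate(data: dict, node_id: str, field: str, op: str, value=None) -> bool:
    """Toy version of the conditional node's operators (not the real engine)."""
    actual = data.get(node_id, {}).get(field)
    if op == "==":
        return actual == value
    if op == "!=":
        return actual != value
    if op == ">":
        return actual is not None and actual > value
    if op == "contains":
        return isinstance(actual, str) and value in actual
    if op == "exists":
        return actual is not None
    if op == "is_empty":
        return actual is None or actual == ""
    raise ValueError(f"unknown operator: {op}")

# Example: data produced by a hypothetical upstream node named "fetch_orders"
data = {"fetch_orders": {"status": "paid", "total": 250}}
print(evaluate(data, "fetch_orders", "total", ">", 100))  # routes to the true port
```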
Connecting to Triggers
Scheduled execution -- use `platform_create_schedule` with prompt conventions:
- Regular agent: set the prompt normally
- Team: set prompt to `[team:TEAM_ID] Your message here`
- Workflow: set prompt to `[workflow:WORKFLOW_ID]`
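For illustration, the prompt strings a schedule would carry under these conventions (the IDs below are placeholders):

```python
# Placeholder IDs for illustration; real IDs come from platform_list_teams /
# platform_list_workflows.
team_id = "team_123"
workflow_id = "wf_456"

# Prompt conventions from this section:
team_prompt = f"[team:{team_id}] Summarize last week's git commits"
workflow_prompt = f"[workflow:{workflow_id}]"
print(team_prompt)
print(workflow_prompt)
```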
Webhook-triggered -- create a workflow with a webhook node as the entry point.
Listener-triggered -- use `platform_create_listener` for messaging platforms.
Workflow Templates
Pre-built templates are available in the `templates/` directory:
| Template | Fan-Out | Use Case |
|---|---|---|
| Marketing Campaign | Copywriter per channel | Multi-channel campaigns |
| Competitive Analysis | Researcher per competitor | Market research |
| Course Creator | Lesson Writer per lesson | Educational content |
| Data Analysis | Collector per source | Analytics and reporting |
| Translation | Translator per language | Localization |
| Code Migration | Migrator per module | Codebase conversion |
| Proposal Writer | Section Writer per section | RFP responses |
Import any template via Settings > Workflows > Import.
Platform Tools Reference (49 tools)
The agent has full CRUD operations for all platform resources:
Listeners (8 tools)
| Tool | Description |
|---|---|
| `platform_create_listener` | Create a messaging listener |
| `platform_list_listeners` | List all listeners |
| `platform_get_listener` | Get listener details and status |
| `platform_edit_listener` | Edit a listener's configuration |
| `platform_delete_listener` | Delete a listener |
| `platform_start_listener` | Start a listener |
| `platform_stop_listener` | Stop a listener |
| `platform_get_listener_status` | Get listener status and QR data |
Webhooks (7 tools)
| Tool | Description |
|---|---|
| `platform_create_webhook` | Create a webhook endpoint |
| `platform_list_webhooks` | List all webhooks |
| `platform_get_webhook` | Get webhook details |
| `platform_edit_webhook` | Edit a webhook's configuration |
| `platform_delete_webhook` | Delete a webhook |
| `platform_start_webhook` | Start a webhook |
| `platform_stop_webhook` | Stop a webhook |
Schedules (7 tools)
| Tool | Description |
|---|---|
| `platform_create_schedule` | Create a scheduled task |
| `platform_list_schedules` | List all scheduled tasks |
| `platform_get_schedule` | Get schedule details |
| `platform_edit_schedule` | Edit a scheduled task |
| `platform_delete_schedule` | Delete a scheduled task |
| `platform_start_schedule` | Start a schedule |
| `platform_stop_schedule` | Stop a schedule |
Skills (5 tools)
| Tool | Description |
|---|---|
| `platform_create_skill` | Create a new skill |
| `platform_list_skills` | List installed skills |
| `platform_get_skill` | Get skill details |
| `platform_edit_skill` | Edit a skill's `SKILL.md` content |
| `platform_delete_skill` | Delete a skill |
MCP Servers (5 tools)
| Tool | Description |
|---|---|
| `platform_connect_mcp` | Connect to an MCP server |
| `platform_list_mcp` | List connected MCP servers |
| `platform_get_mcp` | Get MCP server details |
| `platform_edit_mcp` | Edit an MCP server configuration |
| `platform_disconnect_mcp` | Disconnect an MCP server |
Teams (6 tools)
| Tool | Description |
|---|---|
| `platform_create_team` | Create a multi-agent team with roles and optional fan-out |
| `platform_list_teams` | List all teams |
| `platform_get_team` | Get team details |
| `platform_edit_team` | Edit a team's configuration |
| `platform_delete_team` | Delete a team |
| `platform_run_team_task` | Execute a team's workflow (blocks until complete, returns results) |
Automation Workflows (4 tools)
| Tool | Description |
|---|---|
| `platform_create_workflow` | Create an automation workflow with nodes, edges, and triggers |
| `platform_list_workflows` | List all automation workflows |
| `platform_run_workflow` | Execute a workflow (blocks until complete) |
| `platform_delete_workflow` | Delete a workflow |
Credentials (5 tools)
| Tool | Description |
|---|---|
| `platform_store_credential` | Store a credential securely |
| `platform_list_credentials` | List stored credential names and types (no secrets) |
| `platform_get_credential` | Retrieve a decrypted credential by name |
| `platform_edit_credential` | Edit a stored credential |
| `platform_delete_credential` | Delete a credential |
Settings (2 tools)
| Tool | Description |
|---|---|
| `platform_get_settings` | View current settings (no API keys) |
| `platform_update_settings` | Update settings |
Security Notes
- `platform_get_settings` never returns API keys -- only model, provider, base URL, and non-sensitive configuration.
- `platform_get_credential` returns decrypted values to the agent's internal context, but the credential-store skill instructs the agent never to display them in chat. Values are used only in tool calls (bash environment variables, config files, etc.).
- `platform_store_credential` encrypts values with AES-256-GCM before writing to the database. The encryption key is stored in the system keychain (macOS Keychain, Windows Credential Manager).
Skills System
Skills are structured instruction sets that guide the agent when performing specific types of tasks.
How Skills Work
- Skills are stored in the skills directory: `~/Library/Application Support/aura-workshop/skills/` (macOS).
- Each skill is a folder containing a `SKILL.md` file with YAML frontmatter (name, description) and markdown instructions.
- When the agent starts, all available skills are listed in the system prompt.
- When a user request matches a skill, the agent reads the skill's `SKILL.md` file and follows its instructions.
Bundled Skills
Document and creative skills:
- pdf, docx, xlsx, pptx -- Office document processing
- algorithmic-art, canvas-design, frontend-design -- Visual design
- brand-guidelines, internal-comms, doc-coauthoring -- Business documents
- mcp-builder, skill-creator, slack-gif-creator -- Developer tools
- theme-factory, web-artifacts-builder, webapp-testing -- Web development
Development workflow skills (superpowers):
- brainstorming, writing-plans, executing-plans
- test-driven-development, systematic-debugging
- requesting-code-review, receiving-code-review
- finishing-a-development-branch, using-git-worktrees
- dispatching-parallel-agents, subagent-driven-development
- verification-before-completion, using-superpowers, writing-skills
Platform integration:
- credential-store -- teaches the agent to detect, store, and retrieve credentials securely
Editing Skills
Each skill in Settings > Skills has an Edit button. Clicking it opens a modal editor for the skill's SKILL.md file, where you can modify the YAML frontmatter (name, description) and the markdown instructions. Changes take effect on the next agent task.
The agent can also edit skills programmatically using the platform_edit_skill tool.
Adding Custom Skills
- Click "+ Add Skill" in the Skills panel to import a skill folder from your computer.
- Alternatively, create a folder in the skills directory with a `SKILL.md` file.
- The `SKILL.md` must have YAML frontmatter with `name` and `description` fields.
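A minimal SKILL.md skeleton that satisfies these requirements (the name and content are illustrative):

```markdown
---
name: my-custom-skill
description: When and how the agent should use this skill.
---

# My Custom Skill

Step-by-step instructions the agent follows when this skill matches a request.
```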
Using MCP Servers
The Model Context Protocol (MCP) allows Aura Workshop to connect to external tool servers, extending the agent's capabilities beyond built-in tools.
Adding an MCP Server
- Open Settings and navigate to the MCP section.
- Click Add Server.
- Choose a transport type:
  - HTTP: provide the server URL (e.g., `http://localhost:3000/mcp`).
  - stdio: provide the command and arguments to spawn the server process (e.g., command `npx`, args `@playwright/mcp`).
- Optionally configure OAuth credentials if the server requires authentication.
- Toggle the server to enabled.
- The app connects and discovers available tools. These tools appear in the `/tools` listing with an `mcp_` prefix.
Playwright Browser Automation
Playwright MCP is a popular stdio-based MCP server for browser automation.
- Add an MCP server with:
  - Transport: `stdio`
  - Command: `npx`
  - Args: `@playwright/mcp`
- Enable the server. The agent can now browse the web, interact with pages, take screenshots, and extract content.
- MCP tool results are truncated to 8000 characters to prevent context overflow with large accessibility trees.
Browser-Use MCP (Bundled)
Browser-Use is bundled as a backup browser automation MCP server alongside Playwright. It is auto-connected on startup and provides an alternative approach to web interaction using a Python-based browser agent.
- No setup required -- the Browser-Use Python virtual environment and all dependencies (uv/uvx) are bundled with the installer.
- Browser-Use tools appear automatically in the `/tools` listing alongside Playwright tools.
- The agent can use either Playwright or Browser-Use depending on the task requirements.
Custom MCP Servers
Any server implementing the MCP protocol can be added. Servers can expose tools with custom schemas, and the agent will see them as additional callable tools. Tool names are formatted as `mcp_{server_id}_{tool_name}` with hyphens and colons replaced by underscores.
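The naming rule can be expressed directly; this helper simply restates the rule from this section:

```python
def mcp_tool_name(server_id: str, tool_name: str) -> str:
    """Apply the naming rule: mcp_{server_id}_{tool_name}, '-'/':' -> '_'."""
    raw = f"mcp_{server_id}_{tool_name}"
    return raw.replace("-", "_").replace(":", "_")

print(mcp_tool_name("playwright", "browser-navigate"))  # mcp_playwright_browser_navigate
```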
Cloud Storage
Aura Workshop can connect to popular cloud storage services, allowing agents to read from and write to your cloud files as part of any task.
Supported Providers
- Dropbox
- Box
- OneDrive
- Google Drive
Connecting a Provider
- Open Settings and navigate to the Cloud Storage tab.
- Click Connect next to the provider you want to add.
- An OAuth2 authorization flow opens in your default browser. Approve access when prompted.
- A local OAuth callback server on port 18793 receives the authorization token and completes the connection.
- The provider appears as connected in the Cloud Storage settings.
Agent Access
Once a provider is connected, agents can read files from and write files to that cloud storage during tasks. For example, you can ask the agent to "download the quarterly report from my Google Drive and summarize it" or "upload this PDF to my Dropbox."
Access to cloud storage is gated behind biometric authentication (Touch ID on macOS, Windows Hello on Windows) to prevent unauthorized use. See the Biometric Authentication section below for details.
Chrome Extension
The Aura Workshop Chrome Extension brings your AI agent into any browser tab. Chat with your agent, ask about the page you're viewing, summarize content, or use selected text -- all from a side panel.
Prerequisites
- Aura Workshop desktop app installed and running
- A WebChat listener created and started (Settings > Listeners > + > WebChat)
- The WebChat listener's "Agent Tools" toggle set to your preference (on = full agent, off = chat only)
Installation
- Download `aura-workshop-chrome-extension.zip` from the release page
- Unzip the file to a permanent location (e.g., `~/aura-chrome-extension/`)
- Open Chrome and navigate to `chrome://extensions`
- Enable Developer mode (toggle in the top-right corner)
- Click Load unpacked
- Select the unzipped extension folder
- The Aura Workshop icon appears in your Chrome toolbar
Usage
Click the extension icon to open the side panel. The panel connects to your running Aura Workshop instance via WebSocket on the WebChat listener port (default 18792).
Keyboard shortcut: Press Ctrl+Shift+Y (Windows/Linux) or Cmd+Shift+Y (macOS) to quickly toggle the side panel open or closed.
Status indicator: The header shows a green dot when connected, red when disconnected. The current model name is displayed next to the status dot.
Chat: Type a message and press Enter or click Send. The agent processes your request using all available tools (file operations, web fetch, bash commands, MCP tools, credentials) and returns the response.
Tab-aware conversations: Each browser tab maintains its own conversation context. Switching tabs automatically switches to that tab's conversation, so you can have independent chats running on different pages.
Quick Actions
| Button | What it does |
|---|---|
| Ask about page | Extracts the current page's title, URL, and text content (up to 3000 chars), sends it to the agent with a prompt to help understand the page |
| Summarize | Sends the page content with a summarization prompt |
| Use selection | Sends your highlighted text selection from the page to the agent |
| Stop | Stops the current response generation mid-stream |
| Task | Converts the current chat conversation into a full agent task in the desktop app |
| New chat | Clears the conversation and starts fresh |
File attachments: You can drag and drop files onto the input area or paste screenshots. Attached files are shown as thumbnails and sent to the agent as base64 data.
Troubleshooting
| Issue | Solution |
|---|---|
| Red status dot / "Disconnected" | Make sure Aura Workshop is running and the WebChat listener is started |
| "Connection failed" | Check that the WebChat listener port (default 18792) isn't blocked by a firewall |
| Agent not responding | Verify your model and API key are configured in Aura Workshop Settings |
| Quick actions don't extract content | Some pages (PDFs, iframes, cross-origin) may block content extraction -- use copy-paste instead |
Privacy
The extension communicates only with your local Aura Workshop instance (localhost:18792). No data is sent to external servers by the extension itself. Page content is only extracted when you click a quick action button -- it is not automatically collected.
Embeddable Chat Widget
Aura Workshop provides an embeddable chat widget that you can add to any website, enabling visitors to interact with your configured agent directly from a web page.
Setup
- Create a new Listener in Settings with the platform type set to WebChat.
- Start the listener. It launches a WebSocket server for real-time communication.
- Copy the provided JavaScript snippet from the listener configuration panel.
- Paste the snippet into your website's HTML.
Features
- WebSocket-based -- Real-time, bidirectional communication between the widget and Aura Workshop.
- Configurable appearance -- Customize colors, position, title text, and other visual properties to match your website's design.
- Agent tools support -- The widget can optionally use the full agent toolset, controlled by the "Agent Tools" toggle on the listener.
Embedding
The JavaScript snippet creates a floating chat button on your page. When clicked, it opens a chat panel that connects to your running Aura Workshop instance. The snippet handles the WebSocket connection, message rendering, and UI automatically.
Custom Provider Manager
The Custom Provider Manager lets you save and quickly switch between custom LLM provider configurations.
Adding a Custom Provider
- Open Settings and navigate to the provider configuration area.
- Click Add Custom Provider (or use the custom provider manager UI).
- Fill in the configuration:
- Name -- A display name for this provider (e.g., "My vLLM Server").
- Base URL -- The endpoint URL (e.g., http://192.168.1.100:8000/v1).
- Model ID -- The model identifier the server expects.
- API Key -- Optional, depending on the server's auth requirements.
- Save the configuration.
Quick Switching
Saved custom providers appear in the model selector dropdown alongside built-in providers. Select one to instantly switch to that provider's configuration -- no need to re-enter URLs or keys each time.
Compatibility
The Custom Provider Manager supports any OpenAI-compatible endpoint. This includes vLLM, TGI, SGLang, LocalAI, LiteLLM, and any other server that implements the OpenAI chat completions API.
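Before saving a configuration, you can sanity-check the endpoint from a terminal. This is a sketch with placeholder values -- BASE_URL and MODEL are assumptions, so substitute your own server's address and model ID:

```shell
# Sanity-check an OpenAI-compatible endpoint before saving it as a custom
# provider. BASE_URL and MODEL are placeholders for your own server.
BASE_URL="http://192.168.1.100:8000/v1"
MODEL="my-model"

# Most OpenAI-compatible servers also implement GET /v1/models.
curl -s "$BASE_URL/models"

# One-shot completion to confirm the model actually responds.
curl -s "$BASE_URL/chat/completions" \
  -H "Content-Type: application/json" \
  -d "{\"model\":\"$MODEL\",\"messages\":[{\"role\":\"user\",\"content\":\"ping\"}],\"max_tokens\":8}"
```

If both calls return JSON rather than connection errors, the endpoint is ready to be saved as a custom provider.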
Billing & Spend Tracking
Usage Dashboard
Open Settings > Billing to see:
- Usage Summary -- Table showing today/month/all-time cost per provider with token counts
- Expenses Chart -- Full-width bar chart of daily spend (amber bars)
- Per-Model Charts -- For each model used: API requests area chart and tokens area chart
- Reset Usage button to clear all tracking data
Spend Limits
Set daily and monthly spending caps per configured provider:
- Go to Settings > Billing > Spend Limits
- Enter a daily limit (e.g., $5.00) and/or monthly limit (e.g., $50.00)
- When a provider hits its limit, Aura automatically switches to the next available provider in your fallback list
Provider Fallback Order
Configure a priority-ordered list of backup providers:
- Go to Settings > Billing > Provider Fallback Order
- Add providers in priority order (e.g., DeepSeek first, then Anthropic, then Ollama as local fallback)
- When the primary provider hits its spend limit, Aura switches mid-session with a notification
- Local models (Aura AI, Ollama) serve as zero-cost final fallbacks
Model Pricing
Edit the per-model pricing used for cost calculations:
- Go to Settings > Billing > Model Pricing
- Adjust input/output prices per million tokens for any model
- Prices are used to calculate your spend -- keep them updated with your provider's current rates
How Token Tracking Works
- Cloud APIs with exact counts: Anthropic responses and non-streaming OpenAI-compatible responses include exact token usage in the API response
- Streaming APIs (estimated): DeepSeek, OpenAI streaming, and other SSE-based APIs don't return exact token counts. Aura estimates from text length (~3.5 characters per token)
- Chat messages: Estimated from conversation length plus system prompt overhead
- All estimates are slightly conservative (over-count rather than under-count)
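The ~3.5 characters-per-token heuristic is easy to reproduce. The sketch below uses integer ceiling arithmetic so the estimate over-counts rather than under-counts, matching the conservative behavior described above; Aura's exact rounding may differ:

```shell
# Estimate a token count from text length using the ~3.5 chars/token
# heuristic (multiply by 2, divide by 7 to stay in integer arithmetic).
# The +6 makes this a ceiling division, so it over-counts, never under.
estimate_tokens() {
  chars=$(printf '%s' "$1" | wc -c)
  echo $(( (chars * 2 + 6) / 7 ))
}

estimate_tokens "Hello, world!"   # 13 chars -> prints 4
```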
Settings Overview
Settings opens as a full-window panel, hiding the sidebar and task panel to give you an uncluttered view of all configuration options. Click the back arrow or press Escape to return to the main workspace.
| Setting | Description | Default |
|---|---|---|
| API Key | Provider-specific API key (encrypted in database) | Empty |
| Model | LLM model identifier | (none -- must be selected) |
| Base URL | API endpoint URL | (depends on provider) |
| Max Tokens | Maximum output tokens per response | 4096 |
| Temperature | Response randomness (0.0 - 1.0) | 0.7 |
| Top P | Nucleus sampling threshold (0.0 - 1.0) | 1.0 |
| Top K | Top-K sampling limit (0 = disabled) | 0 |
| Min P | Minimum probability threshold (0.0 - 1.0) | 0.0 |
| Repeat Penalty | Penalty for token repetition (1.0 = no penalty) | 1.0 |
| Native Mode | Run commands on host instead of Docker | Auto-detected |
| Thinking Level | Reasoning depth: off, low, medium, high | off |
| Elevated Bash | Allow elevated/sudo commands | false |
| Screen Capture | Enable the screen_capture tool | false |
| Camera Capture | Enable the camera_capture tool | false |
| System Notifications | Enable the system_notify tool | true |
| Voice Enabled | Enable text-to-speech for responses | false |
| TTS Provider | Voice synthesis: system, openai, elevenlabs | system |
| Selected Voice | Voice name/ID for TTS | (system default) |
| Custom Providers | Saved custom LLM provider configurations | (none) |
| Cloud Storage | Connected cloud storage providers (Dropbox, Box, OneDrive, Google Drive) | (none) |
| Biometric Auth | Require biometric authentication for sensitive settings | Enabled |
Sampling Parameters
- Temperature: Controls randomness. Lower values (closer to 0.0) produce more deterministic output; higher values (closer to 1.0) increase creativity.
- Top P: Nucleus sampling. The model considers only tokens whose cumulative probability reaches this threshold. Lower values narrow the output distribution.
- Top K: Limits the model to considering only the top K most probable tokens. Set to 0 to disable.
- Min P: Filters out tokens with probability below this threshold relative to the most likely token.
- Repeat Penalty: Penalizes tokens that have already appeared. Values above 1.0 discourage repetition; 1.0 applies no penalty.
Not all parameters are supported by every provider. Unsupported parameters are silently ignored by the backend.
Provider-specific keys are stored in a provider_keys JSON map, allowing you to have API keys saved for multiple providers simultaneously. OpenAI users can optionally configure Organization ID and Project ID.
Biometric Authentication
How It Works
- macOS: Touch ID
- Windows: Windows Hello
Biometric authentication is required to access the Credentials and Cloud Storage settings tabs. When you navigate to either of these tabs, a biometric prompt appears. Once you authenticate successfully, access is granted for the remainder of the session -- you do not need to authenticate again until you restart the app.
The experience is transparent and lightweight: a single Touch ID or Windows Hello prompt is all that stands between you and the protected settings.
Aura Workshop REST API
Aura Workshop includes an embedded HTTP server (default port 18800) that exposes the full platform as a REST API. Every feature available in the desktop app is also available via HTTP. All endpoints are prefixed with /api.
- Base URL: http://localhost:18800/api
- Content-Type: application/json for all POST/PUT requests
- Authentication: Optional Bearer token (configured in Settings > Web Server)
Tasks
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/tasks | List all tasks |
| POST | /api/tasks | Create a new task |
| GET | /api/tasks/{id} | Get task details |
| DELETE | /api/tasks/{id} | Delete a task |
| GET | /api/tasks/{id}/messages | Get task conversation messages |
| GET | /api/tasks/interrupted | List interrupted tasks for resume |
| POST | /api/tasks/{id}/run | Run a task agent (SSE stream) |
| POST | /api/tasks/{id}/resume | Resume an interrupted task (SSE stream) |
Example: Create and run a task
# Create task
curl -X POST http://localhost:18800/api/tasks \
-H "Content-Type: application/json" \
-d '{"title":"Build API","description":"Build a REST API","prompt":"Build a REST API"}'
# Run it (returns SSE stream)
curl -N -X POST http://localhost:18800/api/tasks/{task_id}/run \
-H "Content-Type: application/json" \
-d '{"task_id":"...","message":"Build a REST API for a todo app","project_path":"/path/to/folder"}'
Conversations (Chat)
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/conversations | List all conversations |
| POST | /api/conversations | Create a new conversation |
| DELETE | /api/conversations/{id} | Delete a conversation |
| PUT | /api/conversations/{id}/title | Update conversation title |
| GET | /api/conversations/{id}/messages | Get conversation messages |
| POST | /api/conversations/{id}/messages | Add a message |
| POST | /api/chat/send | Send chat message (SSE stream) |
| POST | /api/chat/enhanced | Send chat with tools (SSE stream) |
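For example, a chat message can be sent and its reply streamed with curl. The body field names here (conversation_id, message) are assumptions inferred from the endpoint list -- check your instance's actual schema:

```shell
# Send a chat message and stream the SSE reply. -N disables curl's output
# buffering so chunks print as they arrive. The body fields are assumptions.
PAYLOAD='{"conversation_id":"<conversation-id>","message":"Hello!"}'

curl -N -X POST http://localhost:18800/api/chat/send \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD"
```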
Teams
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/teams | List all teams |
| POST | /api/teams | Create a team |
| PUT | /api/teams/{id} | Update a team |
| DELETE | /api/teams/{id} | Delete a team |
| POST | /api/teams/run | Run a team task |
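A team task might be triggered as in the sketch below; the team_id and prompt field names are assumptions inferred from the endpoint name, and the id placeholder should come from a GET /api/teams listing:

```shell
# List teams first to find a real team id.
curl http://localhost:18800/api/teams

# Run a team task. Body field names (team_id, prompt) are assumptions.
PAYLOAD='{"team_id":"<team-id>","prompt":"Draft a release plan"}'
curl -X POST http://localhost:18800/api/teams/run \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD"
```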
Automation Workflows
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/workflows | List all workflows |
| POST | /api/workflows | Create a workflow |
| GET | /api/workflows/{id} | Get workflow details |
| PUT | /api/workflows/{id} | Update a workflow |
| DELETE | /api/workflows/{id} | Delete a workflow |
| POST | /api/workflows/{id}/run | Execute a workflow |
| GET | /api/workflow/runs/{run_id} | Get workflow run status |
| POST | /api/workflow/approvals/{id}/resolve | Resolve a human-in-the-loop approval |
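A typical run-then-poll sequence might look like this sketch; the id values are placeholders, and the assumption that the run call returns a run id in its JSON response should be verified against your instance:

```shell
# Placeholders: take a real workflow id from GET /api/workflows; the run id
# is assumed to come back in the run call's JSON response.
WORKFLOW_ID="<workflow-id>"
RUN_ID="<run-id>"

# Execute the workflow.
curl -X POST "http://localhost:18800/api/workflows/$WORKFLOW_ID/run" \
  -H "Content-Type: application/json" -d '{}'

# Poll the run until its status reports completion.
curl "http://localhost:18800/api/workflow/runs/$RUN_ID"
```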
Schedules
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/schedules | List all scheduled tasks |
| POST | /api/schedules | Create a schedule |
| DELETE | /api/schedules/{id} | Delete a schedule |
| POST | /api/schedules/{id}/toggle | Enable/disable a schedule |
Listeners
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/listeners | List all listeners |
| POST | /api/listeners | Create a listener |
| PUT | /api/listeners/{id} | Update a listener |
| DELETE | /api/listeners/{id} | Delete a listener |
| POST | /api/listeners/{id}/start | Start a listener |
| POST | /api/listeners/{id}/stop | Stop a listener |
| POST | /api/listeners/{id}/toggle | Toggle listener on/off |
| GET | /api/listeners/{id}/logs | Get listener execution logs |
| GET | /api/listeners/platforms | Get supported messaging platforms |
Webhooks
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/webhooks | List all webhooks |
| POST | /api/webhooks | Create a webhook |
| DELETE | /api/webhooks/{id} | Delete a webhook |
| POST | /api/webhooks/{id}/toggle | Enable/disable a webhook |
| GET | /api/webhooks/{id}/url | Get the webhook's trigger URL |
| GET | /api/webhooks/{id}/logs | Get webhook execution logs |
Billing & Usage
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/billing/summary | Get spend summary per provider |
| GET | /api/billing/limits | Get all spend limits |
| POST | /api/billing/limits | Set a spend limit for a provider |
| GET | /api/billing/fallback-order | Get provider fallback priority |
| POST | /api/billing/fallback-order | Set provider fallback priority |
| GET | /api/billing/pricing | Get model pricing table |
| POST | /api/billing/pricing | Update model pricing |
| POST | /api/billing/reset | Reset all usage data |
| GET | /api/billing/daily | Get daily usage (last 30 days) |
| GET | /api/billing/daily-by-model | Get daily usage per model |
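As a sketch, a daily/monthly cap might be set like this; the field names (provider, daily_limit, monthly_limit) are assumptions inferred from the Spend Limits UI, so verify them against your instance:

```shell
# Set a $5 daily / $50 monthly cap for one provider, then read limits back.
# The body field names are assumptions inferred from the Spend Limits UI.
PAYLOAD='{"provider":"openai","daily_limit":5.00,"monthly_limit":50.00}'

curl -X POST http://localhost:18800/api/billing/limits \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD"

# Confirm the limit was stored.
curl http://localhost:18800/api/billing/limits
```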
Settings & Data Management
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/settings | Get all settings |
| PUT | /api/settings | Update settings |
| POST | /api/settings/test | Test provider connection |
| GET | /api/platform | Get platform info (OS, version) |
| GET | /api/diagnostics | Run system diagnostics |
| POST | /api/inference/stop | Stop running inference (optional task_id in body) |
| POST | /api/data/clear-history | Clear all conversation history |
| POST | /api/data/reset-keys | Reset all API keys |
| POST | /api/data/reset-database | Reset entire database |
| POST | /api/data/reset-all | Factory reset |
Streaming (Server-Sent Events)
Several endpoints return SSE streams for real-time updates. Connect with EventSource or curl -N.
| Endpoint | Description |
|---|---|
| POST /api/tasks/{id}/run | Agent task execution — streams text, tool calls, plan steps, done/error |
| POST /api/tasks/{id}/resume | Resume interrupted task — same event types as run |
| POST /api/chat/send | Chat message — streams text chunks + done |
| POST /api/chat/enhanced | Chat with tools — streams text + tool events + done |
| GET /api/events | Global event stream — receives ALL workflow events from any client (desktop or web) |
SSE Event Types:
{"type":"text","content":"Hello world..."} // Streaming text
{"type":"tool_start","tool":"write_file","input":{}} // Tool execution started
{"type":"tool_end","tool":"write_file","success":true} // Tool completed
{"type":"node_running","node_id":"role_0","label":"PM"} // Workflow node status
{"type":"done","total_turns":5} // Task completed
{"type":"error","message":"..."} // Task failed
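The global stream at /api/events can be tailed directly with curl:

```shell
# Tail the global event stream. -N disables curl's buffering so each
# "data: {...}" line prints as soon as the server emits it.
curl -N http://localhost:18800/api/events

# Or watch only error events:
curl -N http://localhost:18800/api/events | grep --line-buffered '"type":"error"'
```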
Authentication
API authentication is optional. When a token is configured in Settings > Web Server, include it as a Bearer token:
curl -H "Authorization: Bearer YOUR_TOKEN" http://localhost:18800/api/tasks
If no token is configured, all API requests are allowed without authentication (suitable for local-only access).
Check authentication status:
GET /api/auth/check
# Returns: {"authenticated": true}
OpenAI API Compatibility
Aura Workshop can connect to any OpenAI-compatible API endpoint. This includes OpenAI itself, DeepSeek, Moonshot/Kimi, vLLM, TGI, SGLang, LocalAI, LiteLLM, and any server implementing the chat completions format.
Configuration
In Settings, select a provider or enter a custom base URL:
| Provider | Base URL | Auth Header |
|---|---|---|
| OpenAI | https://api.openai.com/v1 | Authorization: Bearer sk-... |
| DeepSeek | https://api.deepseek.com | Authorization: Bearer sk-... |
| Moonshot | https://api.moonshot.cn/v1 | Authorization: Bearer sk-... |
| Custom / Self-hosted | http://your-server:8000/v1 | Optional |
Supported Endpoints
POST /v1/chat/completions # Standard chat completions
POST /chat/completions # Also accepted (without /v1 prefix)
Request Format
{
"model": "deepseek-chat",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"}
],
"max_tokens": 4096,
"temperature": 0.7,
"stream": true,
"tools": [...] // Optional: function calling
}
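The request above can also be sent as a raw curl call. This sketch targets DeepSeek using the base URL and auth header from the provider table; sk-... stands in for your key:

```shell
# Streaming chat completion against an OpenAI-compatible endpoint.
# Base URL and auth header match the DeepSeek row in the provider table;
# sk-... is a placeholder for your real API key.
curl -N https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-..." \
  -d '{"model":"deepseek-chat","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello!"}],"max_tokens":4096,"temperature":0.7,"stream":true}'
```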
Streaming Response
data: {"choices":[{"delta":{"content":"Hello"},"index":0}]}
data: {"choices":[{"delta":{"content":" there"},"index":0}]}
data: {"choices":[{"delta":{"tool_calls":[...]},"index":0}]}
data: [DONE]
Anthropic API
Aura Workshop natively supports the Anthropic Messages API for Claude models.
Configuration
| Setting | Value |
|---|---|
| Base URL | https://api.anthropic.com |
| Auth Header | x-api-key: sk-ant-... |
| API Version | anthropic-version: 2023-06-01 |
Supported Models
- claude-opus-4-20250514 — Most capable, highest cost
- claude-sonnet-4-20250514 — Best balance of capability and cost
- claude-haiku-4-20250514 — Fastest, lowest cost
Request Format
{
"model": "claude-sonnet-4-20250514",
"max_tokens": 4096,
"messages": [
{"role": "user", "content": "Hello!"}
],
"system": "You are a helpful assistant.",
"tools": [...] // Optional: tool use
}
Response Format
{
"content": [{"type": "text", "text": "Hello! How can I help?"}],
"model": "claude-sonnet-4-20250514",
"usage": {
"input_tokens": 12,
"output_tokens": 8
}
}
Anthropic responses include exact token counts in the usage field, which Aura uses for precise billing tracking.
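The request/response pair above corresponds to a curl call like this one, using the headers from the configuration table (replace sk-ant-... with your key):

```shell
# Anthropic Messages API call with the headers from the configuration table.
# sk-ant-... is a placeholder for your real key.
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: sk-ant-..." \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-sonnet-4-20250514","max_tokens":4096,"system":"You are a helpful assistant.","messages":[{"role":"user","content":"Hello!"}]}'
```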
Ollama API
Aura Workshop connects to Ollama via its OpenAI-compatible endpoint. Run any local model through Ollama and Aura treats it like any other provider.
Configuration
| Setting | Value |
|---|---|
| Base URL | http://localhost:11434/v1 |
| API Key | Leave empty |
| Model | Any Ollama model name (e.g., llama3.1, qwen2.5, deepseek-r1) |
Setup
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull a model
ollama pull llama3.1
# Ollama runs automatically on localhost:11434
# Aura Workshop auto-detects it
Supported Models
Any model available in the Ollama registry works. Popular choices:
- llama3.1 / llama3.1:70b — Meta's Llama 3.1
- qwen2.5 / qwen2.5:32b — Alibaba's Qwen
- deepseek-r1 — DeepSeek reasoning model
- codellama — Code-specialized model
- mistral / mixtral — Mistral AI models
- phi3 — Microsoft's small but capable model
Ollama models run entirely locally with zero API costs. They serve as the final fallback in the provider fallback chain when all cloud provider spend limits are reached.
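Before selecting Ollama in Aura, you can confirm the OpenAI-compatible endpoint is reachable; llama3.1 here stands in for whichever model you pulled:

```shell
# Verify Ollama's OpenAI-compatible endpoint is up. No API key is required
# for a local Ollama server.
curl -s http://localhost:11434/v1/models

# One-shot completion against a pulled model (substitute your own model name).
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3.1","messages":[{"role":"user","content":"Hello!"}]}'
```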