Tools Overview

Moltbot ships with a set of built-in tools that give your agents real capabilities: controlling browsers, running shell commands, searching the web, and managing sessions. Unlike the older moltbot-* skills, which required shelling out, these are first-class typed tools that work directly through the agent runtime.

What’s available

The core toolset includes:

  • Browser automation: Drive Chrome/Brave with snapshots, clicks, and screenshots
  • Shell execution: Run commands with background support and approval gates
  • Web access: Search (Brave/Perplexity) and fetch pages with content extraction
  • Session management: List, inspect, and send messages across agent sessions
  • Node control: Target paired macOS/iOS devices for notifications, camera, and screen capture
  • Scheduling: Manage cron jobs and wakeup events
  • Gateway operations: Restart the gateway or apply config updates in-place
  • Canvas: Drive the visual canvas (macOS node only)
  • Image analysis: Analyze images with your configured vision model

Plus optional plugin tools like Lobster (workflow engine) and LLM Task (structured JSON outputs).

Controlling access

Global allow/deny

The simplest way to restrict tools is with tools.allow and tools.deny in your config. Deny always wins.

{ "tools": { "deny": ["browser", "exec"] } }

You can use * wildcards—"*" means all tools. Matching is case-insensitive.
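The matching rules above (wildcards, case-insensitivity, deny wins) can be sketched in a few lines of Python. This is only an illustration, not Moltbot's implementation: the function name `is_tool_allowed` is hypothetical, an empty allowlist is assumed to mean "allow everything", and `fnmatchcase` also accepts `?` and `[...]` patterns that the docs do not mention.

```python
from fnmatch import fnmatchcase


def is_tool_allowed(tool, allow=None, deny=None):
    """Return True if `tool` passes the allow/deny lists (deny always wins)."""
    name = tool.lower()

    def matches(patterns):
        # Lowercase both sides so matching is case-insensitive.
        return any(fnmatchcase(name, pattern.lower()) for pattern in patterns)

    if deny and matches(deny):
        return False
    # Assumption: a missing or empty allowlist places no restriction.
    if not allow:
        return True
    return matches(allow)


print(is_tool_allowed("exec", deny=["browser", "exec"]))
print(is_tool_allowed("web_search", deny=["browser", "exec"]))
```

Checking deny first is what makes "deny always wins" hold even when the allowlist is `"*"`.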

Tool profiles

Instead of listing every tool, use a profile as your baseline:

  • minimal: Only session_status (read-only session info)
  • coding: File tools + runtime + sessions + memory + image analysis
  • messaging: Message sending + basic session tools
  • full: Everything (same as not setting a profile)

Override per agent with agents.list[].tools.profile.

Example—messaging by default, but allow Slack and Discord tools too:

{ "tools": { "profile": "messaging", "allow": ["slack", "discord"] } }

Example: the coding profile everywhere, with shell execution (group:runtime) denied:

{ "tools": { "profile": "coding", "deny": ["group:runtime"] } }

Per-provider restrictions

Sometimes you want tighter controls for specific model providers. Use tools.byProvider to narrow the tool set based on the provider or even a specific model.

This applies after the base profile but before your global allow/deny lists.

{ "tools": { "profile": "coding", "byProvider": { "google-antigravity": { "profile": "minimal" } } } }

You can even target a single model:

{ "tools": { "byProvider": { "openai/gpt-5.2": { "allow": ["group:fs", "sessions_list"] } } } }
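As a rough sketch of how a byProvider entry might be selected, the helper below looks up a "provider/model" key before falling back to the bare provider key. The function name `provider_policy` and the "most specific key first" ordering are assumptions made for illustration; the source only says that both kinds of key are accepted.

```python
def provider_policy(by_provider, provider, model):
    """Pick the byProvider entry for a provider/model pair, or None."""
    # Assumption: a "provider/model" key is more specific, so it is tried first.
    for key in (f"{provider}/{model}", provider):
        if key in by_provider:
            return by_provider[key]
    return None


by_provider = {
    "google-antigravity": {"profile": "minimal"},
    "openai/gpt-5.2": {"allow": ["group:fs", "sessions_list"]},
}
print(provider_policy(by_provider, "openai", "gpt-5.2"))
```

Whatever entry is found would then narrow the base profile before the global allow/deny lists are applied, as described above.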

Tool groups (shortcuts)

Instead of listing tools one by one, use groups:

  • group:runtime → exec, bash, process
  • group:fs → read, write, edit, apply_patch
  • group:sessions → sessions_list, sessions_history, sessions_send, sessions_spawn, session_status
  • group:memory → memory_search, memory_get
  • group:web → web_search, web_fetch
  • group:ui → browser, canvas
  • group:automation → cron, gateway
  • group:messaging → message
  • group:nodes → nodes
  • group:moltbot → all built-in tools (excludes plugin tools)

Example:

{ "tools": { "allow": ["group:fs", "browser", "web_search"] } }
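To make the shortcut concrete, here is a minimal sketch of expanding group names into tool names. The group table is copied from the list above; the function name `expand_groups` is hypothetical, and `group:moltbot` is left out because its membership (all built-in tools) is not enumerated here.

```python
TOOL_GROUPS = {
    "group:runtime": ["exec", "bash", "process"],
    "group:fs": ["read", "write", "edit", "apply_patch"],
    "group:sessions": [
        "sessions_list", "sessions_history", "sessions_send",
        "sessions_spawn", "session_status",
    ],
    "group:memory": ["memory_search", "memory_get"],
    "group:web": ["web_search", "web_fetch"],
    "group:ui": ["browser", "canvas"],
    "group:automation": ["cron", "gateway"],
    "group:messaging": ["message"],
    "group:nodes": ["nodes"],
}


def expand_groups(entries):
    """Replace each group name with its members, keeping order and dropping repeats."""
    expanded = []
    for entry in entries:
        # Anything that is not a known group is kept as a plain tool name.
        for name in TOOL_GROUPS.get(entry.lower(), [entry]):
            if name not in expanded:
                expanded.append(name)
    return expanded


print(expand_groups(["group:fs", "browser", "web_search"]))
```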

Plugin tools

Plugins can add their own tools beyond the built-in set. See Plugins for installation and Skills for how tool instructions get injected into prompts.

Notable plugin tools:

  • Lobster: Workflow runtime with typed pipelines and approval gates
  • LLM Task: Run structured JSON-only LLM tasks (great for workflow steps)

If your allowlist only references plugin tool names that aren’t loaded, Moltbot logs a warning and keeps core tools available (so you don’t accidentally lock yourself out).
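The fallback described above could look roughly like the following. This is a simplified sketch under stated assumptions: `effective_allowlist` is a hypothetical name, returning `None` stands for "no allowlist restriction", and wildcard or group entries are not handled.

```python
import logging

logger = logging.getLogger("tool-policy-sketch")


def effective_allowlist(allow, loaded_tools):
    """Ignore an allowlist that names no currently loaded tool."""
    if not allow:
        return None
    loaded = {name.lower() for name in loaded_tools}
    if any(entry.lower() in loaded for entry in allow):
        return list(allow)
    # Nothing in the allowlist is loaded: warn and fall back to no restriction,
    # so the core tools stay available.
    logger.warning("allowlist %s matches no loaded tool; ignoring it", allow)
    return None


print(effective_allowlist(["lobster"], ["exec", "browser"]))
```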

Tool reference

exec

Run shell commands in the workspace. Supports foreground and background execution.

Key parameters:

  • command (required)
  • yieldMs (auto-background after this delay, default 10s)
  • background (start in background immediately)
  • timeout (kill if it runs longer than this, default 30 minutes)
  • host (sandbox | gateway | node)—where to execute
  • security (deny | allowlist | full)—enforcement mode
  • ask (off | on-miss | always)—approval prompts
  • elevated (run on gateway host with full security when allowed)
  • pty (run in a pseudo-terminal for interactive CLIs)

Use the process tool to poll, send input, or kill background sessions.

When sandboxed, exec runs in Docker by default. Set host=gateway or elevated=true to run on the host (requires approval gates unless you explicitly allow it).

See Exec tool and Exec approvals for the full story.
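The yieldMs behaviour (wait briefly, then hand the command off to the background) can be illustrated with a plain local subprocess. This is a concept sketch only, not how Moltbot runs commands; `run_with_yield` and the shape of the returned dict are hypothetical.

```python
import subprocess
import sys


def run_with_yield(argv, yield_ms=10_000):
    """Wait up to yield_ms for the command; if it is still running, leave it in the background."""
    proc = subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    try:
        output, _ = proc.communicate(timeout=yield_ms / 1000)
    except subprocess.TimeoutExpired:
        # Still running: return the handle so the caller can poll or kill it later.
        return {"status": "running", "process": proc}
    return {"status": "completed", "exitCode": proc.returncode, "output": output}


result = run_with_yield([sys.executable, "-c", "print('hi')"])
print(result["status"])
```

In Moltbot the follow-up polling is done with the process tool rather than with a process handle.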

browser

Control a dedicated Chrome/Brave profile (isolated from your personal browser).

Common actions:

  • status, start, stop—manage the browser process
  • tabs, open, focus, close—tab control
  • snapshot—get a text representation of the page (AI or ARIA format)
  • screenshot—capture pixels (full page or specific elements)
  • act—click, type, hover, drag, select, fill forms
  • navigate, console, pdf, upload, dialog—advanced operations

Multi-profile support: Use profile to target named browser configs (e.g., clawd, work, chrome). Profiles can point to local managed browsers or remote CDP endpoints.

Targeting:

  • target=sandbox—browser in Docker (requires sandboxing enabled)
  • target=host—your gateway machine’s browser
  • target=node—browser on a paired macOS/Linux node

Snapshots return ref IDs (numeric like 12 or role-based like e12) that you use with act to click or type. No brittle CSS selectors.

See Browser tool for setup, profiles, and the Chrome extension relay.

web_search

Search the web using the Brave Search API (default) or Perplexity Sonar.

Parameters:

  • query (required)
  • count (1–10 results)
  • country, search_lang, ui_lang, freshness (optional filters)

Setup: Get a Brave API key from brave.com/search/api, then run moltbot configure --section web or set BRAVE_API_KEY.

Results are cached for 15 minutes by default.
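A time-based cache of this kind can be sketched as follows. The class name `TTLCache`, the cache key, and the injectable clock are assumptions for illustration; the source states only the 15-minute default.

```python
import time


class TTLCache:
    """Tiny cache whose entries expire after a fixed number of seconds."""

    def __init__(self, ttl_seconds=15 * 60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            # Entry is too old: drop it and report a miss.
            del self._entries[key]
            return None
        return value

    def put(self, key, value):
        self._entries[key] = (self._clock(), value)


cache = TTLCache()
cache.put(("moltbot tools", 5), ["result-1", "result-2"])
print(cache.get(("moltbot tools", 5)))
```

Keying on the query together with its parameters (here a hypothetical `(query, count)` tuple) keeps a request with different filters from reusing stale results.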

See Web tools for Perplexity setup and configuration.

web_fetch

Fetch a URL and extract readable content (HTML → markdown or plain text).

Parameters:

  • url (required)
  • extractMode (markdown | text)
  • maxChars (truncate long pages)

Uses Readability extraction by default, with optional Firecrawl fallback for JS-heavy or bot-protected sites.

See Web tools and Firecrawl.

process

Manage background exec sessions.

Actions:

  • list—show active/recent sessions
  • poll—check for new output (returns exit code when done)
  • log—read output with offset/limit for line-based paging
  • write, send-keys, submit, paste—send input
  • kill, clear, remove—stop or clean up sessions

Background sessions are scoped per agent. One agent can’t see another’s sessions.
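The offset/limit paging used by the log action can be pictured with a small helper. The name `page_lines`, the default limit, and the returned fields are hypothetical; the sketch shows only the line-based paging idea.

```python
def page_lines(output, offset=0, limit=100):
    """Return one page of lines plus the offset to request next."""
    lines = output.splitlines()
    page = lines[offset:offset + limit]
    next_offset = offset + len(page)
    return {
        "lines": page,
        "nextOffset": next_offset,
        "done": next_offset >= len(lines),
    }


log_text = "a\nb\nc\nd\ne"
print(page_lines(log_text, offset=0, limit=2))
```

A caller keeps requesting pages with `nextOffset` until `done` is true.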

apply_patch

Apply structured multi-file edits in a single call. Useful when a normal edit tool would be too fragile.

Format:

*** Begin Patch
*** Add File: path/to/new.txt
+line content
*** Update File: src/app.ts
@@
-old line
+new line
*** Delete File: obsolete.txt
*** End Patch

Experimental. Enable with tools.exec.applyPatch.enabled (OpenAI models only).

See Apply Patch.
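For illustration, a patch in this format can be assembled programmatically. The helper `build_patch` is hypothetical, and the sketch assumes each marker and each +/- line sits on its own line; context lines and multiple hunks per file are not covered.

```python
def build_patch(add=None, update=None, delete=None):
    """Assemble patch text from files to add, update, and delete."""
    lines = ["*** Begin Patch"]
    for path, content in (add or {}).items():
        lines.append(f"*** Add File: {path}")
        lines.extend(f"+{line}" for line in content.splitlines())
    for path, (old, new) in (update or {}).items():
        lines.append(f"*** Update File: {path}")
        lines.append("@@")
        # Removed lines are prefixed with "-", replacement lines with "+".
        lines.extend(f"-{line}" for line in old.splitlines())
        lines.extend(f"+{line}" for line in new.splitlines())
    for path in delete or []:
        lines.append(f"*** Delete File: {path}")
    lines.append("*** End Patch")
    return "\n".join(lines)


print(build_patch(
    add={"path/to/new.txt": "line content"},
    update={"src/app.ts": ("old line", "new line")},
    delete=["obsolete.txt"],
))
```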

sessions_list / sessions_history / sessions_send / sessions_spawn / session_status

Work with agent sessions across conversations.

  • sessions_list: List active sessions with optional message previews
  • sessions_history: Read transcript for a session (by key or ID)
  • sessions_send: Send a message to another session and optionally wait for a reply
  • sessions_spawn: Start a sub-agent run in the background (gets announced back when done)
  • session_status: Show current session info or override the model

Useful for multi-agent setups, delegation, or checking what happened in a different conversation.

See Session tool and Subagents.

agents_list

List which agent IDs the current session can target with sessions_spawn. Respects per-agent allowlists (agents.list[].subagents.allowAgents).

message

Send messages and perform channel-specific actions (reactions, edits, polls, threads, etc.) across Discord, Telegram, Slack, WhatsApp, Google Chat, Signal, iMessage, and MS Teams.

Common actions:

  • send—text + optional media (images, files, locations)
  • react, edit, delete, pin
  • poll—create polls (WhatsApp, Discord, MS Teams)
  • thread-create, thread-reply—thread management
  • search, permissions, member-info, role-info
  • Moderation: timeout, kick, ban

Channel-specific features like Adaptive Cards (Teams), stickers (Telegram), and emoji uploads work through the same tool.

cron

Manage scheduled jobs on the gateway.

Actions:

  • list, status—view jobs and recent runs
  • add, update, remove—CRUD operations
  • run—trigger a job immediately
  • wake—enqueue a system event (optionally with immediate heartbeat)

Jobs use standard cron syntax and can run shell commands or trigger agent messages.

See Cron CLI.

gateway

Restart the gateway or apply configuration updates without stopping.

Actions:

  • restart—send SIGUSR1 for in-process restart
  • config.get, config.schema—inspect current config
  • config.apply, config.patch—validate and write config, then restart
  • update.run—apply updates and restart

Useful for agents that manage their own infrastructure.

Restart requires commands.restart: true in config.

nodes

Discover and control paired macOS/iOS devices.

Actions:

  • status, describe—see what’s connected and what capabilities are available
  • pending, approve, reject—manage pairing requests
  • notify—send macOS notifications
  • run—execute commands on the node (requires approval gates)
  • camera_snap, camera_clip, screen_record—capture media
  • location_get—GPS coordinates (iOS/macOS with location permissions)

Images come back as media blocks. Videos return file paths. The node app must be foregrounded for camera/screen capture.

See Nodes CLI.

canvas

Drive the visual canvas on macOS nodes.

Actions:

  • present, hide, navigate—show/hide/control the canvas window
  • eval—run JavaScript in the canvas context
  • snapshot—capture a screenshot
  • a2ui_push, a2ui_reset—A2UI rendering (v0.8 format)

Uses node.invoke under the hood. Auto-selects a node if only one is connected.

image

Analyze an image with your configured vision model.

Parameters:

  • image (path or URL)
  • prompt (optional, defaults to “Describe the image.”)
  • model (optional override)
  • maxBytesMb (size cap)

Only available when you have agents.defaults.imageModel configured or when Moltbot can infer an image model from your main model + auth profiles.

Common patterns

Browser automation:

  1. browser status / browser start
  2. browser snapshot to get the page structure
  3. browser act with a ref from the snapshot to click/type
  4. browser screenshot for visual confirmation

Shell tasks:

  1. exec with a command
  2. If it backgrounds, poll with process until done
  3. Read output with process log
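The poll step in the shell-task pattern above amounts to a bounded wait loop. In this sketch `poll` is a hypothetical callable standing in for the process tool's poll action, and the `exitCode` field in its result is an assumed shape, not a documented one.

```python
import time


def wait_for_session(poll, session_id, interval_s=1.0, max_polls=60, sleep=time.sleep):
    """Poll until the session reports an exit code; return it, or None if we give up."""
    for _ in range(max_polls):
        result = poll(session_id)
        if result.get("exitCode") is not None:
            return result["exitCode"]
        sleep(interval_s)
    return None


# Stand-in for the real tool: reports "still running" twice, then exit code 0.
responses = iter([{}, {}, {"exitCode": 0}])
print(wait_for_session(lambda _sid: next(responses), "session-1", interval_s=0))
```

Capping the number of polls keeps an agent from waiting forever on a command that never exits; the remaining output can then be read with process log.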

Multi-agent delegation:

  1. sessions_spawn to kick off a sub-agent task
  2. It runs in the background and announces back when finished
  3. Use sessions_history to inspect what happened

Node capture:

  1. nodes status to see what’s connected
  2. nodes camera_snap or nodes screen_record
  3. Results come back as media blocks or file paths

Safety notes

  • exec and nodes run are powerful. Use approval gates (ask: "on-miss" or "always") and allowlists when running on real hosts.
  • Camera/screen capture requires consent. Always check permissions with nodes describe first.
  • Elevated mode (elevated: true or host: "gateway" with security: "full") bypasses some safety checks—reserve it for trusted agents.
  • Browser isolation: The clawd browser profile is separate from your personal browser, but it can still access logged-in sessions. Treat it carefully.

See Security and Sandboxing for the full picture.

How tools work under the hood

When an agent runs, Moltbot exposes tools in two ways:

  1. System prompt: Human-readable descriptions so the model knows what’s available
  2. Tool schemas: Typed function definitions sent to the model API (OpenAI function calling, Anthropic tools, etc.)

If a tool isn’t in both places, the model can’t see it. That’s why allow/deny lists are enforced before the prompt is built—disallowed tools never make it to the model.
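The ordering described here (filter first, then build both views from the same filtered list) can be sketched as below. The function `build_model_inputs` and the tool record fields are hypothetical; real schemas depend on the model provider.

```python
def build_model_inputs(tools, is_allowed):
    """Filter tools once, then derive the prompt text and the schemas from the same list."""
    visible = [tool for tool in tools if is_allowed(tool["name"])]
    prompt = "\n".join(f"- {tool['name']}: {tool['description']}" for tool in visible)
    schemas = [
        {"name": tool["name"], "parameters": tool["parameters"]} for tool in visible
    ]
    return prompt, schemas


tools = [
    {"name": "exec", "description": "Run shell commands", "parameters": {"command": "string"}},
    {"name": "web_fetch", "description": "Fetch a URL", "parameters": {"url": "string"}},
]
prompt, schemas = build_model_inputs(tools, lambda name: name != "exec")
print(prompt)
```

Because both outputs come from the one filtered list, a denied tool cannot appear in the prompt while being absent from the schemas, or the other way round.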
