Computer - Noorle

Computer is an agent-only capability. Because of its sensitive nature, it cannot be attached to MCP gateways; only agents can access computer control.

The Computer capability gives agents full programmatic control of a desktop environment. It is useful for complex tasks that require visual feedback and interactive control.

Key Features

Screenshots - Capture current screen state
Mouse Control - Move, click, drag operations
Keyboard Input - Type text, press keys
Screen Navigation - Scroll, zoom, multi-window
Real-time Feedback - Visual loop with agent decisions
State Tracking - Remember screen positions

How to Enable

For Agents Only

Agents > Select Agent > Settings > Capabilities
Search for Computer
Click Attach
Save

Computer cannot be attached to MCP gateways. It’s agent-exclusive for security reasons.

Usage Examples

Take Screenshot

"Take a screenshot of the current desktop"

Returns a PNG image of the desktop, along with its dimensions and visible elements.

Click on Element

"Click the blue button in the top-right corner"

Agent:

Analyzes screenshot
Identifies button coordinates
Executes click
Captures new screenshot

Fill Form

"Fill the login form with username 'alice' and password 'secret'"

Agent:

Screenshots form
Identifies input fields
Clicks on username field
Types username
Clicks on password field
Types password
Clicks submit button
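
The steps above can be written down as an ordered action plan. The sketch below is illustrative only: `plan_login` is a hypothetical helper and the field coordinates are made up. In practice the agent derives the coordinates from the screenshot.

```python
def plan_login(
    username: str,
    password: str,
    fields: dict[str, tuple[int, int]],
) -> list[tuple]:
    """Return the ordered actions for the login flow described above.

    `fields` maps a field name to the (x, y) point the agent identified
    in the screenshot. Names and coordinates here are hypothetical.
    """
    return [
        ("screenshot",),
        ("click", *fields["username"]),
        ("type", username),
        ("click", *fields["password"]),
        ("type", password),
        ("click", *fields["submit"]),
    ]


plan = plan_login(
    "alice",
    "secret",
    {"username": (640, 300), "password": (640, 360), "submit": (640, 430)},
)
for step in plan:
    print(step)
```

Keeping the plan as data makes it easy to log or review before anything is executed.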

Navigate Multi-step Process

"Open the application, go to settings, and change the theme to dark"

The agent works through the steps one at a time, using visual feedback after each.

Screen Coordinates

The agent receives screen coordinates for all elements:

Screenshot resolution: 1920x1080
Button position: x=1850, y=20

The agent can:

Click at coordinates
Drag between points
Identify text positions
Calculate relative positions
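
The relative-position arithmetic can be sketched as follows. This is a minimal illustration, not Noorle's implementation: the `Box` type and both helper functions are hypothetical names, and the numbers reuse the 1920x1080 example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical bounding box of an on-screen element, in pixels."""
    left: int
    top: int
    width: int
    height: int


def center(box: Box) -> tuple[int, int]:
    """Return the pixel at the middle of the box, a common click target."""
    return (box.left + box.width // 2, box.top + box.height // 2)


def rescale(
    x: int, y: int, src: tuple[int, int], dst: tuple[int, int]
) -> tuple[int, int]:
    """Map a point from one screenshot resolution to another."""
    return (round(x * dst[0] / src[0]), round(y * dst[1] / src[1]))


# A 100x40 button whose top-left corner is at (1800, 0) on a 1920x1080 screen.
button = Box(left=1800, top=0, width=100, height=40)
click_x, click_y = center(button)  # (1850, 20), the button position above
```

`rescale` is useful when the screenshot the model analyzed was downsized before analysis and the click must land on the full-resolution screen.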

Interaction Types

Mouse Actions

click(x, y) - Single click
double_click(x, y) - Double click
right_click(x, y) - Right/context click
drag(x1, y1, x2, y2) - Drag from point to point
move(x, y) - Move cursor without clicking
scroll(direction, amount) - Scroll up/down/left/right

Keyboard Actions

type(text) - Type text string
key(name) - Press single key (Enter, Tab, Escape, etc.)
hotkey(mod, key) - Keyboard shortcut (Ctrl+C, Cmd+V, etc.)

Navigation

screenshot() - Capture current screen
wait(seconds) - Wait for page to load
maximize() - Maximize window
minimize() - Minimize window
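
One way to keep action calls consistent is a small registry that checks the action name and argument count before building a payload. The sketch below copies the parameter names from the signatures listed above. Only the `screenshot` and `click` payload shapes appear on this page (under API Access); the JSON field names for every other action are an assumption that mirrors the parameter names, and `build_action` is a hypothetical helper.

```python
# Parameter names per action, copied from the signatures listed above.
ACTIONS: dict[str, tuple[str, ...]] = {
    "click": ("x", "y"),
    "double_click": ("x", "y"),
    "right_click": ("x", "y"),
    "drag": ("x1", "y1", "x2", "y2"),
    "move": ("x", "y"),
    "scroll": ("direction", "amount"),
    "type": ("text",),
    "key": ("name",),
    "hotkey": ("mod", "key"),
    "screenshot": (),
    "wait": ("seconds",),
    "maximize": (),
    "minimize": (),
}


def build_action(name: str, *args) -> dict:
    """Validate an action call and return it as a JSON-ready dict."""
    if name not in ACTIONS:
        raise ValueError(f"unknown action: {name}")
    params = ACTIONS[name]
    if len(args) != len(params):
        raise ValueError(
            f"{name} expects {len(params)} argument(s), got {len(args)}"
        )
    # Assumed payload shape: {"action": <name>, <param>: <value>, ...}
    return {"action": name, **dict(zip(params, args))}
```

For example, `build_action("click", 100, 100)` produces the same body as the click request in the API Access section.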

Size Tiers

Each Computer instance is a dedicated virtual machine. Choose a size based on your workload:

Size	vCPUs	RAM	Disk
x2	2	2 GB	40 GB
x4 (default)	3	4 GB	80 GB
x8	4	8 GB	160 GB
x16	8	16 GB	240 GB
x32	16	32 GB	360 GB
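
Choosing a size can be reduced to picking the smallest tier that meets the workload's needs. The sketch below encodes the table above; `Tier` and `smallest_tier` are hypothetical names, not part of any Noorle SDK.

```python
from typing import NamedTuple, Optional


class Tier(NamedTuple):
    name: str
    vcpus: int
    ram_gb: int
    disk_gb: int


# Values copied from the size table above, smallest first.
TIERS = [
    Tier("x2", 2, 2, 40),
    Tier("x4", 3, 4, 80),
    Tier("x8", 4, 8, 160),
    Tier("x16", 8, 16, 240),
    Tier("x32", 16, 32, 360),
]


def smallest_tier(vcpus: int = 1, ram_gb: int = 1, disk_gb: int = 1) -> Optional[Tier]:
    """Return the first tier that satisfies every requirement, or None."""
    for tier in TIERS:
        if tier.vcpus >= vcpus and tier.ram_gb >= ram_gb and tier.disk_gb >= disk_gb:
            return tier
    return None
```

A workload that needs 6 GB of RAM would land on x8, since x4 tops out at 4 GB.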

Supported Operating Systems

Ubuntu 24.04, Ubuntu 22.04
Debian 12, Debian 11

Configuration

Optional agent specifications:

{
  "computer": {
    "shell_enabled": true,
    "browser_enabled": false,
    "browser_max_tabs": 5,
    "browser_max_download_mb": 50,
    "browser_allowed_domains": []
  }
}

Setting	Default	Effect
`shell_enabled`	true	Enable shell command execution
`browser_enabled`	false	Enable browser subsystem
`browser_max_tabs`	5	Maximum concurrent browser tabs
`browser_max_download_mb`	50	Maximum download size (MB)
`browser_allowed_domains`	[]	Domain allowlist (empty = all allowed)
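
When preparing a configuration client-side, it can help to merge overrides onto the documented defaults and reject misspelled keys early. The sketch below assumes that omitted settings fall back to the defaults in the table above; `resolve_config` is a hypothetical helper, and rejecting unknown keys is a choice made for this example, not documented platform behavior.

```python
import copy

# Defaults copied from the settings table above.
DEFAULTS = {
    "shell_enabled": True,
    "browser_enabled": False,
    "browser_max_tabs": 5,
    "browser_max_download_mb": 50,
    "browser_allowed_domains": [],
}


def resolve_config(overrides: dict) -> dict:
    """Merge user overrides onto the defaults, rejecting unknown keys."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown setting(s): {sorted(unknown)}")
    merged = copy.deepcopy(DEFAULTS)  # avoid sharing the default list
    merged.update(overrides)
    return {"computer": merged}
```

For instance, `resolve_config({"browser_enabled": True})` turns the browser on while leaving the tab and download limits at their defaults.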

Browser Subsystem

Computer includes an optional stateful browser that persists sessions, cookies, and navigation state across tool calls. This is disabled by default — set browser_enabled: true to activate it.

Browser Tools

When the browser subsystem is enabled, the agent gains these tools:

Tool	Purpose
`browser_navigate`	Navigate to a URL
`browser_snapshot`	Get the current page’s accessibility tree
`browser_act`	Interact with page elements (click, type, select)
`browser_screenshot`	Capture a screenshot of the current page
`browser_pdf`	Generate a PDF of the current page
`browser_tabs`	List open browser tabs
`browser_close`	Close a browser tab

Stateful vs Stateless Browser

Key difference: The Computer browser subsystem maintains state (cookies, login sessions, tabs) across calls. The standalone Browser capability is stateless — each call starts fresh.

Feature	Computer Browser (stateful)	Browser Capability (stateless)
Login persistence	Stays logged in across calls	Each call is a fresh session
Multi-step workflows	Navigate across pages, fill multi-step forms	Single-page operations only
Tabs	Multiple tabs, switch between them	No tab management
Domain control	Allowlist specific domains	No domain restrictions
Availability	Agent-only	Agents and MCP gateways

Domain Allowlist

Use browser_allowed_domains to restrict which sites the browser can visit. An empty list (default) allows all domains. When set, navigation to domains not in the list is blocked.

{
  "browser_allowed_domains": ["example.com", "app.internal.com"]
}
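
The allowlist rule can be sketched as a simple host check. This page does not say whether an entry also covers its subdomains; the sketch below assumes it does, and `is_allowed` is a hypothetical helper rather than the platform's actual matching logic.

```python
from urllib.parse import urlsplit


def is_allowed(url: str, allowed_domains: list[str]) -> bool:
    """Return True if navigation to `url` would pass the allowlist."""
    if not allowed_domains:
        return True  # an empty list allows all domains
    host = (urlsplit(url).hostname or "").lower()
    for domain in allowed_domains:
        domain = domain.lower()
        # Assumption: an entry matches itself and any of its subdomains.
        if host == domain or host.endswith("." + domain):
            return True
    return False
```

Matching on a leading dot keeps look-alike hosts such as `notexample.com` from passing an `example.com` entry.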

Resource Limits

Limit	Value	Notes
Default SSH Timeout	30 seconds	Per command execution
Browser Snapshot	500 elements max	Truncated if exceeded
Browser Max Tabs	5	Per session
Browser Max Download	50 MB	Per file

Cost

For current pricing details, see Pricing. Monitor usage in the Account > Usage dashboard.

Common Use Cases

Web Application Testing

Screenshot app, verify buttons, test workflows

Automation

Automate repetitive UI tasks programmatically

Data Entry

Fill forms and navigate multi-step processes

Visual Inspection

Verify visual appearance matches requirements

Agent Loop Pattern

Typical agent workflow:

Screenshot - See current state
Analyze - LLM processes image
Decide - LLM decides next action
Execute - Perform mouse/keyboard action
Repeat - Loop until task complete

Each iteration adds the latest screenshot analysis to the LLM's context, so the agent sees the results of its own actions.
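
The loop above can be sketched in a few lines. This is an illustration under stated assumptions, not the platform's agent runtime: `run_loop` and its three callables are hypothetical, `decide` stands in for the LLM's analyze and decide steps, and the step cap is added so that a confused agent cannot run forever.

```python
from typing import Callable, Optional


def run_loop(
    take_screenshot: Callable[[], bytes],
    decide: Callable[[bytes], Optional[dict]],
    execute: Callable[[dict], None],
    max_steps: int = 20,
) -> int:
    """Run screenshot -> decide -> execute until `decide` returns None.

    Returns the number of actions executed. Raises TimeoutError if the
    task is still unfinished after `max_steps` iterations.
    """
    for step in range(max_steps):
        image = take_screenshot()   # see current state
        action = decide(image)      # analyze + decide (the LLM's job)
        if action is None:          # task complete
            return step
        execute(action)             # perform the mouse/keyboard action
    raise TimeoutError(f"task not complete after {max_steps} steps")


# Stubbed usage: one queued click, then the "model" reports completion.
pending = [{"action": "click", "x": 100, "y": 100}]
executed: list[dict] = []
steps = run_loop(
    take_screenshot=lambda: b"png-bytes",
    decide=lambda image: pending.pop(0) if pending else None,
    execute=executed.append,
)
```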

Best Practices

Start with Screenshot

Always capture initial state before taking actions.

Be Explicit

Give the agent clear instructions:

✓ "Click the 'Save' button (blue, bottom-right)"
✗ "Save the file"

Handle Errors

If an action doesn’t work as expected:

"Screenshot again to verify action completed"

Use Coordinates When Possible

Provide coordinates directly when known:

"Click at coordinates (1850, 50)"

Wait for State Changes

Allow time for UI updates:

"Wait 2 seconds for dialog to load"
"Take screenshot to verify"
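
Instead of a fixed wait, a client could poll until the screen actually changes. The sketch below is a simplification: `wait_for_change` is a hypothetical helper, `take_screenshot` is whatever callable returns the current screen as bytes, and comparing bytes exactly will also fire on trivial changes such as a blinking cursor or a clock.

```python
import time
from typing import Callable


def wait_for_change(
    take_screenshot: Callable[[], bytes],
    baseline: bytes,
    timeout: float = 2.0,
    interval: float = 0.25,
) -> bool:
    """Poll until the screenshot differs from `baseline`.

    Returns True as soon as a change is seen, or False if the screen is
    still identical when `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if take_screenshot() != baseline:
            return True
        time.sleep(interval)
    return False
```

Polling returns as soon as the dialog appears, and the timeout still bounds the wait when nothing changes.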

Limitations

Desktop/Web only - Works with rendered interfaces
Not for APIs - Use HTTP Client for APIs
Visual interpretation - Relies on screenshot analysis
Speed - Slower than direct API calls
Flakiness - UI changes can break workflows

When NOT to Use

Task	Use Instead
API access	HTTP Client
Quick calculations	Code Runner
File operations	Files
Database access	HTTP Client (via REST)

Troubleshooting

Screenshot is blank

Wait for page to load
Check window is focused
Verify viewport size is correct

Click doesn’t work

Coordinates may be off
Element may not be clickable
Try right-clicking instead
Screenshot again to verify state

Text not entered

Field may not be focused
Type more slowly
Use keyboard navigation (Tab)
Copy-paste if typing fails

Agent stuck in loop

Break task into smaller steps
Increase wait times
Provide more explicit instructions
Use timeout to stop execution
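
A stuck agent often repeats the same action. One way to catch that, sketched below, is to track the most recent actions and flag when they are all identical. `RepeatDetector` is a hypothetical helper and the window of three is an arbitrary choice for this example.

```python
from collections import deque


class RepeatDetector:
    """Flag when the last `window` actions are all the same."""

    def __init__(self, window: int = 3) -> None:
        self._recent: deque = deque(maxlen=window)

    def record(self, action: dict) -> bool:
        """Store an action; return True if the agent looks stuck."""
        self._recent.append(action)
        window_full = len(self._recent) == self._recent.maxlen
        return window_full and all(a == self._recent[0] for a in self._recent)


detector = RepeatDetector(window=3)
click = {"action": "click", "x": 100, "y": 100}
flags = [detector.record(click) for _ in range(3)]  # [False, False, True]
```

When `record` returns True, the caller can stop the run or re-prompt the agent with more explicit instructions.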

Privacy & Security

The Computer capability has broad system access. Use it only for trusted tasks.

Screenshots may contain sensitive data
Keyboard input includes all characters
No automatic filtering of credentials
Use with caution in production

Best practices:

Use dedicated user accounts
Limit to non-sensitive applications
Monitor screen capture content
Disable in production where possible

API Access

# Execute computer action
curl -X POST https://api.noorle.com/v1/agents/{agent_id}/computer \
  -H "X-API-Key: ak-{your_key}" \
  -H "Content-Type: application/json" \
  -d '{
    "action": "screenshot"
  }'

# Click action
curl -X POST https://api.noorle.com/v1/agents/{agent_id}/computer \
  -H "X-API-Key: ak-{your_key}" \
  -H "Content-Type: application/json" \
  -d '{
    "action": "click",
    "x": 100,
    "y": 100
  }'
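
The same request can be built from Python using only the standard library. The endpoint, headers, and click payload below are taken from the curl examples above; `build_request` is a hypothetical helper, the agent ID and key are placeholders, and the response format is not documented on this page, so the sketch stops short of parsing it.

```python
import json
import urllib.request

API_BASE = "https://api.noorle.com/v1"


def build_request(agent_id: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST to the agent's computer endpoint."""
    return urllib.request.Request(
        url=f"{API_BASE}/agents/{agent_id}/computer",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder credentials; substitute a real agent ID and API key.
request = build_request(
    "my-agent-id", "ak-your-key", {"action": "click", "x": 100, "y": 100}
)
# To send it: urllib.request.urlopen(request, timeout=30)
```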

Next Steps

Browser Capability — Stateless browser for quick page rendering
Web Search - Find Information
HTTP Client - API Access
Code Runner - Process Data
Creating Agents
